TAP information schema

Thu Oct 11 13:05:36 PDT 2007

> general case. Specifically, some RDBMSs require that the SQL contains the
> schema name (DB2, eg) on the front of every table name. I do not think that
> ADQL requires this (maybe shouldn't) but as a site using such a database I
> need to be able to tell people what the schema name is. Now, I could

Do you need to tell people? When you parse the ADQL coming in, would you not
simply add the schema name into the generated SQL? If you have the same
table name in separate schemas then you simply register separate services
(or the same service with different endpoints). No?
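
To make the idea concrete, here is a minimal sketch of the kind of rewriting
I have in mind, assuming a very simple query shape. The function name
qualify_tables and the regex are hypothetical illustrations, not part of ADQL
or any TAP draft; a real service would do this in its ADQL parser rather than
with a regular expression, since this version does not handle subqueries,
comma-separated table lists, quoted identifiers or string literals.

```python
import re

# Matches a table reference directly after FROM or JOIN.
# Assumption: plain identifiers, optionally dotted (schema.table).
_TABLE_REF = re.compile(r"\b(FROM|JOIN)\s+([A-Za-z_][A-Za-z0-9_.]*)",
                        re.IGNORECASE)


def qualify_tables(adql: str, schema: str) -> str:
    """Prefix every unqualified table name after FROM/JOIN with `schema`."""
    def _fix(match: re.Match) -> str:
        keyword, table = match.group(1), match.group(2)
        if "." in table:
            # Already schema-qualified: leave the reference untouched.
            return match.group(0)
        return f"{keyword} {schema}.{table}"

    return _TABLE_REF.sub(_fix, adql)


if __name__ == "__main__":
    # mySchema and myTable are the example names used in the quoted message.
    print(qualify_tables("SELECT ra, dec FROM myTable WHERE ra > 10",
                         "mySchema"))
```

The user writes the bare table name and the service supplies the schema it
was registered for, so the schema name never has to appear in the query.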

T.

> -----Original Message-----
> From: owner-dal at eso.org [mailto:owner-dal at eso.org] On Behalf Of Patrick Dowler
> Sent: 11 October 2007 17:40
> To: dal at ivoa.net
> Subject: Re: TAP information schema
> 
> On 2007-10-10 06:10, Keith Noddle wrote:
> > Cases so dictate. Finally, it was made abundantly clear to us in Beijing
> > - and it remains the case - that the priority for TAP V1.0 is to define
> > how we handle ADQL querying. Period. No arguments.
> 
> I agree with this 100%. We all agree that TAP 1.0 should be a minimal spec
> we can move forward with and at the core this means doing ADQL querying.
> 
> As for metadata, one really does need more than tables and columns in the
> general case. Specifically, some RDBMSs require that the SQL contains the
> schema name (DB2, eg) on the front of every table name. I do not think that
> ADQL requires this (maybe shouldn't) but as a site using such a database I
> need to be able to tell people what the schema name is. Now, I could stretch
> the table name to include it (eg mySchema.myTable) but that actually throws
> a lot of stuff away (like the fact that I use different schemata for
> different versions). I would like to describe what each schema means, and
> that maybe the schema as a whole implements some data model (as would likely
> be the case, since few data models can be sensibly stored in a single table).
> 
> That's not a big deal right now, but if we ignore it and force services and
> apps to ignore schema names then in future we could have some problems when
> we try to expose it. The same goes for whatever metadata tells people how to
> write more complex queries with joins etc... we probably should not
> standardise that now, but we need to do it in a way that still lets the
> future detailed metadata be the definitive metadata.
> 
> So, my gut feeling right now is that basic resource discovery in the
> registry is going to use VOResource (or some specialisation of that) and
> users need to be able to see what the content is (tables and columns) for
> that task. We should aim to support that task only -- suitable content
> discovery -- and we should not try very hard to make that VOResource
> description the way to actually formulate queries (just "accidentally on
> purpose" as a friend used to say :-)
> 
> What I am thinking is this: the "suitable content discovery" will describe
> content, which effectively means tables and columns: assuming there was
> detailed metadata for building queries elsewhere, you still need to ask for
> it so the VOResource needs to have the schema (namespace) and table names
> and because people will be looking for things via utype and/or ucd of
> columns... the only thing not really needed for discovery that we can stick
> in so people can write queries is the actual column names*. Once we have a
> detailed metadata system for TAP 1.1 we could deprecate the column names in
> the VOResource, or not if no one cares enough.
> 
> * nominally, discovery doesn't care about units either, but practically
> client software will care if they don't have some generic unit conversion
> utility
> 
> Summary: VOResource describes tables and columns (maybe namespaces aka
> schemata) aimed at "suitable content discovery", but we stick in column
> names and units for completeness/symmetry with the table description. The
> service emits this document via the standard service method. This is good
> enough for full ADQL queries of single tables, with joins reserved for users
> that actually know the target schema or care to learn it via documentation.
> 
> This would be "good enough" and not shut off any future development.
> 
> --
> 
> Patrick Dowler
> Tel/Tél: (250) 363-6914                  | fax/télécopieur: (250) 363-0045
> Canadian Astronomy Data Centre   | Centre canadien de donnees astronomiques
> National Research Council Canada | Conseil national de recherches Canada
> Government of Canada                  | Gouvernement du Canada
> 5071 West Saanich Road               | 5071, chemin West Saanich
> Victoria, BC                                  | Victoria (C.-B.)