Qualified/unqualified quoted/unquoted
Dave Morris
dave.morris at metagrid.co.uk
Thu Oct 16 16:58:31 CEST 2014
On 2014-10-16 10:39, LANDAIS Gilles (OBS) wrote:
>
> Due to the important volumetry (~25,000 table , ~ 300,000 columns),
> the web resource /tables of TAPVizieR provides the schema without the
> columns descriptions.
> This output enables web applications like TAPHandle to work with
> TAPVizieR with a reasonable size of VOTable.
> Currently, TAPVizieR provides a non-standard REST URL to get the full
> table description (with columns). The output URL uses the same XML
> schema than the standard resource /tables (VODataService/1.1).
>
> Example:
> http://tapvizier.u-strasbg.fr/TAPVizieR/tap/tables
> http://tapvizier.u-strasbg.fr/TAPVizieR/tap/tables/II/246/out
>
Publishing the metadata for the TAPVizieR service highlights some gaps
in the current VODataService and TAP_SCHEMA specifications that will
need to be clarified before this service can interoperate with other TAP
services in the VO.
This is not the fault of the TAPVizieR service; it is due to omissions
in the current VODataService and TAP_SCHEMA specifications, which are
not precise enough to handle the TAPVizieR metadata.
----
The /tables endpoint
http://tapvizier.u-strasbg.fr/TAPVizieR/tap/tables
lists 477 instances of table names with two dots but no quotes.
For example :
<name>vbig.J/other/PZ/29.1/table</name>
----
Section 3.3 of the VODataService-1.1 specification defines the <name>
element as containing :
"A fully qualified name for the table."
"This name should include all catalog or schema
prefixes needed to sufficiently uniquely
distinguish it in a query to the table."
However the VODataService-1.1 specification does not describe how to
handle a table name that includes non-delimiter dots in it.
----
A literal reading of the text in the VODataService-1.1
specification
"A fully qualified name for the table."
implies that a /tables result containing
<name>vbig.J/other/PZ/29.1/table</name>
refers to
a catalog called
'vbig'
a schema called
'J/other/PZ/29'
a table called
'1/table'
whereas a human interpreter may guess based on context that this
actually refers to
a schema called
'vbig'
a table called
'J/other/PZ/29.1/table'
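
The literal reading can be reproduced with a few lines of Python. This is
only an illustrative sketch: the function name naive_split is made up for
this example, and the assumption that every dot is a delimiter is exactly
the assumption being questioned here, not a rule taken from either
specification.

```python
def naive_split(name):
    """Treat every dot as a delimiter and label the pieces from the right."""
    parts = name.split(".")
    # Rightmost piece is the table, then the schema, then the catalog.
    # Names with more than three pieces are silently truncated by zip().
    labels = ["table", "schema", "catalog"]
    return dict(zip(labels, reversed(parts)))


print(naive_split("vbig.J/other/PZ/29.1/table"))
```

A client that parses names this way would look for a table called
'1/table', which is almost certainly not what the service intended.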
----
The current VODataService-1.1 specification needs to be updated to
describe how the /tables output should use quotes to wrap names that
contain non-delimiter dots or other characters outside the basic set of
alphanumeric characters.
----
In this example the schema and table names should probably be wrapped in
double quotes to indicate which dot is part of the table name and which
is the delimiter between schema and table.
<name>"vbig"."J/other/PZ/29.1/table"</name>
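
A service that already stores the schema and table names separately could
build the quoted form along these lines. This is a minimal sketch, assuming
the SQL-style convention (which ADQL delimited identifiers follow) that a
double quote inside a quoted name is written as two double quotes. The
function names are hypothetical.

```python
def quote_identifier(part):
    """Wrap one name component in double quotes, doubling embedded quotes."""
    return '"' + part.replace('"', '""') + '"'


def qualified_name(*parts):
    """Join already-separated components with delimiter dots."""
    return ".".join(quote_identifier(p) for p in parts)


print(qualified_name("vbig", "J/other/PZ/29.1/table"))
```

Note that this sketch quotes every component unconditionally, which avoids
having to decide which characters count as 'basic'.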
----
The same table metadata is also available from the TAPVizieR TAP service
http://tapvizier.u-strasbg.fr/TAPVizieR/
via a TAP_SCHEMA query
"SELECT schema_name, table_name FROM TAP_SCHEMA.tables"
which returns a VOTable containing
<TR>
<TD>vbig</TD>
<TD>J/other/PZ/29.1/table</TD>
</TR>
----
Section 2.6 of the TAP-1.0 specification defines the table_name column
as
"table name as it should be used in queries"
The text below this adds a bit more detail to the definition, but it is
still less specific about qualifying the table name than the equivalent
text in the VODataService-1.1 specification
"The value of the table_name should be
the string that is recommended for use
in querying the table; it may or may not
be qualified by schema and catalog name(s)
depending on the implementation requirements."
Given the current definition of 'may or may not be qualified', the table
name in this example could be interpreted as
a schema called
'J/other/PZ/29'
a table called
'1/table'
or as
a table called
'J/other/PZ/29.1/table'
From context we can guess that this does in fact represent the
unqualified table name containing a non-delimiter dot.
But this is a *guess*, and is not covered by the rules for representing
qualified or unqualified names that may or may not contain non-delimiter
dots.
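
To make the ambiguity concrete, the following sketch lists every reading of
an unquoted name in which each dot is either a delimiter or part of a name.
It assumes at most two delimiters (catalog.schema.table); the function name
and that limit are choices made for this example, not something defined by
TAP-1.0.

```python
from itertools import combinations


def interpretations(name, max_delimiters=2):
    """Return every way to split name, treating some dots as delimiters."""
    dots = [i for i, ch in enumerate(name) if ch == "."]
    readings = []
    for k in range(min(max_delimiters, len(dots)) + 1):
        # Pick which k dots act as delimiters; the rest stay inside names.
        for chosen in combinations(dots, k):
            pieces, start = [], 0
            for pos in chosen:
                pieces.append(name[start:pos])
                start = pos + 1
            pieces.append(name[start:])
            readings.append(tuple(pieces))
    return readings


for reading in interpretations("J/other/PZ/29.1/table"):
    print(reading)
```

For this table name the sketch returns two readings, one unqualified and
one qualified, and nothing in the metadata itself says which one is right.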
----
The current TAP-1.0 specification needs to be updated to describe how
the metadata in the TAP_SCHEMA tables should use quotes to wrap names
that contain non-delimiter dots or other characters outside the basic
set of alphanumeric characters.
----
In this example the table name in the table_name column should probably
be wrapped in double quotes to indicate that the dot is part of the
table name and not a delimiter between schema and table.
<TD>"J/other/PZ/29.1/table"</TD>
----
For comparison, sending the same TAP_SCHEMA query to the Gavo TAP
service
http://dc.zah.uni-heidelberg.de/__system__/adql/query/form
"SELECT schema_name, table_name FROM TAP_SCHEMA.tables"
returns a VOTable containing
<TR>
<TD>twomass</TD>
<TD>twomass.data</TD>
</TR>
If we apply the same parsing rules that we used for the TAPVizieR
results, then this could refer to
a schema called
'twomass'
and a table called
'twomass.data'
or this could refer to
a schema called
'twomass'
and a table called
'data'
Applying the same set of parsing rules that were needed to interpret the
TAPVizieR TAP_SCHEMA results to the Gavo TAP_SCHEMA results means that
the table names in the Gavo TAP_SCHEMA results may be open to
misinterpretation.
Note - there is nothing in any of the specifications that says that we
cannot have combinations of catalogs, schemas, tables and columns with
the same names.
Just because the table name 'twomass.data' starts with the same
sub-string as the schema name 'twomass' does not by itself mean that
'twomass.data' is the qualified table name including the parent schema
name and delimited by a dot, rather than a table name which just happens
to start with the same sub-string as the parent schema name and contain
a non-delimiting dot.
----
We could simplify the parsing rules by defining both the schema name and
table name as always unqualified, removing the need for using quotes
within the metadata.
<TR>
<TD>vbig</TD>
<TD>J/other/PZ/29.1/table</TD>
</TR>
and
<TR>
<TD>twomass</TD>
<TD>data</TD>
</TR>
Note - in order to use the fully unqualified schema name we would have
to add a separate column/element to the metadata to contain the catalog
name.
----
We could simplify the parsing rules by making the table names always
fully qualified and always wrap all the names in quotes.
<TR>
<TD>"vbig"</TD>
<TD>"vbig"."J/other/PZ/29.1/table"</TD>
</TR>
and
<TR>
<TD>"twomass"</TD>
<TD>"twomass"."data"</TD>
</TR>
Note - the schema name also needs to be quoted because schema names may
be qualified with a catalog name and both the schema and catalog names
may themselves contain non-delimiter dots or other non-alphanumeric
characters.
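
With the always-quoted form a client can split a name without guessing. The
sketch below is one possible parser, not a reference implementation: it
assumes that dots outside quotes are delimiters, that dots inside quotes
are part of the name, and that a doubled double quote inside a quoted name
stands for a single quote character. The function name is hypothetical.

```python
def split_quoted(name):
    """Split a dotted name on dots that are outside double quotes."""
    parts, current, in_quotes, i = [], [], False, 0
    while i < len(name):
        ch = name[i]
        if in_quotes:
            if ch == '"':
                if i + 1 < len(name) and name[i + 1] == '"':
                    current.append('"')  # doubled quote is a literal quote
                    i += 1
                else:
                    in_quotes = False  # closing quote
            else:
                current.append(ch)
        elif ch == '"':
            in_quotes = True  # opening quote
        elif ch == ".":
            parts.append("".join(current))  # delimiter dot
            current = []
        else:
            current.append(ch)
        i += 1
    if in_quotes:
        raise ValueError("unterminated quoted identifier")
    parts.append("".join(current))
    return parts


print(split_quoted('"vbig"."J/other/PZ/29.1/table"'))
print(split_quoted('"twomass"."data"'))
```

The same function applied to the unquoted TAPVizieR name would still split
on the dot inside the table name, which is why the quoting would have to be
mandatory for this option to work.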
----
We could try to define a more complex set of conditional rules which
work for both the Gavo and TAPVizieR metadata and are compatible with
the existing service and client implementations.
----
What do you think ?
Anyone like to have a go at defining the rules for qualified/unqualified
quoted/unquoted names ?
--------
Dave Morris
Software Developer
Wide Field Astronomy Unit
Institute for Astronomy
University of Edinburgh
--------