VOSI 1.1 - qualified names

Mark Taylor M.B.Taylor at bristol.ac.uk
Wed Jul 20 11:12:25 CEST 2016


Hi Dave.

On Tue, 19 Jul 2016, Dave Morris wrote:

> I agree with Mark that we should be consistent across the different metadata
> sources. However I am concerned that the term 'official name' implies a level
> of standardization beyond the consensus that we have achieved. To an external
> user, referring to [twomass.data] as the 'official name' may imply that we
> have standardized this as the name for the twomass point source catalog across
> the whole of the VO.

I agree that "official name" is not a very good choice of terminology
and can probably be improved.  Having said that I don't think there's
too much danger of confusing external users; this term/concept is
only going to be used in IVOA standards documents, it's not something
the user has to see or think about.

I will suggest as an alternative "canonical name".  I think it's
accurate, though I agree it's a bit obscure.  However (a) users
won't see this term, so it doesn't matter if it would stump astronomers
and (b) it's distinctive enough that people will probably recognise
it as a term they've seen elsewhere when it crops up in a standard.
I'm happy if a better suggestion can be found, but I'm not keen 
on "optimally qualified", for the reasons below.

> May I suggest an alternative term, 'optimally qualified', defined as "the
> optimal level of qualification that makes a name unique in the context within
> which it is being used".
> 
> Taking the GAVO TAP service as an example, there are several tables called
> 'data', so the optimally qualified name would need to include the schema name
> to make them unique.
> 
>     [unqualified name]       [data]
>     [optimally qualified]    [antares10.data]
>     [fully qualified]        [antares10.data]
> 
>     [unqualified name]       [data]
>     [optimally qualified]    [veronqsos.data]
>     [fully qualified]        [veronqsos.data]
> 
> On the other hand, at the moment there is only one table called
> 'unidentified', so the optimally qualified name could/should use the short
> name.
> 
>     [unqualified name]       [unidentified]
>     [optimally qualified]    [unidentified]
>     [fully qualified]        [arigfh.unidentified]

Well, no.  Services have their own rules about table name syntax.
As a matter of fact, DaCHS/GAVO does not permit you to omit the
"<schema>." part of a table name, so even in absence of multiple
tables with the name "*.unidentified", the "optimally qualified"
version you quote above won't work.  Other TAP implementations
(e.g. ones based on Gregory Mantelet's parsers) typically do allow
that shortcut.  TAP/ADQL does not attempt to legislate about this
kind of thing, nor should it.  For that reason, I don't think
that the optimally qualified concept is useful as part of the
standards landscape.
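
To make the context dependence concrete, here is a minimal sketch in
Python of what computing an "optimally qualified" name would involve.
The function name and the table list are hypothetical, made up for
illustration; this is not how DaCHS or any other TAP implementation
derives its names, and a service that requires the "<schema>." prefix
would reject the short form regardless.

```python
from collections import Counter


def shortest_unique_names(tables):
    """Map each (schema, table) pair to the shortest name unique in `tables`."""
    # Count how many schemas contain a table with each unqualified name.
    counts = Counter(table for _schema, table in tables)
    return {
        (schema, table): table if counts[table] == 1 else f"{schema}.{table}"
        for schema, table in tables
    }


# Hypothetical snapshot of the tables exposed by a service.
tables = [
    ("antares10", "data"),
    ("veronqsos", "data"),
    ("arigfh", "unidentified"),
]
before = shortest_unique_names(tables)

# Adding a second 'unidentified' table changes the result for the
# existing one, even though that table itself has not changed.
after = shortest_unique_names(tables + [("newschema", "unidentified")])
```

The name computed for arigfh.unidentified differs between the two
snapshots, so a client that stored the earlier form could end up with
a stale or ambiguous name.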

Instead I believe the canonical name should be opaque (which is my
understanding of the way it works now); it's whatever the service
says it is, with whatever qualifications/delimitations
the implementation sees fit to apply.  The only rule is
that the same name has to be used in all relevant places,
and accepted in queries.
Services *may* at their option also allow different forms of the
name (e.g. with less or more qualification, or even aliases etc)
in submitted queries, though using such forms might fox
TAP_SCHEMA-based ADQL validation.

Given that, I don't think that we should say anything in the
standards about what other forms of the name might look like
or when they can or should be used, so I don't think it's going to be
helpful to define {Un,Optimally ,Fully }qualified names.
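
The one rule I do want, that the same name appears in all relevant
places, can be checked without knowing anything about the name's
internal structure.  A rough sketch in Python, assuming the table
names have already been fetched from each metadata source (the
function name, source labels and sample names are illustrative only):

```python
def inconsistent_names(sources):
    """Return names that do not appear in every metadata source.

    `sources` maps a source label to the set of table names it reports.
    Names are compared as opaque strings: no case folding and no parsing
    of schema qualifiers or delimiters.
    """
    all_names = set().union(*sources.values())
    return {
        name: sorted(label for label, names in sources.items() if name not in names)
        for name in all_names
        if any(name not in names for names in sources.values())
    }


# Hypothetical names as two metadata sources might report them.
sources = {
    "TAP_SCHEMA.tables": {"arigfh.unidentified", "antares10.data"},
    "VOSI /tables": {"unidentified", "antares10.data"},
}
problems = inconsistent_names(sources)
```

Here each form of the "unidentified" table name is flagged as missing
from the other source, which is exactly the kind of mismatch a
canonical name is meant to rule out.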

> However, if GAVO add another schema that contains a table called
> 'unidentified', then the optimally qualified name would have to be updated to
> include the schema name in order to distinguish between them.
> 
>     [unqualified name]       [unidentified]
>     [optimally qualified]    [arigfh.unidentified]
>     [fully qualified]        [arigfh.unidentified]
> 
>     [unqualified name]       [unidentified]
>     [optimally qualified]    [newschema.unidentified]
>     [fully qualified]        [newschema.unidentified]
> 
> I agree with Mark, it is up to the service provider to decide what level of
> qualification is appropriate for the optimally qualified names in their
> service.
> 
> We can stress that in a client-server scenario, the optimally qualified name
> is the name that the client application would normally present to the user,
> and as such it should be as simple as possible while still ensuring that it
> is sufficiently unique to be used as-is in queries and examples without
> requiring the user to add additional qualification, but I don't think we can
> mandate that a service always uses the minimum level of qualification in the
> optimally qualified name.

So I agree with you that the form of the name is not mandated,
but I disagree that there should be an expectation to use any
particular form of the name under "normal" circumstances - I say we
just leave that to the services' implementation requirements and
common sense.

As far as I understand it, I'm not proposing anything new here,
only a clarification of wording in the standards, possibly
backed up by a well-defined term ("canonical [table] name" or
something along those lines).

Mark

> If for example, the GAVO team had plans to add another schema containing a
> table called 'unidentified' in a few months time, then even though the
> unqualified name is currently unique, they may choose to include the schema
> name in the optimally qualified name now, in preparation for the introduction
> of the new schema later.
> 
> I suggest we should define these three terms in the TAP specification and then
> we can refer to them in the TAP, ADQL and VOSI standards.
> 
> Suggested definitions :
> 
>     Unqualified (short name)
> 
>         * The name of a catalog, schema, table or column, without
>           reference to the object's parents in the hierarchy.
>         * An unqualified name MUST NOT include qualifying references
>           to the object's parents, even if it means that the name is
>           not unique in the context within which it is being used.
> 
>     Optimally qualified (optimal name)
> 
>         * The name of a catalog, schema, table or column with an
>           optimal level of qualification sufficient to make it unique
>           in the context within which it is being used.
>         * An optimally qualified name MUST include sufficient
>           qualification to make it unique in the context within which
>           it is being used.
>         * An optimally qualified name MAY include more qualification
>           than required to make it unique in the context within which
>           it is being used.
> 
>     Fully qualified (full name)
> 
>         * The full name of a catalog, schema, table or column,
>           including the names of all the parent objects in the
>           hierarchy.
>         * A fully qualified name MUST include qualifying references
>           to all of the object's parents, even if they are not
>           required to make the name unique in the context within
>           which it is being used.
> 
> What do you think? Would defining these terms be useful?
> 
> Hope this helps,
> Dave
> 
> --------
> Dave Morris
> Software Developer
> Wide Field Astronomy Unit
> Institute for Astronomy
> University of Edinburgh
> --------
> 
> 
> 
> 

--
Mark Taylor   Astronomical Programmer   Physics, Bristol University, UK
m.b.taylor at bris.ac.uk +44-117-9288776  http://www.star.bris.ac.uk/~mbt/

