IVOA Support Interfaces 1.1

Mon Jun 22 14:50:17 CEST 2015

Hi GWS,

On Fri, 19 Jun 2015, Markus Demleitner wrote:

> Dear GWS,
> 
> On Fri, Jun 19, 2015 at 10:05:59AM +0200, Marco Molinaro wrote:
> > Then, I have no specific preference on the choice. I thought I'd
> > prefer 1, but I'm not sure.
> > I think that the change in the /tables endpoint comes out of TAP
> > revision mainly, and probably is related to whether we want to expose
> > actual tablesets or structured tables inside some hierarchy.
> > My feeling is that the use cases are more for tables collected, less
> > for how they are organized in catalog.schema.* solutions.
> 
> Right -- also note that VOSI tableset is used in many single-table
> services like our S-protocols, and for those schema is particularly
> artificial.
> 
> > So, if we think the model is flawed, probably it's better to fix it.
> > However this means asking providers to change their XML outputs...and
> > that is a major thing...that's why I'm probably ok whatever the
> > majority of involved parties choose.
> 
> I don't think I'd outlaw the schema element; essentially, we'd just
> re-allow having tables directly in the tableset (a bit like it was
> in VODataService 1.0, except avoiding the mistakes made in that
> initial draft).  Doing that also makes a lot of sense since it's
> called tableset and not schemaset :-).  Whatever is legal now will
> remain legal under that scheme.

I think that would be OK.  The VOSITables tableset element could
contain <table> elements either within or outside of <schema> elements,
according to the preference of the service.
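
To make that concrete, here is a rough sketch of how a client might
read such a mixed document.  The sample XML is a simplified stand-in
(no namespaces, names only), and a tableset with bare <table> children
is the relaxation proposed above, not something the current schema
allows; list_tables is a made-up helper name.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified tableset: one schema holding two tables,
# plus one table placed directly under the tableset element.
SAMPLE = """\
<tableset>
  <schema>
    <name>TAP_SCHEMA</name>
    <table><name>TAP_SCHEMA.tables</name></table>
    <table><name>TAP_SCHEMA.columns</name></table>
  </schema>
  <table><name>standalone</name></table>
</tableset>
"""


def list_tables(xml_text):
    """Return (schema_name_or_None, table_name) pairs in document order."""
    root = ET.fromstring(xml_text)
    pairs = []
    for child in root:
        if child.tag == "schema":
            # Tables grouped inside a schema carry that schema's name.
            schema_name = child.findtext("name")
            for table in child.findall("table"):
                pairs.append((schema_name, table.findtext("name")))
        elif child.tag == "table":
            # Ungrouped table: no schema to report.
            pairs.append((None, child.findtext("name")))
    return pairs


for schema_name, table_name in list_tables(SAMPLE):
    print(schema_name, table_name)
```

A client written this way handles either style without needing to
know in advance which one the service chose.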

> Then Kristin said:
> 
> > We have a service with multiple MySQL-databases (schemas), and many tables for
> > each of them. So I would want to have the possibility to just list the schemas
> > (without all the tables) and also filter by schema (listing tables for just
> > one schema).
> 
> I'm still trying to find a good example in which having an extra
> level of indirection actually has an advantage.  It doesn't for sites
> like VizieR (where listing the four schemas is relatively pointless,
> and each schema won't be noticeably smaller than the full thing), and
> it doesn't for sites like us (where we have about as many schemas as
> tables, so /tables/schema wouldn't be much simpler than /tables
> itself if we allow empty table bodies).
> 
> Maybe yours is one -- it sounds as if the cardinality of both /schema
> and each /schema/tables would be comparable, and the (by itself
> desirable) log-decrease in cardinalities actually happens.

I actually think Markus's GAVO DC service is an example where the
schema/table hierarchy is useful rather than otherwise,
though I admit it doesn't make a huge difference in that case.
E.g. wrapping up the TAP_SCHEMA tables in one node is quite tidy.
As you say, there may be other services where this applies more
strongly.

> Another possible extra benefit would be if there's significant
> metadata hanging on the schema that people might want to separately
> retrieve.  In tableset, that's mainly the table descriptions (schema
> description, ucd, utype would probably be available top-level, right?).
> In hierarchical schema browsing, I can see there can be a benefit
> there, but again I feel including this table metadata top-level will
> not significantly compromise scalability.

And I think that's another good argument for keeping schema-level
metadata: it avoids having to repeat schema-level information in
the table description.  I'm not thinking so much of scalability
here as of readability when browsing a hierarchical schema/table
structure.

> That has to be weighed against the certainty that all of <tablename>,
> <schema>.<tablename> and <catalog>.<schema>.<tablename> will have to
> be shoehorned into the /<schema>/<tablename> URL schema.  This
> shoehorning will presumably get worse as other query languages will
> be transported through TAP.

Well yes, the nomenclature is/will be confusing for sure, but I'd
say that the group-of-tables/table split is likely to be a reasonable
abstraction as far as metadata presentation goes.  (And following
your suggestion above about the schema-less option for tableset
documents, services that feel that's too complex can just list
ungrouped tables).

However, the question of the structure of the tableset document
does not have to be reflected in the VOSI endpoint query.
Although I support continuing to have the schema/table hierarchy in
the tableset document, I still think it would be preferable
to allow querying from the /tables endpoint by table name alone,
rather than requiring use of the schema.
I'll repeat my earlier comment on Pat's original proposal:

On Thu, 7 May 2015, Mark Taylor wrote:

> The hierarchical <base_url>/<schema_name>/<table_name> scheme used
> here means that you need to know the schema name for a table in order
> to query the details (e.g. columns) for that table.
> Since, as per the recent table_name discussion on the DAL list,
> the table_name must already be fully qualified, i.e. lives in a flat
> namespace, it's not clear that this is a good idea.
> If you're iteratively querying the /tables endpoint from the top
> down to reach a table of interest, that may not matter,
> since you presumably have the schema/table hierarchy already[*].
> However, if for instance you're trying to parse and validate
> some ADQL from scratch, you may only have the table_name,
> and no indication of what schema it lives in (unless you're allowed
> to pull the table name apart in order to guess, which we have
> established elsewhere is not reliable), so you couldn't use this
> service to find out the table's columns.
>
> That would argue instead for something like
>
>    <base_url>?schema=<schema_name>
>    <base_url>?table=<table_name>
>
> rather than
>
>    <base_url>/<schema_name>
>    <base_url>/<schema_name>/<table_name>
>
> (the detail=<level> parameter can still get appended using an
> ampersand separator in the usual way).
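
For illustration, the difference between the two schemes from the
client's side might look roughly like this.  The function names, the
example base URL and the detail value are all invented for the sketch;
only the schema=, table= and detail= parameter names come from the
discussion above.

```python
from urllib.parse import quote, urlencode


def query_style_url(base_url, table=None, schema=None, detail=None):
    """Build <base_url>?schema=...&table=...&detail=... from what is known."""
    params = {}
    if schema is not None:
        params["schema"] = schema
    if table is not None:
        params["table"] = table
    if detail is not None:
        params["detail"] = detail
    return base_url + ("?" + urlencode(params) if params else "")


def path_style_url(base_url, schema, table=None):
    """Build <base_url>/<schema>[/<table>]; the schema name is mandatory."""
    parts = [base_url.rstrip("/"), quote(schema, safe="")]
    if table is not None:
        parts.append(quote(table, safe=""))
    return "/".join(parts)


BASE = "http://example.org/tap/tables"  # hypothetical endpoint

# With only a table_name in hand (e.g. taken from an ADQL query),
# the query-parameter form can still be constructed:
print(query_style_url(BASE, table="ivoa.obscore", detail="max"))

# The hierarchical form needs the schema name as well:
print(path_style_url(BASE, "ivoa", "ivoa.obscore"))
```

The point is just that path_style_url cannot be called at all without
a schema name, whereas query_style_url works from the table_name alone.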

So to answer Brian's original question:

On Thu, 18 Jun 2015, Brian Major wrote:

> Should the REST endpoint match the, potentially flawed, XML model which
> includes the dbSchema (option 1), or should we try to correct the new
> /tables REST endpoint now and fix the XML model later? (option 2).  (If I've
> misunderstood the arguments please correct me.)

I'd say correct the REST endpoint now, but don't fix the XML model later
(at least, don't fix it much - allow the option of schema-less documents,
but don't outlaw the schema level of description altogether).

Mark

--
Mark Taylor   Astronomical Programmer   Physics, Bristol University, UK
m.b.taylor at bris.ac.uk +44-117-9288776  http://www.star.bris.ac.uk/~mbt/