TAP questions

Markus Demleitner msdemlei at ari.uni-heidelberg.de
Thu Mar 1 10:14:06 CET 2018


Hi László,

On Wed, Feb 28, 2018 at 05:35:47PM +0100, Dobos, László wrote:

> 1. The first question is about how schema and table names should be handled
> by the TAP_SCHEMA view. For instance, the gavo TAP endpoint at

The TAP spec says (TAP 1.1 Draft, p. 24):

  The value of the table_name should be the string that is recommended
  for use in querying the table; it may or may not be qualified by
  schema and catalog name(s) depending on the implementation
  requirements.  [...] If the table name is such that the name must be
  quoted (delimited identifier in ADQL) then the value must include the
  quotes.

(similar, perhaps a bit more ambiguous, language is in TAP 1.0).  So,
the bottom line is: take what's in table_name and don't touch it.
It's the operator's responsibility to get that right.

> the table_name column and schema_name is separate. It is straightforward to
> remove the schema name from the table_name column if it is the same as
> schema_name but it's not so straightforward to compose a query, which, for
> instance, gets the columns of a given table if I only know the schema name
> and the table name separately. Or should I go for compatibility across

That should not happen. I've always lobbied for having table_name in
tap_schema.columns to be an explicit foreign key into
tap_schema.tables, and certainly GloTS handles it like this (because
I'd go crazy otherwise).  The TAP spec, as far as I can see, doesn't
explicitly require it, but if someone uses different strings in
tables.table_name and columns.table_name, they'll not show up in
TOPCAT right now.

Anyway, there's no sane way to discover the columns otherwise.  So:
again, just don't touch the table_name.
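
To illustrate "don't touch the table_name", here is a minimal Python
sketch that uses the string from TAP_SCHEMA.tables verbatim, once as a
lookup key into TAP_SCHEMA.columns and once spliced into a FROM clause.
The function names are made up for this example, and it assumes the
service keeps tables.table_name and columns.table_name identical:

```python
def adql_string_literal(value):
    """Quote a Python string as an ADQL string literal (single quotes doubled)."""
    return "'" + value.replace("'", "''") + "'"


def columns_query(table_name):
    """Build the column-discovery query for one table.

    table_name is used as the lookup key exactly as TAP_SCHEMA.tables
    reported it; only the string-literal quoting is added.
    """
    return ("SELECT column_name, datatype FROM TAP_SCHEMA.columns"
            " WHERE table_name = " + adql_string_literal(table_name))


def sample_query(table_name, limit=10):
    """Build a data query; the name is spliced in verbatim.

    No quotes are added, no schema prefix is added or removed, and the
    case is left alone.
    """
    return f"SELECT TOP {int(limit)} * FROM {table_name}"


print(columns_query("fk6.part1"))
print(sample_query("fk6.part1"))
```

If the operator published a delimited identifier, its quotes are
already part of table_name and come along unchanged.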

> On the other hand the GAVO tap interface at throws an error for quoted table
> names while handles quoted columns correctly:
> 
>  
> 
> SELECT  TOP 10 "raj2000", "dej2000" FROM "fk6"."part1" -- results in error
> "'QuotedName' object has no attribute 'upper'"

Ok, that's a bug, and I'll fix it, but it would still be wrong to
wantonly add quotes.  Delimited identifiers have very funky
properties, and any number of things can go wrong if you assume
fk6="fk6" in a SQL database (starting with the fact that SQL92
requires fk6="FK6" (if anything of that sort) -- but really, there's
no telling).

Rule: *never* convert a SQL regular identifier into a delimited
identifier unless you know what you're doing and why.  Which
essentially is never the case with TAP/ADQL.  Use the form provided
by TAP_SCHEMA (or, hopefully equivalently, /tables).
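
As a rough illustration of why fk6 and "fk6" are not interchangeable,
the following Python sketch models only the SQL92 rule mentioned above
(regular identifiers fold to upper case, delimited ones are taken
literally).  The helper names are hypothetical, and real back ends
deviate from this (PostgreSQL, for one, folds to lower case), which is
exactly why a client shouldn't guess:

```python
def sql92_identifier_key(identifier):
    """Return the name an identifier denotes under SQL92 folding rules."""
    if len(identifier) >= 2 and identifier[0] == '"' and identifier[-1] == '"':
        # Delimited identifier: strip the quotes, un-double embedded
        # quotes, and keep the case exactly as written.
        return identifier[1:-1].replace('""', '"')
    # Regular identifier: SQL92 folds it to upper case.
    return identifier.upper()


def same_identifier(a, b):
    """True if a and b denote the same name under the SQL92 model above."""
    return sql92_identifier_key(a) == sql92_identifier_key(b)


print(same_identifier('fk6', '"FK6"'))
print(same_identifier('fk6', '"fk6"'))
```

Under this model the first comparison holds and the second does not,
so adding quotes to a regular identifier changes which table is meant.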

> 3. My third question is about how to deal with missing or wrong xml
> namespaces. This is often an issue with /capabilities. For example, a number
> of services (at least http://heasarc.gsfc.nasa.gov:80/xamin/vo/tajp) returns
> an xml with the namespace http://www.ivoa.net/xml/TAP/v1.0 which gives me a
> 404. Is it something that's allowed by the standard or I'm supposed to come

While namespace URIs don't need to resolve in general, IVOA ones do,
so if they point nowhere, they're probably wrong.  In this case, it
should be

http://www.ivoa.net/xml/TAPRegExt/v1.0

The standards-compliant way to handle bad namespace URIs is to fail;
as far as XML is concerned the type

{http://www.ivoa.net/xml/TAP/v1.0}TableAccess

(as declared by xamin now) has no relationship whatsoever to the type

{http://www.ivoa.net/xml/TAPRegExt/v1.0}TableAccess

that the standard mentions and that clients should expect.

If I were to write a client, however, I'd follow the golden rule of
interoperability: Be strict in what you generate and lenient in what
you accept.  If what's coming back looks like a TAPRegExt capability,
I'd swallow it.  Leave it to the validators to nitpick.
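
A lenient client along these lines could be sketched in Python as
follows.  This is just one possible approach, not what any particular
client does: it looks only at the local part of xsi:type and ignores
the namespace the prefix is bound to.  The function name is invented
for the example:

```python
import xml.etree.ElementTree as ET

XSI_TYPE = "{http://www.w3.org/2001/XMLSchema-instance}type"


def find_tap_capabilities(capabilities_xml):
    """Return capability elements whose xsi:type looks like TableAccess.

    Deliberately lenient: the prefix in xsi:type is not resolved, so a
    wrong namespace URI does not make the capability disappear.
    """
    root = ET.fromstring(capabilities_xml)
    found = []
    for elem in root.iter():
        # Compare local names only; the element may or may not be qualified.
        if elem.tag.rsplit("}", 1)[-1] != "capability":
            continue
        xsi_type = elem.get(XSI_TYPE, "")
        if xsi_type.rsplit(":", 1)[-1] == "TableAccess":
            found.append(elem)
    return found
```

A validator, of course, should keep resolving the prefix and complain
when the namespace URI is not the TAPRegExt one.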

Incidentally, I'm still not sure why you would want to resolve the
namespace URI.  If you feel you must, I'd argue you have a problem.
The IVOA servers certainly aren't designed to get a couple of hits
for the schemas per TAP request, globally.

> up with a workaround? A similar issue is with the VOTables returned by, for
> example by http://datalab.noao.edu/tap and
> https://heasarc.gsfc.nasa.gov/xamin/vo/tap/ , which lack the default
> namespace:

Don't handle VOTable yourself if you're programming in a halfway
standard language -- use a VOTable library.  It'll probably do the
right [TM] thing (which often includes a bit of fudging).

[though you're right, these two are invalid VOTable; they'd need an
xmlns="http://www.ivoa.net/xml/VOTable/v1.2" in their roots;
and only then will the
xsi:noNamespaceSchemaLocation="xmlns:http://www.ivoa.net/xml/VOTable-1.2.xsd"
that's present do its magic].


And on your other point:

> There's a few services which ping-pong the client between http and https by
> sending a 302 or 303 when the service url is accessed without https but then
> after a POST to /async, the 302 URL is a http://... but sending a GET to it
> redirects further to https://. This sort of breaks client logic because if I
> turn on automatic redirect follow in the http client library then it gets
> redirected even after the first /async POST. But turning on automatic
> redirect follow in a client lib is a dangerous thing anyway, especially if
> it's running in a server environment.
>
> One example to this is the TAP endpoint at
> http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/tap/
>
> I can again come up with workarounds but specifying in the standard that
> redirects should go to the same URL where the POST was sent to would be
> better.

So, the way UWS is supposed to work, you are never expected to
re-POST the parameters.  You POST them, and you get a redirect, but
that you can just GET without any parameters.

That's good, because POST and redirects don't mix well (RFC 2616,
10.3: "The action required MAY be carried out by the user agent
without interaction with the user if and only if the method used in
the second request is GET or HEAD.").

Now, it's conceivable that individual operators mix this with the
(IMHO annoying) practice of redirecting http to https.  In the
presence of a POST, that's a bug and should be reported as such (even
for GETs, I'd say that's fairly odious).
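
The UWS pattern described above (POST the parameters once, then do a
plain GET on whatever the redirect points to) might look like this in
Python.  It is a sketch under assumptions: the function names are
hypothetical, REQUEST/LANG/QUERY are the usual TAP parameters, and
http.client is used because it never follows redirects on its own, so
the client stays in control of what happens after the POST:

```python
import http.client
from urllib.parse import urlencode, urljoin, urlsplit


def job_url_from_redirect(post_url, status, location):
    """Return the job URL announced in the response to the initial POST."""
    if status not in (302, 303) or not location:
        raise ValueError(f"expected a redirect with Location, got {status}")
    # Location may be relative; resolve it against the URL POSTed to.
    return urljoin(post_url, location)


def create_async_job(async_url, adql):
    """POST the query once and return the job URL; nothing is re-POSTed."""
    parts = urlsplit(async_url)
    conn_class = (http.client.HTTPSConnection if parts.scheme == "https"
                  else http.client.HTTPConnection)
    conn = conn_class(parts.netloc)
    try:
        body = urlencode({"REQUEST": "doQuery", "LANG": "ADQL", "QUERY": adql})
        conn.request("POST", parts.path or "/", body=body,
                     headers={"Content-Type": "application/x-www-form-urlencoded"})
        response = conn.getresponse()
        response.read()  # drain the body before closing the connection
        return job_url_from_redirect(async_url, response.status,
                                     response.getheader("Location"))
    finally:
        conn.close()
```

The returned job URL is then fetched with a parameterless GET, for
which following a further http-to-https redirect is harmless.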

       -- Markus


