xmin/xmax in postgres-based TAP services?
Markus Demleitner
msdemlei at ari.uni-heidelberg.de
Tue Nov 14 13:19:55 CET 2017
Dear DAL,
in particular operators of TAP services over postgres,
the mapping between database column names and VOTable column names --
current spec: "For VOTable output, the name attribute of FIELD
elements must be the same as the column names (or aliases if
specified in the query) in the query" -- has all kinds of interesting
subtleties. I've discovered a new one the other day, in connection
with table uploads.
Actually, the upload section doesn't have an analogous regulation
("the name attributes of FIELDs in uploaded tables must match the
column names in the database" -- should it? of course, there are
more subtleties in there, too, such as that FIELD/@names can be
repeated in a single VOTable...). I guess one still can take that as
implied.
For names that collide with reserved words, quoting helps, and that's
what DaCHS does.
However, I recently noticed that there's a couple of names in
Postgres that are, in a way, hyper-reserved in that the respective
columns exist in *all* tables and there's nothing you can do about
it. In particular, you can't have your own columns with these names.
That's, specifically, oid, tableoid, xmin, cmin, xmax, cmax, ctid.
In particular xmin, xmax and cmin, cmax have a large potential of
hurting. I'm sure there are lots of tables out there that have these
names -- a quick search in the relational registry turns up 9 "cmin"
columns in VO-published tables.
The question is: What do we do?
If a TAP service had the right to modify column names in uploaded
tables, the whole table upload business becomes, I think, unusable.
So, that's a no-no.
TAP probably cannot (really, usefully) forbid certain column names in
uploads, and even if we wanted that, TAP obviously shouldn't codify
Postgres reserved names -- so having a blacklist of certain names
isn't an escape route either.
DaCHS currently simply fails when you use one of the dirty name, and
I was pleasantly surprised that it even gives a reasonably
understandable error message ("column name 'xmax' conflicts with a
system column name"; I guess it should add what table xmax is in, but
it's a start).
The question is: Am I invalid with this behaviour? The current spec
says "Table upload must support all valid VOTable content", so an
ill-meaning validator could fail me. It also says "Services may
limit the size and number of uploaded tables; if the service refuses
to accept the entire table it must respond with an error as described
in section 3.3." -- there's no provision in here letting me refuse a
table just because I don't like the names in there.
So I guess by the current spec DaCHS *is* in violation of the TAP
spec.
I suppose I *could* re-write both uploads and queries to
transparently rename the columns to postgres-palatable names (with
additional subtleties, because the mangled names might clash with names
already in the table...).
But is it worth it? What do other people do? I'm particularly
intrigued by the appearance of three "cmin" columns in VizieR, I have
to say...
-- Markus
More information about the dal
mailing list