xmin/xmax in postgres-based TAP services?

Grégory Mantelet gmantele at ari.uni-heidelberg.de
Wed Nov 15 14:17:30 CET 2017


Dear Markus,

I fear I currently do the same as you in my TAP Library: I just quote 
table and column names, but I do not do any smart rename.

But considering the issue you raised, I will probably do so now. I 
already offer the possibility to have a different name in the database 
(what I call SQL names) than in the ADQL queries (ADQL names). So that 
should not be that difficult in my case to change behind the scene the 
SQL name of all uploaded tables' columns. It won't affect the ADQL 
queries and should run as expected in Postgres (and other supported 
databases).

I personally don't see any way to standardize column names of uploaded 
tables. Users will surely do as they want and putting more constraints 
on the input is probably not a good idea if we want them to use TAP 
willingly. So that's probably something that TAP servers should deal 
with....though I admit it is not always that straightforward to do.

Cheers,
Grégory


On 11/14/2017 01:19 PM, Markus Demleitner wrote:
> Dear DAL,
> in particular operators of TAP services over postgres,
>
> the mapping between database column names and VOTable column names --
> current spec: "For VOTable output, the name attribute of FIELD
> elements must be the same as the column names (or aliases if
> specified in the query) in the query" -- has all kinds of interesting
> subtleties.  I've discovered a new one the other day, in connection
> with table uploads.
>
> Actually, the upload section doesn't have an analogous regulation
> ("the name attributes of FIELDs in uploaded tables must match the
> column names in the database" -- should it?  of course, there are
> more subtleties in there, too, such as that FIELD/@names can be
> repeated in a single VOTable...).  I guess one still can take that as
> implied.
>
> For names that collide with reserved words, quoting helps, and that's
> what DaCHS does.
>
> However, I recently noticed that there's a couple of names in
> Postgres that are, in a way, hyper-reserved in that the respective
> columns exist in *all* tables and there's nothing you can do about
> it.  In particular, you can't have your own columns with these names.
> That's, specifically, oid, tableoid, xmin, cmin, xmax, cmax, ctid.
>
> In particular xmin, xmax and cmin, cmax have a large potential of
> hurting.  I'm sure there are lots of tables out there that have these
> names -- a quick search in the relational registry turns up 9 "cmin"
> columns in VO-published tables.
>
> The question is: What do we do?
>
> If a TAP service had the right to modify column names in uploaded
> tables, the whole table upload business becomes, I think, unusable.
> So, that's a no-no.
>
> TAP probably cannot (really, usefully) forbid certain column names in
> uploads, and even if we wanted that, TAP obviously shouldn't codify
> Postgres reserved names -- so having a blacklist of certain names
> isn't an escape route either.
>
> DaCHS currently simply fails when you use one of the dirty name, and
> I was pleasantly surprised that it even gives a reasonably
> understandable error message ("column name 'xmax' conflicts with a
> system column name"; I guess it should add what table xmax is in, but
> it's a start).
>
> The question is: Am I invalid with this behaviour?  The current spec
> says "Table upload must support all valid VOTable content", so an
> ill-meaning validator could fail me.  It also says "Services may
> limit the size and number of uploaded tables; if the service refuses
> to accept the entire table it must respond with an error as described
> in section 3.3." -- there's no provision in here letting me refuse a
> table just because I don't like the names in there.
>
> So I guess by the current spec DaCHS *is* in violation of the TAP
> spec.
>
> I suppose I *could* re-write both uploads and queries to
> transparently rename the columns to postgres-palatable names (with
> additional subtleties, because the mangled names might clash with names
> already in the table...).
>
> But is it worth it?  What do other people do?  I'm particularly
> intrigued by the appearance of three "cmin" columns in VizieR, I have
> to say...
>
>        -- Markus




More information about the dal mailing list