TAP, ADQL and field names inconsistencies between providers.

cpmjg at tutanota.com cpmjg at tutanota.com
Sat Dec 5 00:59:45 CET 2020


I believe this is the working group for suggesting the standardisation of such things?

The release of GAIA EDR3 has highlighted this with its separate TAP providers with not only more than one field naming scheme but at least three different pedantic syntaxes for the naming of said!

TAP VizieR has different field header names, possibly the nature of VizieR itself causes this, but it is a shame they don't keep the same field heading names as the provider of the table.

Then it is also served by ARI and AIPS TAP services, and it is all ADQL, EXCEPT

one uses

SELECT "gaiaedr3source".ra
FROM "gaiaedr3source"

and all WHERE and similar statements need the same syntax as select,

another uses

SELECT "gaiaedr3source"."ra"
FROM "gaiaedr3source"

again, WHERE etc need the same syntax as SELECT

and yet another uses

SELECT ra
FROM gaiaedr3source

and things like ra < 9 or ORDER BY ra work.

There really needs to be some standardisation on ADQL more than there is, this point seem to get missed.  Sometimes, if there are no examples provided (and if there are examples it is always the thoroughly uninformation "SELECT TOP 10" without specific fields noted in order to figure out the syntax)

Now I'm not a programmer or a technician nor a professional astronomer but the fact that once I manage to get my brain around cobbling some ADQL together that may actually work I can spend ages trying to debug it only to realise after a while it's the field name syntax!  As most error messages unhelpfully say "expecting this that and the other command following )" or something, it is not always clear.

In other words, there is a lack of standard here in the ADQL standard and it needs to be addressed.

I of course prefer

SELECT ra
FROM source
WHERE ra <= 2 AND ra IS NOT NULL

(I'm not sure there's a standard on null either, and it is hard to look up the proper syntax on some sites)

I will likely get told that things like JOINs make that poor code, well it doesn't, the few sites that use this shorter version (eg IRSA IPAC SQL search) where you may want to join you can simply go

SELECT source1.ra, source2.ra, mag, z
FROM source1, source2

as needed, but still only use unique to each table fieldnames without having to put the source in for those.

And whilst I'm having a gripe some of the very most common fieldnames need to be standardised in these tables across all tables, not just different providers of the same tables.

RA_ICRS and RA_J2000 for instance, instead of the amazing number of variations on a theme that something simple like these two can be between different datasets, it'd make porting scripts a heck of a lot easier.  In which case ra should probably be deprecated, or made to default to mean J2000.

Of course there wouldn't be many of these, just the verymost common things like RA and Dec in degrees.

Today, having given up on TAP VizieR as not doubt it was being choked on GAIA EDR3 queries, I decided to try both ARI and AIPS tap servers.  You can look things up and cut and paste the field names for each, and even double click on field names to autopopulate the input dialogue on one of them, but you shouldn't have to.  Typing is often quicker then clicking through long lists.

But I copied and pasted the ADQL from VizieR to the other two.  Well, VizieR used different fieldnames, so in the ARI I renamed them, and that didn't need the quotes nor the proceeding table name for fields, just the fieldname, so I deleted those all one by one, and then I copied that and pasted it into AIPS as that had the same field names, but no, that needed them all to be preceded by the table name again, with a . between the two!  And then it had to be put in quotes!  Both table name and field name, just just table name as in VizieR.

It really looks sloppy and unprofessional for a standard protocol and script for said protocol to have about five different ways to SELECT etc fields!

"table"."field"

"table".field

table.field

field (with the table implicit from the FROM entry but added to the field name when more than one table used and there are common fieldnames in those tables, which I must say I find easiest to follow and understand and most of all read when checking the script!).

and one I can't remember, possibly "field".

Yours

John

John Greaves

Not an IAU member, not an IVOA member, not a scientist, the do make me pay tax though if that helps.



More information about the dal mailing list