ADQL XMATCH

Markus Demleitner msdemlei at ari.uni-heidelberg.de
Tue Apr 12 09:23:23 CEST 2016


On Mon, Apr 11, 2016 at 10:57:16AM -0700, Patrick Dowler wrote:
> We have several catalogues in our TAP service with the coordinates in
> a column described with xtype="adql:POINT" (let's ignore the details of
> the adql prefix for now).  If the query on those tables uses that
> column, the relevant indexing comes into play. It is true that the
> tables also have separate RA and DEC columns and in principle I could
> detect DISTANCE(RA, DEC, uploaded.c1, uploaded.c2) and replace RA, DEC
> with the POS column, but what do I do if:
> 
> - query refers to the wrong columns in the table (e.g. DISTANCE(foo,
> bar, uploaded.c1, uploaded.c2))
> - query just gets them in the wrong order (e.g. DISTANCE(DEC, RA,
> uploaded.c1, uploaded.c2))
> 
> I would be inclined to have the job fail rather than run it. It makes
> me wonder why we would make the user put the two coordinates together

Let me support Mark's dislike for failing executable ADQL queries --
there's the old saying that Unix doesn't keep you from doing stupid
things because that would keep you from doing clever things.  I've
found that to be true on several occasions, and I believe the same
is (or should be) true of query languages.

As to point arguments or split arguments: I'm sure most current
astronomers will ask for the split-argument version, if only because,
as Mark points out, almost all current tables are written like that.
I don't expect query morphing to be *much* more difficult in either
direction, so that's probably not a big deal either.
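
To make the morphing in either direction concrete, here is a minimal
sketch in Python.  It is an illustration only, not code from any
existing TAP service: the function names to_point_form and
to_split_form are hypothetical, the 'ICRS' coordinate system default is
an assumption, and the four-argument DISTANCE it emits is the proposed
form under discussion, not something ADQL currently defines.

```python
def to_point_form(coords, coosys="ICRS"):
    """Rewrite four split coordinates as a two-point DISTANCE call.

    coords is a list of four ADQL expressions (lon1, lat1, lon2, lat2).
    The coosys default is an assumption made for this sketch.
    """
    if len(coords) != 4:
        raise ValueError("expected exactly four coordinate expressions")
    lon1, lat1, lon2, lat2 = coords
    return (f"DISTANCE(POINT('{coosys}', {lon1}, {lat1}), "
            f"POINT('{coosys}', {lon2}, {lat2}))")


def to_split_form(points):
    """Rewrite two point expressions as the proposed four-argument form.

    COORD1 and COORD2 are existing ADQL functions; the four-argument
    DISTANCE is the hypothetical extension discussed in this thread.
    """
    if len(points) != 2:
        raise ValueError("expected exactly two point expressions")
    p1, p2 = points
    return (f"DISTANCE(COORD1({p1}), COORD2({p1}), "
            f"COORD1({p2}), COORD2({p2}))")
```

For example, to_point_form(["ra", "dec", "uploaded.c1", "uploaded.c2"])
yields a call on two POINT values, which a service could then match
against an indexed position column if it knows which one corresponds to
ra and dec.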

Grammatically, it would not be a big deal to support both; right now,
the rule for distance is:

  <distance> ::=
      DISTANCE <left_paren> <coord_value> <comma> <coord_value> <right_paren>

and 

  <coord_value> ::= <point> | <column_reference>

If we wrote

  <coord_value> ::= <point> | <column_reference> | <coordinates>

we'd be done (in the unlikely case you don't have the ADQL grammar in
front of you, <coordinates> is

  <coordinates> ::=  <coordinate1>  <comma>  <coordinate2>).

coord_value is otherwise only used in COORD1 and COORD2, so there are
few side effects.  Until I've actually written the code, I can't make
promises, but my feeling is that this should be no big deal in actual
implementation either.
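
As a rough sketch of what accepting the extended rule could look like,
the following Python recognizer sorts the arguments of a DISTANCE call
into <coord_value> alternatives.  It rests on several assumptions:
parse_distance, split_args and classify are hypothetical names; string
literals containing commas or parentheses are not handled; and only the
two-argument and four-argument forms are accepted, with mixed forms
left out of this sketch.

```python
def split_args(text):
    """Split on commas that are not nested inside parentheses."""
    args, depth, start = [], 0, 0
    for i, ch in enumerate(text):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == "," and depth == 0:
            args.append(text[start:i].strip())
            start = i + 1
    args.append(text[start:].strip())
    return args


def classify(arg):
    """Label a single argument as <point> or <column_reference>."""
    if arg.upper().startswith("POINT("):
        return ("point", arg)
    return ("column_reference", arg)


def parse_distance(call):
    """Return the two <coord_value> operands of a DISTANCE call."""
    call = call.strip()
    # Simplification: no whitespace allowed between DISTANCE and "(".
    if not (call.upper().startswith("DISTANCE(") and call.endswith(")")):
        raise ValueError("not a DISTANCE call")
    args = split_args(call[len("DISTANCE("):-1])
    if len(args) == 2:
        # Existing rule: each operand is a point or a column reference.
        return [classify(a) for a in args]
    if len(args) == 4:
        # Proposed rule: each operand is a <coordinates> pair.
        return [("coordinates", args[0], args[1]),
                ("coordinates", args[2], args[3])]
    raise ValueError("unsupported number of arguments: %d" % len(args))
```

Called on "DISTANCE(ra, dec, uploaded.c1, uploaded.c2)", this returns
two ("coordinates", ...) tuples; called on a two-argument form it
returns point or column_reference operands as before.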

On Arnold's point that this should have been ANGULAR_DISTANCE: That's
probably true, and I'm always on the side of long, explicit names
(Stephen Wolfram has made a very good case for them in his book on
Mathematica).

But DISTANCE was chosen in the original design of ADQL, and I'm
an enemy of changing things that are not technically broken for
essentially aesthetic reasons.  So, my vote is for keeping DISTANCE.

Cheers,

         Markus

