ADQL XMATCH

Tue Apr 12 19:26:50 CEST 2016

So we are talking about keeping the existing DISTANCE(point, point)
and overloading it with a 4-arg version? I can live with that.

As for the failing queries, yeah: I wouldn't really do that because
there are already countless ways to write incorrect queries I can't
detect...
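
To make the overloading concrete, here is a minimal Python sketch of what
a DISTANCE that accepts either two points or four coordinates would
compute. It is only an illustration under stated assumptions: the
function name distance, the (ra, dec) tuple representation of a point,
degrees as the unit, and the haversine formula are all choices made for
this sketch, not anything fixed by ADQL or by a particular TAP service.

```python
import math


def distance(*args):
    """Angular separation in degrees.

    Hypothetical overload: either two (ra, dec) tuples or four numbers
    (ra1, dec1, ra2, dec2), all in degrees.
    """
    if len(args) == 2:
        (ra1, dec1), (ra2, dec2) = args
    elif len(args) == 4:
        ra1, dec1, ra2, dec2 = args
    else:
        raise TypeError("distance() takes 2 points or 4 coordinates")

    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    # Haversine formula; the min() guards against rounding slightly above 1.
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h))))
```

Note that the four-argument form cannot tell distance(ra, dec, ...) from
distance(dec, ra, ...): both are valid calls that simply return
different numbers, which is the kind of incorrect query that cannot be
detected.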

On 12 April 2016 at 00:23, Markus Demleitner
<msdemlei at ari.uni-heidelberg.de> wrote:
> On Mon, Apr 11, 2016 at 10:57:16AM -0700, Patrick Dowler wrote:
>> We have several catalogues in our TAP service with the coordinates in
>> a column described with xtype="adql:POINT" (let's ignore the details of
>> the adql prefix for now).  If the query on those tables uses that
>> column, the relevant indexing comes into play. It is true that the
>> tables also have separate RA and DEC columns and in principle I could
>> detect DISTANCE(RA, DEC, uploaded.c1, uploaded.c2) and replace RA, DEC
>> with the POS column, but what do I do if:
>>
>> - query refers to the wrong columns in the table (e.g. DISTANCE(foo,
>> bar, uploaded.c1, uploaded.c2))
>> - query just gets them in the wrong order (e.g. DISTANCE(DEC, RA,
>> uploaded.c1, uploaded.c2))
>>
>> I would be inclined to have the job fail rather than run it. It makes
>> me wonder why we would make the user put the two coordinates together
>
> Let me support Mark's dislike for failing executable ADQL queries --
> there's the old saying that Unix doesn't keep you from doing stupid
> things because that would keep you from doing clever things.  I've
> found that to be true on several occasions, and I believe the same
> (should be) true of query languages.
>
> As to point arguments or split arguments: I'm sure most current
> astronomers will ask for the split-argument version, if only because,
> as Mark points out, almost all current tables are written like that.
> I don't see query morphing to be *much* more difficult in either
> direction, so that's probably not a big deal either.
>
> Grammatically, it would not be a big deal to support both; right now,
> the rule for distance is:
>
>  <distance> ::=
>      DISTANCE <left_paren> <coord_value> <comma> <coord_value> <right_paren>
>
> and
>
>   <coord_value> ::= <point> | <column_reference>
>
> If we wrote
>
>   <coord_value> ::= <point> | <column_reference> | <coordinates>
>
> we'd be done (in the unlikely case you don't have the ADQL grammar in
> front of you, <coordinates> is
>
>   <coordinates> ::=  <coordinate1>  <comma>  <coordinate2>).
>
> coord_value is otherwise only used in COORD1 and COORD2, so there are
> few side effects.  Until I've actually written the code, I can't make
> promises, but my feeling is that this should be no big deal in actual
> implementation either.
>
> On Arnold's point that this should have been ANGULAR_DISTANCE: That's
> probably true, and I'm always on the side of long, explicit names
> (Stephen Wolfram has made a very good point for them in his book on
> Mathematica).
>
> But DISTANCE has been chosen in the original design of ADQL, and I'm
> an enemy of changing things that are not technically broken for
> essentially aesthetic reasons.  So, my vote is for keeping DISTANCE.
>
> Cheers,
>
>          Markus
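
The query morphing discussed in the quoted message (turning the
split-argument form into the point form so an indexed POINT column or
expression can be used) could look roughly like the Python sketch below.
It is a rough illustration, not how any existing service does it: the
function name morph_distance is made up, the 'ICRS' coordinate-system
argument to POINT is an assumption, and a regular expression only copes
with simple arguments that contain no parentheses or commas. A real
implementation would work on the parsed query instead.

```python
import re

# One simple argument: no parentheses or commas inside (assumption).
_ARG = r"\s*([^(),]+?)\s*"

# DISTANCE with exactly four simple arguments.
_FOUR_ARG = re.compile(
    r"\bDISTANCE\s*\(" + _ARG + "," + _ARG + "," + _ARG + "," + _ARG + r"\)",
    re.IGNORECASE,
)


def morph_distance(query):
    """Rewrite DISTANCE(a, b, c, d) as DISTANCE(POINT(...), POINT(...)).

    Calls that already use the two-point form are left untouched because
    their arguments contain parentheses and so do not match.
    """
    return _FOUR_ARG.sub(
        r"DISTANCE(POINT('ICRS', \1, \2), POINT('ICRS', \3, \4))", query
    )


if __name__ == "__main__":
    print(morph_distance(
        "SELECT * FROM t WHERE DISTANCE(ra, dec, u.c1, u.c2) < 0.1"
    ))
```

The rewrite is purely textual, so DISTANCE(DEC, RA, ...) is morphed just
as readily as DISTANCE(RA, DEC, ...); nothing here checks that the
columns are the right ones.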

-- 
Patrick Dowler
Canadian Astronomy Data Centre
Victoria, BC, Canada