ADQL XMATCH

Arnold Rots arots at cfa.harvard.edu
Mon Apr 11 21:46:21 CEST 2016


On a different, but somewhat related topic:
Assuming that users will not mix equatorial and Galactic coordinates
in the DISTANCE function, there is indeed no need for a coordinate
system. But the implicit assumption is that the function takes spherical
coordinates as input and returns an angular distance.
Would it not be better to be explicit about this and call the function
SPHERICAL_DISTANCE or ANGULAR_DISTANCE? I know, it's a bit
of a mouthful, but ANGDISTANCE or ANGDIST would work, too.
By doing that we are explicit about what it is that we are talking about
and don't allow spherical coordinate systems to lay exclusive claim to
the concept DISTANCE.
And it leaves the door open to later define distances for Cartesian
coordinates or terrestrial coordinates in linear units.
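
To make the distinction concrete, here is a minimal sketch in Python (not
ADQL, and not taken from any standard): the function names
angular_distance and cartesian_distance are purely illustrative, and I
assume inputs in degrees in one common spherical frame for the first, and
plain linear units for the second.

```python
import math


def angular_distance(lon1, lat1, lon2, lat2):
    """Great-circle separation in degrees.

    Assumes both points are given in degrees in the same spherical
    coordinate system (e.g. both equatorial or both Galactic).
    """
    l1, b1, l2, b2 = map(math.radians, (lon1, lat1, lon2, lat2))
    dlon = l2 - l1
    # atan2 form of the great-circle formula; it stays numerically
    # stable for very small and near-antipodal separations.
    num = math.hypot(
        math.cos(b2) * math.sin(dlon),
        math.cos(b1) * math.sin(b2)
        - math.sin(b1) * math.cos(b2) * math.cos(dlon),
    )
    den = (math.sin(b1) * math.sin(b2)
           + math.cos(b1) * math.cos(b2) * math.cos(dlon))
    return math.degrees(math.atan2(num, den))


def cartesian_distance(p, q):
    """Euclidean distance between two points, in the units of the inputs."""
    return math.dist(p, q)
```

The two functions take different kinds of arguments and return different
kinds of quantities (an angle versus a length), which is the reason for
not letting either one own the bare name DISTANCE.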

Cheers,

  - Arnold

-------------------------------------------------------------------------------------------------------------
Arnold H. Rots                                          Chandra X-ray Science Center
Smithsonian Astrophysical Observatory                   tel:  +1 617 496 7701
60 Garden Street, MS 67                                 fax:  +1 617 495 7356
Cambridge, MA 02138
arots at cfa.harvard.edu
USA
http://hea-www.harvard.edu/~arots/
--------------------------------------------------------------------------------------------------------------


On Mon, Apr 11, 2016 at 1:57 PM, Patrick Dowler <pdowler.cadc at gmail.com>
wrote:

> TL;DR - I think that we should redefine all the geometry functions
> without coord sys now and (since overloading seems to be OK) we can
> keep the old deprecated ones if we have to. Then the 2-arg DISTANCE
> function with point args is my preferred solution. I don't see this
> strictly as syntactic sugar to be used instead of a crafty CONTAINS
> (equiv as a predicate) because the user can also add DISTANCE(...) to
> the select list.
>
> Long version:
>
> While I agree that something like DISTANCE is preferable to XMATCH
> because it correctly conveys exactly what is going on, I don't like
> the 4-arg version because it foils implementations that have spherical
> geometry indexing, or at least makes them really messy with new
> failure modes:
>
> We have several catalogues in our TAP service with the coordinates in
> a column described with xtype="adql:POINT" (let's ignore the details of
> the adql prefix for now).  If the query on those tables uses that
> column, the relevant indexing comes into play. It is true that the
> tables also have separate RA and DEC columns and in principle I could
> detect DISTANCE(RA, DEC, uploaded.c1, uploaded.c2) and replace RA, DEC
> with the POS column, but what do I do if:
>
> - query refers to the wrong columns in the table (e.g. DISTANCE(foo,
> bar, uploaded.c1, uploaded.c2))
> - query just gets them in the wrong order (e.g. DISTANCE(DEC, RA,
> uploaded.c1, uploaded.c2))
>
> I would be inclined to have the job fail rather than run it. It makes
> me wonder why we would make the user put the two coordinates together
> (and possibly make mistakes) when I have already put them together
> correctly for them.
>
> On the other hand, if a service has a table with just the RA and DEC
> columns they can still advertise in the TAP_SCHEMA that they have a
> POS column, and then when they see DISTANCE(POS, ...) they can easily
> replace POS with whatever reference to RA and DEC are correct and
> optimal. It is always easier to expand a single symbol into the
> internal implementation than to go the other way. Sure, upload tables
> may have a column with point(s) or separate columns with coordinates,
> so with DISTANCE(<point>, <point>) one would typically write
>
> DISTANCE(POS, POINT(uploaded.c1, uploaded.c2))
>
> A 2-arg DISTANCE function and services declaring a POS column (maybe
> instead of RA and DEC) are adding value and making it easier for the
> user. A 4-arg DISTANCE function makes adding value impossible and
> introduces ways to make essentially incorrect queries (admittedly,
> there are plenty of ways to do that already :-).
>
> So, I am a fan of the 2-arg DISTANCE but not of the { } syntax, which
> strikes me as non-SQL. In PG, for example, you can write geometry in
> internal syntax like that but (i) it has to be a string and (ii) you
> almost always have to provide a cast to get the value you want. Worst
> case is that users have to write DISTANCE(POINT(c1, c2), POINT(c3,
> c4)) if the service/implementation doesn't provide the added value
> necessary.
>
> PS-Yes, I mean exactly that POINT function with 2 args. We already
> realised a long time ago that including the coordinate system in the
> functions was a huge mistake and since then we have been working to
> remove that (e.g. SIA-2.0 and DALI-1.1 do not include it and DALI
> defines point exactly like above, and the next TAP revision will be
> consistent with that). I personally think that we should just redefine
> all the geometry functions without coord sys now and (since
> overloading seems to be OK) we can keep the old deprecated ones if we
> have to.
>
>
>
> --
> Patrick Dowler
> Canadian Astronomy Data Centre
> Victoria, BC, Canada
>

