# ADQL 2.1: Preferred crossmatch syntax

Tom McGlynn (NASA/GSFC Code 660.1) tom.mcglynn at nasa.gov
Thu Nov 9 17:51:46 CET 2017

I don't understand DBMS's well enough to comment on what the best syntax is now or in the future.
However I would like to suggest that we may wish to provide more examples using whatever is selected
and that
it might be helpful to compare whatever the Xmatch syntax is to the syntax used to match a table to a
single position.

Specific examples I'd include:
Anticorrelations: where position in the first table is not found in the second.
Three way joins by position
Two way match by position with fixed matching radius
Two way  position with variable matching radius

And as mentioned above I'd include a comparison with a simple table query for a single position as a
comparison since
ideally that would help the reader understant how the xmatch is built.

In general I think more examples is a good thing....

Tom

Mark Taylor wrote:
> On Mon, 6 Nov 2017, Markus Demleitner wrote:
>
>> Hi DAL,
>>
>> On Fri, Nov 03, 2017 at 09:26:59PM +0000, Mark Taylor wrote:
>>> On Fri, 3 Nov 2017, Francois-Xavier PINEAU wrote:
>>>> Although I agree with the approach, there is one point that bothers me.
>>>>
>>>> I know that not everyone agrees on the matter, but I personally
>>>> prefer to declare the cross-match condition in the JOIN (like in
>>>> your InteropOct2015DAL presentation) rather than in the WHERE.
>>>>
>>>> In fact, in relational algebra, I prefer to see the cross-match as a
>>>> theta-join, not as a selection on the result of a cross-product.
>> +1 to FX' proposal; it's a good thing to be explicit on what's a join
>> condition and what's a constraint on the resulting relation.
> Sounds reasonable enough to me, so unless anybody speaks up against,
> let's go with that instead.
>
> So if we drop the business about order of positions in the DISTANCE
> evaluation, and prefer the condition in the JOIN rather than the
> WHERE clause (along with a few minor wording changes that have
> occurred to me in the mean time) we have:
>
>     \subsection{Preferred Crossmatch Syntax}
>
>     An especially common operation that astronomers require when working
>     with source catalogues is the positional sky crossmatch.
>     In its simplest form this is a join between two tables with the
>     requirement that the distance along a great circle between the
>     sky positions of the two associated rows is less than or equal to
>     a given threshold.
>
>     The geometrical functions provided by ADQL offer a number of
>     semantically equivalent ways to specify such a condition in
>     the JOIN or WHERE clause, for instance using various
>     combinations of POINT, CIRCLE and DISTANCE.
>     While a correct implementation will generate the same result for
>     any of these specifications, the performance characteristics may
>     differ dramatically depending on implementation.
>     Given this, it is difficult for (human or machine) ADQL authors
>     to know how to phrase a crossmatch with the expectation that it
>     will be executed efficiently, and difficult for services to know
>     which forms of query to optimise.  The result can be the
>     unnecessarily slow operation of the common sky crossmatch operation.
>
>     The purpose of this section is to propose a preferred form of ADQL
>     to use for sky crossmatches.  Clients posing crossmatch-like
>     queries are advised to phrase them like this rather than semantically
>     equivalent alternatives, and services are encouraged to ensure that
>     this form of join is executed efficiently; this might involve
>     identifying such ADQL input clauses and rewriting them appropriately
>     for efficient processing on the database backend.
>
>     The preferred way to specify a sky position-only crossmatch is:
>     \begin{verbatim}
>        JOIN ... ON DISTANCE(lon1, lat1, lon2, lat2) < r_max_deg
>     \end{verbatim}
>     or equivalently
>     \begin{verbatim}
>        JOIN ... ON DISTANCE(POINT(lon1, lat1), POINT(lon2, lat2)) < r_max_deg
>     \end{verbatim}
>
>     Alternative semantically equivalent forms however MAY still be
>     used by clients, and MUST still be handled correctly by services.
>
> Any new or old objections, please post them.
>
> Mark
>
> --
> Mark Taylor   Astronomical Programmer   Physics, Bristol University, UK
> m.b.taylor at bris.ac.uk +44-117-9288776  http://www.star.bris.ac.uk/~mbt/