ADQL XMATCH
Laurent Michel
laurent.michel at astro.unistra.fr
Wed Feb 10 13:26:31 CET 2016
Hi,
I see several benefits in using
distance(ra1,dec1,ra2,dec2) < something
1) It is intuitive for users
2) The operation actually performed is unambiguous. That would not be the case with a function named Xmatch, for instance, since a
cross match can refer to a wide range of algorithms or processing steps
3) It gets rid of the pseudo-Boolean operator.
4) It is quite flexible: there is no constraint on either the operator or the operand (e.g. < something, > somethingelse, = function(anything)...
are all valid)
5) This function could also express a simple cone search in a readable form,
e.g. distance(ra1,dec1,12.7,-13.8) < 10
However, I think this conflicts with the existing DISTANCE(POINT,POINT) function, since ADQL functions cannot be overloaded. Is that right?
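
To make points 2 and 5 concrete, here is a minimal Python sketch (not ADQL) of what a four-argument distance() could compute and how the same comparison serves both a cone search and a cross match. It rests on several assumptions that are not stated in this thread: all angles are in degrees, the separation is computed with the haversine formula, and the helper names cone_search and crossmatch are purely illustrative. As in the proposal, the function knows nothing about frames; the caller must supply coordinates in the same system.

```python
import math


def distance(ra1, dec1, ra2, dec2):
    """Angular separation in degrees between two positions given in degrees.

    Assumption: all four values are in the same coordinate system; the
    function itself is frame-neutral and does not check this.
    """
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    # Haversine formula (an illustrative choice, stable for small separations).
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    # Clamp to 1.0 to guard against rounding just above the asin domain.
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h))))


def cone_search(rows, ra0, dec0, radius):
    """Hypothetical helper: keep (ra, dec) rows with distance(...) < radius."""
    return [r for r in rows if distance(r[0], r[1], ra0, dec0) < radius]


def crossmatch(table1, table2, radius):
    """Hypothetical helper: naive nested-loop join on distance(...) < radius."""
    return [(a, b) for a in table1 for b in table2
            if distance(a[0], a[1], b[0], b[1]) < radius]


if __name__ == "__main__":
    sources = [(12.7, -13.8), (12.7, -5.0), (100.0, 40.0)]
    # Analogue of: distance(ra1, dec1, 12.7, -13.8) < 10
    print(cone_search(sources, 12.7, -13.8, 10))
```

The nested loop only shows the semantics of the predicate; a real TAP service would be expected to translate the comparison into something that uses a spatial index.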
On 10/02/2016 10:14, Mark Taylor wrote:
> On Tue, 9 Feb 2016, Tom McGlynn (NASA/GSFC Code 660.1) wrote:
>
>> If I understand it, the xmatch function proposed here will return a 1 or 0
>> based purely upon two positions and a radius. Presumably the function returns
>> 1 if the two positions are within the specified radius of each other and 0
>> otherwise, but maybe something else has been discussed. No other information
>> is used in the xmatch.
>
> yes.
>
>> In that context I would prefer to use four real variables so that coordinate
>> system is irrelevant. If there are functions that can create point objects
>> from coordinates or get the coordinates from point objects, then the two
>> approaches are equivalent in the functionality they provide to users, but
>> using reals is simpler to implement since we can simply assume that whatever
>> the coordinate system is, all four of the values are using the same one.
I'm a bit uneasy about this sentence: at the level of the language definition, we cannot "assume that all four values are using
the same coordinate system". Making this assumption true is the responsibility of the query author. The distance() function must
be CooSys-neutral, in the sense that it just computes the distance between the two points given by its parameters, without
considering their frames.
To match two tables with different frames, the ADQL DISTANCE(POINT, POINT) function should indeed be used.
Cheers
Laurent
>
> agree.
>
>> However I think this overall approach is flawed. If we want to create a
>> logical function then we should do that. In our implementations of existing
>> geometry functions at the HEASARC we've found that Postgres query optimizer
>> is confused by this idiom where we use a
>> function() = integer
>> substitution for a logical value. If you really want logical values, then I
>> think we should just implement xmatch that way. Of course given that we've
>> already implemented other functions this way that's probably a boat that's
>> already sailed.
>
> A logical function certainly makes more sense here; 1=XMATCH(..) is
> clunky and unintuitive. However, as I understand it, there is no
> logical type defined in ADQL, so it's not possible to define a new
> function like that without significant changes to the ADQL syntax.
>
>> However if the xmatch function is doing what I indicated above, I think the
>> whole function is superfluous.
>>
>> Rather than
>> (xmatch(ra1,dec1,ra2,dec2,rad) = 1)
>> or if we use a logical value
>> (xmatch(ra1,dec1,ra2,dec2,rad))
>>
>> it seems far more natural to use
>> (distance(ra1,dec1,ra2,dec2) < rad)
>>
>> This is clear and easily implemented. In our experience it can be translated
>> into functions that can take advantage of spatial indices.
>
> From a user point of view, I think that would be absolutely fine;
> in fact as you say better than a dedicated XMATCH function
> because it's more transparent and more flexible. I was under
> the impression that constraints written like that were difficult
> for TAP implementors to use in a way that led to an efficient
> crossmatch, and that the 1=CONTAINS(POINT,CIRCLE) business was
> the recommended way to specify a performant crossmatch in ADQL.
> However, I don't know anywhere that's written down in a standard,
> and maybe I'm just wrong about it. I'm not at all knowledgeable
> about the DB end of this, so I'm largely in the dark about what
> makes sense here from an implementation point of view.
>
> My interest is that I want to be able to write example ADQL
> queries and provide documentation to my ADQL-using users that
> tell them how to perform a spatial crossmatch on the sky,
> without too much ugly syntax.
>
> Mark
>
> --
> Mark Taylor Astronomical Programmer Physics, Bristol University, UK
> m.b.taylor at bris.ac.uk +44-117-9288776 http://www.star.bris.ac.uk/~mbt/
>
--
jesuischarlie
Laurent Michel
SSC XMM-Newton
Tél : +33 (0)3 68 85 24 37
Fax : +33 (0)3 68 85 24 32
laurent.michel at astro.unistra.fr
Université de Strasbourg <http://www.unistra.fr>
Observatoire Astronomique
11 Rue de l'Université
F - 67200 Strasbourg
http://amwdb.u-strasbg.fr/HighEnergy/spip.php?rubrique34