ADQL DISTANCE argument?
Gregory MANTELET
gregory.mantelet at astro.unistra.fr
Mon Feb 24 11:20:33 CET 2020
Dear DAL,
In order to get ADQL-2.1 *finally* released, I would say that we
keep the version of DISTANCE between two points, rather than between any
two geometries. The latter, as Markus said, would require a more
careful definition of what the distance should be between, for
instance, a polygon and a circle (should it be the distance between their
"centroids" or the closest distance between their boundaries? This is a
debate that, I think, is out of scope for ADQL-2.1).
However, as Pat commented, CENTROID should be allowed as a valid argument
of DISTANCE. It should not cost much to add that to the grammar.
Besides, it would *indirectly* allow computing a distance between two
geometries by writing something like: DISTANCE( CENTROID(POLYGON(....)),
CENTROID(CIRCLE(...)) ).
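
To illustrate how the two candidate definitions can differ, and what the
CENTROID workaround would return, here is a minimal Python sketch for the
simplest case: two circles on the sphere. The function names and the
(lon, lat, radius) tuples in degrees are my own assumptions for
illustration; nothing here comes from the ADQL text.

```python
import math

def angular_distance(lon1, lat1, lon2, lat2):
    """Great-circle distance in degrees between two points given in degrees."""
    lon1, lat1, lon2, lat2 = map(math.radians, (lon1, lat1, lon2, lat2))
    # Haversine form, numerically stable for small separations.
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h))))

def center_distance(c1, c2):
    """Distance between the centers (centroids) of two circles."""
    return angular_distance(c1[0], c1[1], c2[0], c2[1])

def boundary_distance(c1, c2):
    """Closest distance between two circles; 0 if they touch or overlap."""
    return max(0.0, center_distance(c1, c2) - c1[2] - c2[2])
```

For circles (0, 0, 1) and (10, 0, 2), the first definition gives about 10
degrees and the second about 7, so the choice clearly matters.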
About the version of DISTANCE with 4 numeric arguments, I am not
especially for or against it. As Ger pointed out, it is just
syntactic sugar, which, as Markus said, may introduce a bit of
complexity, and so bugs... but I cannot really anticipate which ones.
So, I am fairly neutral on this point.
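
To show what "syntactic sugar" means here, a minimal Python sketch in which
the four-argument form simply delegates to the two-point form. The names
distance_points and distance are hypothetical, points are modelled as
(lon, lat) tuples in degrees, and the haversine formula is my own choice
rather than anything mandated by ADQL.

```python
import math

def distance_points(p1, p2):
    """Angular distance in degrees between two (lon, lat) points in degrees."""
    lon1, lat1, lon2, lat2 = map(math.radians, (*p1, *p2))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h))))

def distance(*args):
    """Mimic both DISTANCE forms: two points, or four numeric values."""
    if len(args) == 2:
        return distance_points(*args)
    if len(args) == 4:
        lon1, lat1, lon2, lat2 = args
        # The four-argument form is just sugar for the two-point form.
        return distance_points((lon1, lat1), (lon2, lat2))
    raise TypeError("DISTANCE takes 2 points or 4 numeric values")
```

Both call styles go through the same code path, so the extra form adds
only the argument-count dispatch as a possible source of bugs.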
To sum up my thoughts:
/(for better readability here, I did not replace the parentheses and commas
with their BNF equivalents)/
---------------------------------------------------------------------
<distance> ::=
    DISTANCE(<coord_value>, <coord_value>)
  | DISTANCE(<numeric_value_expression>, <numeric_value_expression>,
             <numeric_value_expression>, <numeric_value_expression>)

<coord_value> ::= <point_value> | <column_reference>

<point_value> ::= <point> | <centroid>
---------------------------------------------------------------------
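
As a quick sanity check of the rules above, here is a toy Python model of
which argument lists they would accept. The class and function names are
hypothetical, and modelling <numeric_value_expression> as plain Python
numbers is a strong simplification on my side.

```python
from dataclasses import dataclass

@dataclass
class Point:            # stands for <point>, e.g. POINT(lon, lat)
    lon: float
    lat: float

@dataclass
class Centroid:         # stands for <centroid>, e.g. CENTROID(CIRCLE(...))
    geometry: object

@dataclass
class ColumnRef:        # stands for <column_reference>
    name: str

def is_point_value(arg):
    """<point_value> ::= <point> | <centroid>"""
    return isinstance(arg, (Point, Centroid))

def is_coord_value(arg):
    """<coord_value> ::= <point_value> | <column_reference>"""
    return is_point_value(arg) or isinstance(arg, ColumnRef)

def is_valid_distance(args):
    """Accept two <coord_value>s or four numeric values, nothing else."""
    if len(args) == 2:
        return all(is_coord_value(a) for a in args)
    if len(args) == 4:
        return all(isinstance(a, (int, float)) for a in args)
    return False
```

Note that a ColumnRef passes this purely syntactic check whatever its type;
whether the column is actually POINT-typed can only be checked against the
table metadata.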
I am aware that adding <centroid> to <point_value> affects more than
just <distance>, but I looked at the other places where it is used and I
do not see why it would be inappropriate or error-prone there. Just tell
me if it is.
I can start a GitHub PR with these changes and Markus's suggestions, if
you want.
Cheers,
Grégory
On 19/02/2020 16:14, Markus Demleitner wrote:
> Dear DAL,
>
> Current ADQL says:
>
> Functions like AREA, COORD1, COORD2 and DISTANCE accept a
> geometry and return a calculated numeric value.
>
> The specification defines two versions of the DISTANCE function, one
> that accepts two geometries, and one that accepts four separate numeric
> values, both forms return a numeric value.
>
> Both statements would indicate that DISTANCE should accept general
> geometries, i.e., including circles and polygons.
>
> The later definition of DISTANCE then says
>
> The specification defines two versions of the DISTANCE function,
> one that accepts two POINT values, and a second that accepts four
> separate numeric values.
>
> -- which is clear enough, had it not been for the previous statement,
> and the later statement
>
> If the geometric arguments are expressed ...
>
> which might again be understood as saying the arguments can be more
> general.
>
> Finally, the grammar says, for the geometry case:
>
> DISTANCE <left_paren> <coord_value> <comma>
> <coord_value> <right_paren>
>
> where
>
> <coord_value> ::= <point> | <column_reference>
>
> I *think* all this works out to say that over and above the grammar,
> for distance there's the additional constraint that column_reference
> must be POINT-typed.[1]
>
> Being general here is a pain in the neck (actually, that's why I ran
> into this question). For one, you'll need to define distance
> much more carefully for such geometries, and if (as I think we ought
> to) we choose "minimum of distances between all points in arg 1 and
> arg 2", I doubt we'll see many correct implementations of that. Also
> I'll want to map a lot of DISTANCE calls into contains(point,
> circle) statements (because that's much easier on the query planner),
> and that's a pain if one of the points could actually be, say, a
> polygon.
>
> So... do we agree that DISTANCE only accepts POINT-s?
>
> If so, I'd suggest to just drop the sentence:
>
> Functions like AREA, COORD1, COORD2 and DISTANCE accept a
> geometry and return a calculated numeric value.
>
> Then change
>
> The specification defines two versions of the DISTANCE function,
> one that accepts two geometries, and one that accepts four
> separate...
>
> to
>
> This specification defines two versions of the DISTANCE function,
> one that accepts two POINTs, and one that accepts four
> separate...
>
> And then add in 4.2.16 in some appropriate location something like
>
> Note that when <column reference>s[2] are passed into DISTANCE, the
> operation is only defined for POINT-typed values. Behaviour for
> other geometries is undefined at this point (but may be defined
> later).
>
> Would anyone veto a PR to this effect? Would anyone prefer something
> completely different? Would anyone volunteer for doing the PR?
>
> -- Markus
>
> [1] Incidentally, the grammar rules are incompatible with the
> statement in the 4.2.16 that "[t]he DISTANCE function may be applied
> to any expression that returns a geometric POINT value"; I see why it
> was put in, but unless we fix the grammar, we should remove the
> prose.
>
> [2] or <geometry_value_expression>s, depending on how you think about
> [1]