ADQL DISTANCE argument?
Markus Demleitner
msdemlei at ari.uni-heidelberg.de
Wed Feb 19 16:14:43 CET 2020
Dear DAL,
Current ADQL says:
  Functions like AREA, COORD1, COORD2 and DISTANCE accept a
  geometry and return a calculated numeric value.

  The specification defines two versions of the DISTANCE function, one
  that accepts two geometries, and one that accepts four separate numeric
  values, both forms return a numeric value.
Both statements would indicate that DISTANCE should accept general
geometries, i.e., including circles and polygons.
The later definition of DISTANCE then says
  The specification defines two versions of the DISTANCE function,
  one that accepts two POINT values, and a second that accepts four
  separate numeric values.
-- which is clear enough, had it not been for the previous statement,
and the later statement
  If the geometric arguments are expressed ...
which might again be understood as saying the arguments can be more
general.
Finally, the grammar says, for the geometry case:
  DISTANCE <left_paren> <coord_value> <comma>
    <coord_value> <right_paren>
where
  <coord_value> ::= <point> | <column_reference>
I *think* all this works out to say that over and above the grammar,
for distance there's the additional constraint that column_reference
must be POINT-typed.[1]
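To make that reading concrete, here is a minimal Python sketch of the extra check a validator might apply after the grammar has accepted the call. The function name and the type strings are hypothetical and not taken from any existing ADQL library; the sketch only assumes that each argument's type has already been resolved, for instance from column metadata.

```python
# Hypothetical sketch: type check for the two-argument form of DISTANCE.
# Assumes each argument's type is available as a string such as
# "POINT", "CIRCLE" or "POLYGON"; none of these names come from the spec.

def check_distance_args(arg_types):
    """Raise ValueError unless both arguments are POINT-typed."""
    if len(arg_types) != 2:
        raise ValueError(
            "geometry form of DISTANCE takes exactly two arguments")
    for position, arg_type in enumerate(arg_types, start=1):
        if arg_type.upper() != "POINT":
            raise ValueError(
                "DISTANCE argument %d is %s-typed; only POINT is defined"
                % (position, arg_type))
    return True
```

A column reference declared as a polygon would pass the grammar but be rejected here, which is the "over and above the grammar" part.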
Being general here is a pain in the neck (actually, that's why I ran
into this question). For one, you'll need to define distance
much more carefully for such geometries, and if (as I think we ought
to) we choose "minimum of the distances between all points in arg 1 and
arg 2", I doubt we'll see many correct implementations of that. Also
I'll want to map a lot of DISTANCE calls into contains(point,
circle) statements (because that's much easier on the query planner),
and that's a pain if one of the points could actually be, say, a
polygon.
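As a rough illustration of the kind of rewrite meant here, the following Python sketch builds the CONTAINS condition from the operands of a "DISTANCE(...) < r" comparison. It is a sketch under assumptions: the function name is hypothetical, POINT and CIRCLE are written in the ADQL 2.1 style without a coordinate system argument, and the difference between a strict "<" and a circle that includes its boundary is glossed over.

```python
# Hypothetical sketch: turn a "DISTANCE(p1, p2) < r" condition into a
# CONTAINS(point, circle) condition.  All names here are illustrative.

def distance_to_contains(ra1, dec1, ra2, dec2, radius):
    """Return an ADQL condition standing in for
    DISTANCE(POINT(ra1, dec1), POINT(ra2, dec2)) < radius,
    up to the treatment of points exactly on the circle's edge.

    Arguments are ADQL expressions given as strings (column names or
    literals).  The rewrite only makes sense if both operands are points.
    """
    point = "POINT(%s, %s)" % (ra1, dec1)
    circle = "CIRCLE(%s, %s, %s)" % (ra2, dec2, radius)
    return "1 = CONTAINS(%s, %s)" % (point, circle)


print(distance_to_contains("ra", "dec", "10.5", "-3.2", "0.1"))
# prints: 1 = CONTAINS(POINT(ra, dec), CIRCLE(10.5, -3.2, 0.1))
```

If either operand could be a polygon or a circle, there is no single centre to build the CIRCLE from, and a rewriter would first have to establish the argument types.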
So... do we agree that DISTANCE should only accept POINTs?
If so, I'd suggest just dropping the sentence:
  Functions like AREA, COORD1, COORD2 and DISTANCE accept a
  geometry and return a calculated numeric value.
Then change
  The specification defines two versions of the DISTANCE function,
  one that accepts two geometries, and one that accepts four
  separate...
to
  This specification defines two versions of the DISTANCE function,
  one that accepts two POINTs, and one that accepts four
  separate...
And then add, at some appropriate place in 4.2.16, something like
  Note that when <column_reference>s[2] are passed into DISTANCE, the
  operation is only defined for POINT-typed values. Behaviour for
  other geometries is undefined at this point (but may be defined
  later).
Would anyone veto a PR to this effect? Would anyone prefer something
completely different? Would anyone volunteer to do the PR?
-- Markus
[1] Incidentally, the grammar rules are incompatible with the
statement in 4.2.16 that "[t]he DISTANCE function may be applied
to any expression that returns a geometric POINT value"; I see why it
was put in, but unless we fix the grammar, we should remove the
prose.
[2] or <geometry_value_expression>s, depending on how you think about
[1]