<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body>
Dear DAL,<br>
<br>
To get ADQL-2.1 *finally* released, I would say that we keep the
version of DISTANCE between two points, rather than between any two
geometries. The latter, as Markus said, would require a more careful
definition of what the distance should be between, for instance, a
polygon and a circle (should it be the distance between their
"centroids" or the closest distance between their boundaries?); this
is a debate that, I think, is out of scope for ADQL-2.1.<br>
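<br>
As a rough illustration of why the choice matters, here is a minimal
Python sketch comparing the two candidate definitions for two circles
on the sphere. It is not part of ADQL or of any proposal in this
thread: the function names, the (ra, dec, radius) tuples and the
haversine helper are assumptions made for the example, and all angles
are in degrees.<br>
<br>

```python
import math

def great_circle_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees, using the haversine formula."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    # Clamp to 1.0 to guard against rounding just above the asin domain.
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h))))

def centroid_distance(c1, c2):
    """Candidate definition 1: separation of the two circle centres."""
    return great_circle_deg(c1[0], c1[1], c2[0], c2[1])

def boundary_distance(c1, c2):
    """Candidate definition 2: closest approach of the two circles
    (zero when they touch or overlap)."""
    return max(0.0, centroid_distance(c1, c2) - c1[2] - c2[2])

# Hypothetical circles given as (ra, dec, radius), all in degrees.
a = (10.0, 0.0, 1.0)
b = (14.0, 0.0, 2.0)
print(centroid_distance(a, b), boundary_distance(a, b))
```

For these two circles the first definition gives about 4 degrees and
the second about 1 degree, so a query thresholding on DISTANCE would
behave differently depending on which one a service implements.<br>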
<br>
However, as Pat commented, CENTROID should be allowed as a valid
argument of DISTANCE. It should not cost much to add that to the
grammar. Besides, it would *indirectly* allow computing the distance
between two geometries by writing something like: DISTANCE(
CENTROID(POLYGON(....)), CENTROID(CIRCLE(...)) ).<br>
<br>
As for the version of DISTANCE with 4 numeric arguments, I am neither
particularly for nor against it. As Ger pointed out, it is just
syntactic sugar, which, as Markus said, may introduce a bit of
complexity, and so bugs... but I cannot really anticipate which
ones. So, I am fairly neutral on this point.<br>
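<br>
To make the "syntactic sugar" point concrete, here is a minimal Python
sketch in which the 4-argument form simply builds two points and
delegates to the 2-point form. The names Point, distance_points and
distance_coords are hypothetical, and the great-circle result in
degrees is computed with the haversine formula purely for
illustration; a real service may compute it differently.<br>
<br>

```python
import math
from typing import NamedTuple

class Point(NamedTuple):
    """Hypothetical stand-in for an ADQL POINT; coordinates in degrees."""
    ra: float
    dec: float

def distance_points(p1, p2):
    """2-point form: great-circle separation in degrees (haversine)."""
    ra1, dec1, ra2, dec2 = map(math.radians, (p1.ra, p1.dec, p2.ra, p2.dec))
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h))))

def distance_coords(ra1, dec1, ra2, dec2):
    """4-number form: nothing more than a wrapper around the 2-point form."""
    return distance_points(Point(ra1, dec1), Point(ra2, dec2))

# Both spellings give the same value by construction.
print(distance_coords(0.0, 0.0, 0.0, 90.0))
print(distance_points(Point(0.0, 0.0), Point(0.0, 90.0)))
```

In this sketch the extra form adds one function and one more
argument-count case to handle, which is roughly the kind of added
complexity mentioned above.<br>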
<br>
To sum up my thoughts:<br>
<br>
<i>(for readability, I did not replace the parentheses and commas
with their BNF equivalents)</i><br>
<tt>---------------------------------------------------------------------</tt><br>
<tt><distance> ::=<br>
DISTANCE(<coord_value>, <coord_value>)<br>
 | DISTANCE(<numeric_value_expression>, <numeric_value_expression>,<br>
<numeric_value_expression>, <numeric_value_expression>)<br>
<br>
<coord_value> ::= <point_value> | <column_reference><br>
<br>
<point_value> ::= <point> | <centroid></tt><br>
<tt>---------------------------------------------------------------------</tt><br>
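<br>
The rules above can be read as a simple acceptance check on the kinds
of arguments passed to DISTANCE. The following Python sketch is only
an illustration of that reading: the string labels and the function
name distance_accepts are hypothetical, and it is not a parser for
the actual ADQL grammar.<br>
<br>

```python
# Kinds allowed by the proposed rules, as plain string labels.
POINT_VALUE = {"point", "centroid"}
COORD_VALUE = POINT_VALUE | {"column_reference"}
NUMERIC = "numeric_value_expression"

def distance_accepts(arg_kinds):
    """Return True if the argument kinds match one of the two DISTANCE forms."""
    kinds = list(arg_kinds)
    if len(kinds) == 2:
        # DISTANCE(coord_value, coord_value)
        return all(k in COORD_VALUE for k in kinds)
    if len(kinds) == 4:
        # DISTANCE(numeric, numeric, numeric, numeric)
        return all(k == NUMERIC for k in kinds)
    return False

print(distance_accepts(["centroid", "point"]))   # accepted by the proposal
print(distance_accepts(["polygon", "circle"]))   # rejected: not point values
print(distance_accepts([NUMERIC] * 4))           # accepted: 4-number form
```

Note that this check is purely syntactic: a column_reference is
accepted whatever its type, so the constraint that such a column must
be POINT-typed, as Markus describes, still has to be stated in the
prose.<br>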
<br>
I am aware that adding <centroid> to <point_value> affects more
than just <distance>, but I looked at the other places
where it is used and I do not see why it would be inappropriate or
error-prone there. Just tell me if it is.<br>
<br>
I can open a GitHub PR with these changes and Markus's suggestions,
if you want.<br>
<br>
Cheers,<br>
Grégory<br>
<br>
<br>
<br>
<div class="moz-cite-prefix">On 19/02/2020 16:14, Markus Demleitner
wrote:<br>
</div>
<blockquote type="cite"
cite="mid:20200219151443.wflvadakrkreoqxw@victor">
<pre class="moz-quote-pre" wrap="">Dear DAL,
Current ADQL says:
Functions like AREA, COORD1, COORD2 and DISTANCE accept a
geometry and return a calculated numeric value.
The specification defines two versions of the DISTANCE function, one
that accepts two geometries, and one that accepts four separate numeric
values, both forms return a numeric value.
Both statements would indicate that DISTANCE should accept general
geometries, i.e., including circles and polygons.
The later definition of DISTANCE then says
The specification defines two versions of the DISTANCE function,
one that accepts two POINT values, and a second that accepts four
separate numeric values.
-- which is clear enough, had it not been for the previous statement,
and the later statement
If the geometric arguments are expressed ...
which might again be understood as saying the arguments can be more
general.
Finally, the grammar says, for the geometry case:
DISTANCE <left_paren> <coord_value> <comma>
<coord_value> <right_paren>
where
<coord_value> ::= <point> | <column_reference>
I *think* all this works out to say that over and above the grammar,
for distance there's the additional constraint that column_reference
must be POINT-typed.[1]
Being general here is a pain in the neck (actually, that's why I ran
into this question). For one, you'll need to define distance
much more carefully for such geometries, and if (as I think we ought
to) we chose "minimum of distances of between all points in arg 1 and
arg 2", I doubt we'll see many correct implementations of that. Also
I'll want to map a lot of DISTANCE calls into contains(point,
circle) statements (because that's much easier on the query planner),
and that's a pain if one of the points could actually be, say, a
polygon.
So... do we agree that DISTANCE only accept POINT-s?
If so, I'd suggest to just drop the sentence:
Functions like AREA, COORD1, COORD2 and DISTANCE accept a
geometry and return a calculated numeric value.
Then change
The specification defines two versions of the DISTANCE function,
one that accepts two geometries, and one that accepts four
separate...
to
This specification defines two versions of the DISTANCE function,
one that accepts two POINTs, and one that accepts four
separate...
And then add in 4.2.16 in some appropriate location something like
Note that when <column reference>s[2] are passed into DISTANCE, the
operation is only defined for POINT-typed values. Behaviour for
other geometries is undefined at this point (but may be defined
later).
Would anyone veto a PR to this effect? Would anyone prefer something
completely different? Would anyone volunteer for doing the PR?
-- Markus
[1] Incidentally, the grammar rules are incompatible with the
statement in the 4.2.16 that "[t]he DISTANCE function may be applied
to any expression that returns a geometric POINT value"; I see why it
was put in, but unless we fix the grammar, we should remove the
prose.
[2] or <geometry_value_expression>s, depending on how you think about
[1]
</pre>
</blockquote>
<br>
</body>
</html>