ADQL DISTANCE argument?

Markus Demleitner msdemlei at ari.uni-heidelberg.de
Wed Feb 19 16:14:43 CET 2020


Dear DAL,

Current ADQL says:

  Functions like AREA, COORD1, COORD2 and DISTANCE accept a
  geometry and return a calculated numeric value.

  The specification defines two versions of the DISTANCE function, one
  that accepts two geometries, and one that accepts four separate numeric
  values, both forms return a numeric value.

Both statements would indicate that DISTANCE should accept general
geometries, i.e., including circles and polygons.  

The later definition of DISTANCE then says

  The specification defines two versions of the DISTANCE function,
  one that accepts two POINT values, and a second that accepts four
  separate numeric values.

-- which would be clear enough, were it not for the previous statement
and the later statement

  If the geometric arguments are expressed ...

which might again be understood as saying the arguments can be more
general.

Finally, the grammar says, for the geometry case:

  DISTANCE <left_paren> <coord_value> <comma> 
    <coord_value> <right_paren>

where

  <coord_value> ::= <point> | <column_reference>

I *think* all this works out to say that over and above the grammar,
for distance there's the additional constraint that column_reference
must be POINT-typed.[1]  
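
To make the "constraint over and above the grammar" reading concrete, here is a minimal sketch of a semantic check that a parser might run after the grammar has accepted a DISTANCE call.  All names in it (Point, ColumnRef, check_distance_args) are hypothetical and taken from neither the ADQL specification nor any existing library; it assumes the column's declared type is available as a string.

```python
from dataclasses import dataclass


@dataclass
class Point:
    """A literal POINT(lon, lat); hypothetical AST node."""
    lon: float
    lat: float


@dataclass
class ColumnRef:
    """A column reference together with its declared type (assumed known)."""
    name: str
    adql_type: str  # e.g. "POINT", "CIRCLE", "POLYGON"


def check_distance_args(left, right):
    """Raise TypeError unless both DISTANCE arguments are POINT-valued."""
    for arg in (left, right):
        if isinstance(arg, Point):
            continue  # a literal point is always acceptable
        if isinstance(arg, ColumnRef) and arg.adql_type.upper() == "POINT":
            continue  # the grammar allows any column; we add the type constraint
        raise TypeError(
            f"DISTANCE is only defined for POINT arguments, got {arg!r}")
```

The grammar alone would accept a polygon-typed column here; the check is where the extra restriction would have to live.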

Being general here is a pain in the neck (actually, that's why I ran
into this question).  For one, you'll need to define distance
much more carefully for such geometries, and if (as I think we ought
to) we choose "minimum of the distances between all points in arg 1 and
arg 2", I doubt we'll see many correct implementations of that.  Also
I'll want to map a lot of DISTANCE calls into contains(point,
circle) statements (because that's much easier on the query planner),
and that's a pain if one of the points could actually be, say, a
polygon.
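
As an illustration of why the point-only case is easy to map, here is a small sketch, assuming DISTANCE between two points is the great-circle separation in degrees.  The function names are made up for this example, and the haversine formula is just one convenient way to compute the separation, not something the specification prescribes.

```python
import math


def distance_deg(lon1, lat1, lon2, lat2):
    """Great-circle separation in degrees; all inputs in degrees."""
    l1, b1, l2, b2 = map(math.radians, (lon1, lat1, lon2, lat2))
    # Haversine formula; the min() guards against rounding slightly above 1.
    h = (math.sin((b2 - b1) / 2) ** 2
         + math.cos(b1) * math.cos(b2) * math.sin((l2 - l1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h))))


def point_in_circle(lon, lat, center_lon, center_lat, radius_deg):
    """What a contains(point, circle) test amounts to for a point argument."""
    return distance_deg(lon, lat, center_lon, center_lat) <= radius_deg
```

For two points, a cut on the distance and the containment test select the same rows, up to how the boundary is treated.  Once one argument may be a polygon, there is no such one-line equivalence.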

So... do we agree that DISTANCE should only accept POINTs?

If so, I'd suggest just dropping the sentence:

  Functions like AREA, COORD1, COORD2 and DISTANCE accept a
  geometry and return a calculated numeric value.

Then change

  The specification defines two versions of the DISTANCE function,
  one that accepts two geometries, and one that accepts four
  separate...

to

  This specification defines two versions of the DISTANCE function,
  one that accepts two POINTs, and one that accepts four
  separate...

And then add in 4.2.16 in some appropriate location something like

  Note that when <column reference>s[2] are passed into DISTANCE, the
  operation is only defined for POINT-typed values.  Behaviour for
  other geometries is undefined at this point (but may be defined
  later).

Would anyone veto a PR to this effect?  Would anyone prefer something
completely different?  Would anyone volunteer for doing the PR?

           -- Markus

[1] Incidentally, the grammar rules are incompatible with the
statement in 4.2.16 that "[t]he DISTANCE function may be applied
to any expression that returns a geometric POINT value"; I see why it
was put in, but unless we fix the grammar, we should remove the
prose.

[2] or <geometry_value_expression>s, depending on how you think about
[1]

