Simplified ADQL

Walter Landry wlandry at caltech.edu
Thu Jun 4 00:52:31 CEST 2015


Hi Everyone,

I would like to float a proposed simplification of ADQL that would
exist alongside the current ADQL 2.0.  The primary motivation comes
from a series of tables (Planck TOI's) that are too large to fit in a
traditional database.  We instead store them in a data store
specialized for geometric searches (tinyhtm).  To query it, we first
select rows using a geometric constraint.  We put those rows into
sqlite, and then run any complex SQL on that sqlite table.

That means that, in order to serve these tables through TAP, we
restrict the ADQL syntax.  Specifically, we do not allow geometric
functions anywhere except in a separate clause in the where clause.

As it turns out, that kind of restricted query is also useful for our
other tables.  We do not implement geometry types in the
database.  Rather, we store an HTM number (similar to a Healpix
number) and x, y, z coordinates in the database.  Then we convert the
geometry expression into one involving involving only HTM, x, y,
and z.  So if we have the geometric constraint

  CONTAINS(POINT('J2000',ra,dec),CIRCLE('J2000',66.76957,26.10453,0.01))=1

we compute (x,y,z) on the unit sphere corresponding to
(66.76957,26.10453).  The radius (0.01) is turned into a distance in 3D

  r=sin(0.01 * pi/180)

Then we rewrite the geometric constraint into something like

  HTM > 100 AND HTM < 300
    AND (table.x-x)*(table.x-x) + (table.y-y)*(table.y-y) + (table.z-z)*(table.z-z) < r*r

I have listed the changes in the ADQL BNF below.  We have implemented
these modifications, so if there is some question as to whether a
particular syntax will work, I can try it out.  This is not yet meant
as a formal proposal.  I am just trying to get an idea of whether it
would be possible to make this kind of update to ADQL.

For those who are wondering, this does not support cross matching.  We
do support cross matching in the backend.  So if there is interest, I
could probably come up with a syntax.  But we have not implemented a
full TAP pipeline through to the backend, so it would be more speculative.

Thanks,
Walter Landry


1) The first table in the top level FROM clause must be a real table.
   You can not immediately jump into a subquery.

2) Change the clause in the top level SELECT statement from

    [ WHERE search_condition ]

   to

    [ WHERE ((geometry [ AND search_condition ])
             | (search_condition [ AND geometry ] ))]

3) Remove the geometry functions from <string_value_function>,
<value_expression>, and <numeric_value_function>.  This changes

  <string_value_function> ::= <string_geometry_function> |
    <user_defined_function>

  <value_expression> ::=
    <numeric_value_expression>
    | <string_value_expression>
    | <geometry_value_expression>

  <numeric_value_function> ::=
    <trig_function>
    | <math_function>
    | <numeric_geometry_function >
    | <user_defined_function>

to

  <string_value_function> ::= <user_defined_function>

  <value_expression> ::=
    <numeric_value_expression>
    | <string_value_expression>

  <numeric_value_function> ::=
    <trig_function>
    | <math_function>
    | <user_defined_function>


4) Change the coordinates to be explicit numbers, never the result of
   an expression.

    <coord> ::= <number> <comma> <number>;

  Where <number> is something like

    <number> ::= [sign] <unsigned_numeric_literal>

  Similarly, make the circle and box sizes all numbers

    <circle> ::= CIRCLE <left_paren> <coord_sys> <comma> <coord> <comma>
                 <number> <right_paren>

    <box> %= BOX <left_paren> <coord_sys> <comma> <coord> <comma>
                 <number> <comma> <number> <right_paren>


More information about the dal mailing list