Simplified ADQL

John Good jcg at ipac.caltech.edu
Thu Jun 4 07:24:17 CEST 2015


This general approach was proposed back in 2001 at the beginning
of the US NVO (by me) and I think it is high time to revisit it.  We
implemented HTM indexing for the 2MASS project back in 1997
with the spatial constraints decoupled from the other relational
constraints (though all done inside the DBMS) and that is still the way
most IRSA/Kepler/KOA/LBTI/LCOGT/etc. queries are done.  In all
that time,  I have yet to hear a user complaint regarding inadequate
sky search functionality.

More recently, we have implemented SIA searches with relational
filtering (for the non-spatial parameters) in much the same way
described here, though using an R-Tree index as that is more
effective for extended objects like images.

In practice, the improvement in flexibility achieved by folding spatial
constraints in with the relational ones is in fact an illusion while the
increased complexity of the coding (and probably query speed
degradation) is very real.

- John


On 6/3/15 3:52 PM, Walter Landry wrote:
> Hi Everyone,
>
> I would like to float a proposed simplification of ADQL that would
> exist alongside the current ADQL 2.0.  The primary motivation comes
> from a series of tables (Planck TOI's) that are too large to fit in a
> traditional database.  We instead store them in a data store
> specialized for geometric searches (tinyhtm).  To query it, we first
> select rows using a geometric constraint.  We put those rows into
> sqlite, and then run any complex SQL on that sqlite table.
>
> That means that, in order to serve these tables through TAP, we
> restrict the ADQL syntax.  Specifically, we do not allow geometric
> functions anywhere except in a separate clause in the where clause.
>
> As it turns out, that kind of restricted query is also useful for our
> other tables.  We do not implement geometry types in the
> database.  Rather, we store an HTM number (similar to a Healpix
> number) and x, y, z coordinates in the database.  Then we convert the
> geometry expression into one involving involving only HTM, x, y,
> and z.  So if we have the geometric constraint
>
>    CONTAINS(POINT('J2000',ra,dec),CIRCLE('J2000',66.76957,26.10453,0.01))=1
>
> we compute (x,y,z) on the unit sphere corresponding to
> (66.76957,26.10453).  The radius (0.01) is turned into a distance in 3D
>
>    r=sin(0.01 * pi/180)
>
> Then we rewrite the geometric constraint into something like
>
>    HTM > 100 AND HTM < 300
>      AND (table.x-x)*(table.x-x) + (table.y-y)*(table.y-y) + (table.z-z)*(table.z-z) < r*r
>
> I have listed the changes in the ADQL BNF below.  We have implemented
> these modifications, so if there is some question as to whether a
> particular syntax will work, I can try it out.  This is not yet meant
> as a formal proposal.  I am just trying to get an idea of whether it
> would be possible to make this kind of update to ADQL.
>
> For those who are wondering, this does not support cross matching.  We
> do support cross matching in the backend.  So if there is interest, I
> could probably come up with a syntax.  But we have not implemented a
> full TAP pipeline through to the backend, so it would be more speculative.
>
> Thanks,
> Walter Landry
>
>
> 1) The first table in the top level FROM clause must be a real table.
>     You can not immediately jump into a subquery.
>
> 2) Change the clause in the top level SELECT statement from
>
>      [ WHERE search_condition ]
>
>     to
>
>      [ WHERE ((geometry [ AND search_condition ])
>               | (search_condition [ AND geometry ] ))]
>
> 3) Remove the geometry functions from <string_value_function>,
> <value_expression>, and <numeric_value_function>.  This changes
>
>    <string_value_function> ::= <string_geometry_function> |
>      <user_defined_function>
>
>    <value_expression> ::=
>      <numeric_value_expression>
>      | <string_value_expression>
>      | <geometry_value_expression>
>
>    <numeric_value_function> ::=
>      <trig_function>
>      | <math_function>
>      | <numeric_geometry_function >
>      | <user_defined_function>
>
> to
>
>    <string_value_function> ::= <user_defined_function>
>
>    <value_expression> ::=
>      <numeric_value_expression>
>      | <string_value_expression>
>
>    <numeric_value_function> ::=
>      <trig_function>
>      | <math_function>
>      | <user_defined_function>
>
>
> 4) Change the coordinates to be explicit numbers, never the result of
>     an expression.
>
>      <coord> ::= <number> <comma> <number>;
>
>    Where <number> is something like
>
>      <number> ::= [sign] <unsigned_numeric_literal>
>
>    Similarly, make the circle and box sizes all numbers
>
>      <circle> ::= CIRCLE <left_paren> <coord_sys> <comma> <coord> <comma>
>                   <number> <right_paren>
>
>      <box> %= BOX <left_paren> <coord_sys> <comma> <coord> <comma>
>                   <number> <comma> <number> <right_paren>



More information about the dal mailing list