Array element access in ADQL

Mark Taylor M.B.Taylor at bristol.ac.uk
Tue Jul 4 13:43:07 CEST 2017


Dear DAL,

Upcoming releases of the Gaia catalogue will contain some
array-valued columns in the source catalogues, things like
time series, spectra and correlation matrices.
TAP does not prevent array-valued columns, but as far as
I know there is no standard way to access array elements
within ADQL queries.  In DPAC we are considering ways to allow this.

We can define User-Defined Functions for this purpose, and
some experimental functionality along these lines has been
implemented.  But we're interested in input from the IVOA:

   - have other people encountered this and come up with solutions
     that we can copy?

   - should we try to come up with something (a de facto standard)
     that can be used by other services?

   - is there a case for language support for these features in
     a future version of ADQL?

Here is the initial discussion item reported by Alcione Mora from
the DPAC issue tracking system (ref for DPAC insiders: C9GACS-239):

   Experimental support for array functions has been added to Gaia Archive
   v1.3.0. Array types are supported as valid output formats. In addition,
   some user defined functions have been defined for direct manipulation,
   most notably (see Archive help):

   GET_DOUBLE_ARRAY_ELEMENT(array,indexes): Returns the selected element
   from the array of double precision values, where:

      - array [double]:
           Input array
      - indexes [string]:
           String with the selected indexes with the format '[i][j]..'

   The syntax is functional, but should be considered a work in progress
   until DR2.
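
   For illustration only, here is a minimal Python sketch of how an
   index string in the '[i][j]..' format might be parsed and applied
   to a nested array. It is not the Archive implementation. The names
   parse_indexes, get_array_element and get_length are hypothetical,
   and two behaviours are assumptions that the description above does
   not settle: whether indexes start at 0 or 1 (exposed here as a
   `base` parameter, defaulting to 1) and what happens for a null
   array (here: null is returned).

```python
import re

# One or more bracketed non-negative integers, e.g. "[2][1]".
_INDEX_STRING = re.compile(r"(\[\d+\])+")


def parse_indexes(indexes):
    """Turn an index string like '[2][1]' into the list [2, 1]."""
    text = indexes.strip()
    if not _INDEX_STRING.fullmatch(text):
        raise ValueError(f"not in '[i][j]..' format: {indexes!r}")
    return [int(digits) for digits in re.findall(r"\d+", text)]


def get_array_element(array, indexes, base=1):
    """Return the element selected by an index string.

    ASSUMPTION: `base` is the number of the first element (1 here);
    the source does not say whether indexing is 0- or 1-based.
    ASSUMPTION: a null (None) array yields None.
    """
    if array is None:
        return None
    value = array
    for index in parse_indexes(indexes):
        if not isinstance(value, (list, tuple)):
            raise IndexError("more indexes than array dimensions")
        position = index - base
        # Reject out-of-range positions explicitly, so that Python's
        # negative-index wrap-around cannot hide an error.
        if position < 0 or position >= len(value):
            raise IndexError(f"index {index} out of range")
        value = value[position]
    return value


def get_length(array):
    """Length of an array of any element type; None for a null cell."""
    return None if array is None else len(array)
```

   With base=1, get_array_element([[1.0, 2.0], [3.0, 4.0]], '[2][1]')
   selects the first element of the second row.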

   Extra functionality is needed. Some suggestions to discuss include
   the following:

      - get_length(array_column, any type), null for null input cell
      - create_double_array(list_of_columns), autocasting
      - create_int_array(list_of_columns), autocasting
      - Is ADQL function overload supported? I could not find
        any reference in ADQL and TAPRegExt (neither allowing nor
        forbidding). If yes, GET_ARRAY_ELEMENT should be implemented using
        a single function name for all data types transparently to the
        user. If not, consider some of the following six bullet points
      - Similar functions for float, long, int, byte and boolean
      - get_double and get_float should accept all numeric types as input
        (automatic casting)
      - get_long should accept long, int, byte and boolean (1=true, 0=false)
      - get_int should accept int and byte and boolean (1=true, 0=false)
      - get_byte should accept byte and boolean (1=true, 0=false)
      - Alternatively, or as a complement, the CAST ADQL 2.1 optional
        function could be implemented
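
The input-type rules suggested in the list above can be written down
as a small lookup table. The following Python sketch is only one
reading of those suggestions, not an agreed design. The names
ACCEPTED_INPUTS and coerce are hypothetical. Two points are
assumptions, since the list does not address them: "all numeric types"
is taken to exclude boolean, and the boolean getter is taken to accept
boolean input only.

```python
# Target element type -> input element types the getter would accept.
ACCEPTED_INPUTS = {
    # "all numeric types" (ASSUMPTION: boolean is not counted as numeric)
    "double": {"double", "float", "long", "int", "byte"},
    "float": {"double", "float", "long", "int", "byte"},
    "long": {"long", "int", "byte", "boolean"},
    "int": {"int", "byte", "boolean"},
    "byte": {"byte", "boolean"},
    # ASSUMPTION: not stated in the list above
    "boolean": {"boolean"},
}


def coerce(value, source_type, target_type):
    """Cast one element to target_type if the rules above allow it."""
    if source_type not in ACCEPTED_INPUTS[target_type]:
        raise TypeError(f"get_{target_type} does not accept {source_type}")
    if value is None:
        return None  # null stays null
    if target_type in ("double", "float"):
        return float(value)
    if target_type == "boolean":
        return bool(value)
    return int(value)  # booleans map to 1 (true) and 0 (false)
```

If ADQL function overloading turns out to be permitted, a table like
this could sit behind a single GET_ARRAY_ELEMENT name rather than one
function per type.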

If anyone in VO-land has input on this, we'd be pleased to hear it,
so we can do something re-usable or re-used as much as possible.
Otherwise, we'll go ahead and do whatever looks like a best fit
to Gaia requirements.

Thanks

Mark

--
Mark Taylor   Astronomical Programmer   Physics, Bristol University, UK
m.b.taylor at bris.ac.uk +44-117-9288776  http://www.star.bris.ac.uk/~mbt/

