Array element access in ADQL
Mark Taylor
M.B.Taylor at bristol.ac.uk
Tue Jul 4 13:43:07 CEST 2017
Dear DAL,
Upcoming releases of the Gaia catalogue will contain some
array-valued columns in the source catalogues, things like
time series, spectra and correlation matrices.
TAP does not prevent array-valued columns, but as far as
I know there is no standard way to access array elements
within ADQL queries. In DPAC we are considering ways to allow this.
We can define User-Defined Functions for this purpose, and
some experimental functionality along these lines has been
implemented. But we're interested in input from the IVOA:
- have other people encountered this and come up with solutions
that we can copy?
- should we try to come up with something (a de facto standard)
that can be used by other services?
- is there a case for language support for these features in
a future version of ADQL?
Here is the initial discussion item reported by Alcione Mora from
the DPAC issue tracking system (ref for DPAC insiders: C9GACS-239):
Experimental support for array functions has been added to Gaia Archive
v1.3.0. Array types are supported as valid output formats. In addition,
some user defined functions have been defined for direct manipulation,
most notably (see Archive help):
GET_DOUBLE_ARRAY_ELEMENT(array,indexes): Returns the selected element
from the array of double precision values, where:
- array [double]:
Input array
- indexes [string]:
String with the selected indexes with the format '[i][j]..'
The syntax is functional, but should be considered a work in progress
until DR2.
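To make the '[i][j]..' index string concrete, here is a minimal Python sketch of how such a lookup might behave. It is not the Gaia Archive implementation: the helper name get_array_element, the 1-based indexing, and the null-in/null-out behaviour are all assumptions made for illustration, since the description above does not specify them.

```python
import re

# Hypothetical helper, not the Gaia Archive code: it only illustrates the
# '[i][j]..' index-string format described above.
_INDEX_STRING = re.compile(r"(?:\[\d+\])+")   # whole string: one or more [n]
_INDEX_TOKEN = re.compile(r"\[(\d+)\]")       # a single [n] token


def get_array_element(array, indexes, base=1):
    """Return the element of a nested list selected by a string like '[i][j]'.

    Assumptions (not stated in the source): indexes are 1-based by default,
    and a null (None) array yields a null result.
    """
    if array is None:
        return None
    if not _INDEX_STRING.fullmatch(indexes):
        raise ValueError(f"malformed index string: {indexes!r}")
    element = array
    for token in _INDEX_TOKEN.findall(indexes):
        position = int(token) - base
        if position < 0:
            raise IndexError(f"index {token} is below the base {base}")
        element = element[position]   # descend one dimension per token
    return element


# Example: pick one entry from a 2x2 correlation matrix.
correlation = [[1.0, 0.2], [0.2, 1.0]]
off_diagonal = get_array_element(correlation, "[1][2]")
```

Passing the indexes as a single string keeps the function signature fixed whatever the array dimensionality, at the cost of parsing and validating the string on every call.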
Extra functionality is needed. Some suggestions to discuss include
the following:
- get_length(array_column, any type), null for null input cell
- create_double_array(list_of_columns), autocasting
- create_int_array(list_of_columns), autocasting
- Is ADQL function overload supported? I could not find
any reference in ADQL and TAPRegExt (neither allowing nor
forbidding). If yes, GET_ARRAY_ELEMENT should be implemented using
a single function name for all data types transparently to the
user. If not, consider some of the following six bullet points:
- Similar functions for float, long, int, byte and boolean
- get_double and get_float should accept all numeric types as input
(automatic casting)
- get_long should accept long, int, byte and boolean (1=true, 0=false)
- get_int should accept int and byte and boolean (1=true, 0=false)
- get_byte should accept byte and boolean (1=true, 0=false)
- Alternatively, or as a complement, the optional ADQL 2.1 CAST
function could be implemented
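The per-type acceptance rules suggested above can be written down as a small table. The following Python sketch is only an illustration of those rules: the names ACCEPTED, accepts and widen are hypothetical, and leaving boolean out of the inputs for get_double and get_float is an assumption, since the suggestion only says "all numeric types".

```python
# Illustrative only: encodes the suggested input types for each getter.
NUMERIC = ("byte", "int", "long", "float", "double")

ACCEPTED = {
    # Assumption: "all numeric types" is read here as excluding boolean.
    "get_double": set(NUMERIC),
    "get_float": set(NUMERIC),
    "get_long": {"long", "int", "byte", "boolean"},
    "get_int": {"int", "byte", "boolean"},
    "get_byte": {"byte", "boolean"},
}


def accepts(getter, element_type):
    """True if the named getter would accept an array of element_type."""
    return element_type in ACCEPTED[getter]


def widen(value):
    """Apply the suggested boolean mapping (1=true, 0=false); pass others through."""
    return int(value) if isinstance(value, bool) else value
```

For example, accepts("get_int", "long") is False because a long value would not fit, while accepts("get_long", "boolean") is True. With function overloading, a single GET_ARRAY_ELEMENT would make this table unnecessary from the user's point of view.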
If anyone in VO-land has input on this, we'd be pleased to hear it,
so we can do something re-usable or re-used as much as possible.
Otherwise, we'll go ahead and do whatever looks like a best fit
to Gaia requirements.
Thanks
Mark
--
Mark Taylor Astronomical Programmer Physics, Bristol University, UK
m.b.taylor at bris.ac.uk +44-117-9288776 http://www.star.bris.ac.uk/~mbt/