Gaia Archive arrays implementation

Juan Carlos Segovia juan.carlos.segovia at sciops.esa.int
Mon Aug 21 08:46:15 CEST 2017


Dear All,

This mail is to explain how arrays are going to be handled in the Gaia 
archive TAP service for DR2.

We may reconsider this approach in the future if TAP finally supports 
this data type but, in our view, the current approach already provides 
the basic support required.

In Gaia we are going to work with arrays of multiple dimensions. It 
means we need something like (VOTable syntax) 'AxBxC...[x*]' (e.g. 
'2x3x*') as the size of an array column.
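
As an illustration only (this is not archive code), a client could parse 
such a size string along the following lines. The helper name 
parse_array_dim is hypothetical, and the sketch assumes only the 
'AxBxC...[x*]' form quoted above, with '*' allowed in the last position.

```python
from typing import Optional, Tuple


def parse_array_dim(spec: str) -> Tuple[Optional[int], ...]:
    """Parse a size string such as '2x3x*' into (2, 3, None).

    None stands for a variable-length ('*') dimension, which is only
    accepted as the last dimension. Hypothetical helper, for illustration.
    """
    parts = spec.strip().split("x")
    dims = []
    for i, part in enumerate(parts):
        is_last = i == len(parts) - 1
        if part == "*":
            if not is_last:
                raise ValueError(f"'*' is only allowed as the last dimension: {spec!r}")
            dims.append(None)
        elif part.isdigit() and int(part) > 0:
            dims.append(int(part))
        else:
            raise ValueError(f"invalid dimension {part!r} in {spec!r}")
    return tuple(dims)


print(parse_array_dim("2x3x*"))  # (2, 3, None)
```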

In the current stable specification (1.0), TAP_SCHEMA.columns.size is an 
integer, which cannot express a multidimensional size of this kind.

Also, for numeric types whose TAP_SCHEMA.columns.size value is > 1, the 
current stable specification assigns the database type VARBINARY.

We tried to remain compatible with the current specification, so we have 
decided to assign the VARBINARY type to all numeric array types. Note 
that once VARBINARY is assigned, you lose the original data type.

In order to remain compatible, we have added two new columns to the 
TAP_SCHEMA.columns table:

array_dim (dimensions)
array_type (numeric data type)

So, for instance, for a 4x5 integer array, we have the following 
TAP_SCHEMA.columns row (only the relevant columns are shown):

datatype: 'VARBINARY'
size: null
array_dim: '4x5'
array_type: 'int'

(There are some problems with chars, so we finally decided to avoid char 
arrays.)

Please find attached a document with a basic mapping for our types. 
Only integer, short, char and unsignedByte are shown, but all numeric 
types (short, int, long, float, double) are handled in the same way. 
Please bear in mind that the last column, database data type, can be 
ignored, as it is an implementation detail.

The implementation we have done is quite similar to the one specified in 
the VOTable 1.1 working draft (main difference: VARBINARY as the data 
type for arrays).

When retrieving data, the VOTable will contain the right array 
specification, as it is extracted from the two extra columns: 
array_dim and array_type.
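
A minimal sketch of that step, assuming that array columns are exactly 
the rows with a non-null array_dim and that array_type and array_dim are 
copied unchanged into the FIELD attributes datatype and arraysize. The 
function name votable_field_attrs is hypothetical; this is not the 
archive code.

```python
from typing import Dict, Optional


def votable_field_attrs(row: Dict[str, Optional[str]]) -> Dict[str, Optional[str]]:
    """Derive VOTable FIELD attributes from a TAP_SCHEMA.columns row.

    Assumption: a non-null array_dim marks an array column, and the two
    extra columns are passed through unchanged.
    """
    if row.get("array_dim"):
        return {"datatype": row["array_type"], "arraysize": row["array_dim"]}
    # Non-array column: the usual TAP-to-VOTable type mapping applies
    # (not shown here), so the declared datatype is returned untouched.
    return {"datatype": row["datatype"]}


# The 4x5 integer array row used as an example in this mail.
row = {"datatype": "VARBINARY", "size": None, "array_dim": "4x5", "array_type": "int"}
print(votable_field_attrs(row))  # {'datatype': 'int', 'arraysize': '4x5'}
```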

In order to access a specific element (and until the ADQL syntax allows 
direct access) we have implemented a database function named 
get_array_double_element (only for doubles), which can be used as follows 
(more information at http://gea.esac.esa.int/archive-help/index.html, 
section "ADQL syntax"):


SELECT get_double_array_element('{{1.0,2.6},{0.8,0.1}}','[1][2]') FROM public.dual
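
To make the meaning of the two arguments concrete, here is a client-side 
sketch in Python. It is not the database function itself. It assumes 
that indices are 1-based, which is what '[1][2]' on a 2x2 array suggests 
(an index of 2 would be out of range if counting started at 0). The name 
double_array_element is hypothetical.

```python
import json
import re


def double_array_element(array_literal: str, index_spec: str) -> float:
    """Return one element of a nested '{...}' array literal.

    array_literal: e.g. '{{1.0,2.6},{0.8,0.1}}'
    index_spec:    e.g. '[1][2]' (assumed 1-based, one index per dimension)
    """
    # Braces become brackets so that the JSON parser handles the nesting.
    nested = json.loads(array_literal.replace("{", "[").replace("}", "]"))
    if not re.fullmatch(r"(\[\d+\])+", index_spec):
        raise ValueError(f"invalid index specification: {index_spec!r}")
    value = nested
    for idx in (int(i) for i in re.findall(r"\[(\d+)\]", index_spec)):
        if not isinstance(value, list) or not 1 <= idx <= len(value):
            raise IndexError(f"index {idx} out of range in {index_spec!r}")
        value = value[idx - 1]  # convert the 1-based index to 0-based
    if isinstance(value, list):
        raise IndexError("index specification does not address a single element")
    return float(value)


# Row 1, column 2 of the 2x2 array, under the 1-based assumption.
print(double_array_element("{{1.0,2.6},{0.8,0.1}}", "[1][2]"))  # 2.6
```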


Best regards,
Juan-Carlos Segovia.

-- 
Serco for ESA - European Space Agency

Juan-Carlos Segovia Serrato

ESAC Science Data Centre
Data and Engineering Division
Operations Department, Directorate of Science
European Space Astronomy Centre (ESAC)
European Space Agency (ESA)

email: juan.carlos.segovia at sciops.esa.int
Phone: (34)-91-8131-175 - 70175 internal
Fax:   (34)-91-8131-308

European Space Astronomy Centre (ESAC)
Camino Bajo del Castillo s/n
Urb. Villafranca del Castillo
28692 Villanueva de la Cañada, Madrid, Spain.


-------------- next part --------------
A non-text attachment was scrubbed...
Name: gaia_archive_array_types.pdf
Type: application/pdf
Size: 41182 bytes
Desc: not available
URL: <http://mail.ivoa.net/pipermail/dal/attachments/20170821/57bfaa16/attachment-0001.pdf>