Arrays in TAP_SCHEMA

Grégory Mantelet gmantele at ari.uni-heidelberg.de
Wed May 31 16:34:22 CEST 2017


Dear DAL members,

Sorry to come back to the "array" topic again, but I am getting more and 
more requests for array support in my TAP-Library (and, personally, I 
will also need it quite soon), and I do not know how to proceed. Even 
though nothing formally forbids it, there is currently no way to 
declare arrays in TAP_SCHEMA. In practice this prevents the use of 
arrays, since nobody can know that a column is an array.

I have searched in TAP-1.0, the coming TAP-1.1 and in VODataService in 
the hope to find something leading us toward a solution. Here is what I 
found and my related questions:


## In TAP 1.0

In REC-TAP-1.0, two columns of TAP_SCHEMA.columns specify the type of a 
published column. They are defined as follows:
         - datatype - "ADQL datatype as in section 2.5"
         - size     - "length of variable length datatypes"

With the following additional description:

         "Data types and how they map to VOTable datatypes are
          described in section 2.5 above. The “size” gives the length
          of variable length datatypes, for example varchar(256); this
          size does not map to the VOTable arraysize attribute when
          the latter specifies the size and shape of a
          multi-dimensional array."

As written here, "size" is not meant to tell whether the value is a 
scalar or an array; it is just the N in CHAR(N), VARCHAR(N), BINARY(N) 
and VARBINARY(N).
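
To make the point concrete, here is a minimal Python sketch of how a declared type would split into the two TAP 1.0 columns. The function name `split_declared_type` and the regular expression are only illustrative assumptions, not something taken from TAP or from any library.

```python
import re

# Matches e.g. "VARCHAR(256)" or "INTEGER"; the parenthesised length is optional.
_TYPE_RE = re.compile(r"^\s*([A-Za-z_]+)\s*(?:\(\s*(\d+)\s*\))?\s*$")


def split_declared_type(declared):
    """Split a declared type into a (datatype, size) pair.

    Hypothetical helper: "size" is only the N of CHAR(N), VARCHAR(N), ...
    and carries no information about array shape.
    """
    match = _TYPE_RE.match(declared)
    if match is None:
        raise ValueError(f"unrecognised type declaration: {declared!r}")
    name, length = match.groups()
    return name.upper(), (int(length) if length is not None else None)
```

With this, "VARCHAR(256)" gives datatype VARCHAR and size 256, while a type without a length gives a size of None. Nothing in the pair can say "this column holds an array".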


## In TAP 1.1

In WD-TAP-1.1, in addition to the above two columns, "arraysize" has 
been added. So the datatype descriptive columns are now:
         - datatype  - ?? (the description disappeared in this WD)
         - "size"       - ?? (idem)
         - arraysize  - ?? (idem)

With the following additional description:

         "The arraysize column gives the length of variable length
          datatypes, for example varchar(256); this arraysize does not
          map exactly to the VOTable arraysize attribute because the
          latter can specify the size and shape of a multi-dimensional
          array as well as the variable size.
          [...]
          In the next major version of TAP, the "size" column
          will be removed."

So, even in TAP-1.1 there will be no way to add information about arrays.

==> Furthermore, though I can understand why "size" should be 
deprecated (it collides with an ADQL reserved keyword... by the way, 
will we still have reserved keywords with the PEG grammar for ADQL?), 
is it really a good idea to call a column "arraysize" if it is not 
about an array?

==> And then, why give it the same name as in VOTable if it does not do 
the same thing?


## In VODataService 1.1

In REC-VODataService-1.1 (used to describe published columns in TAP's 
entry point '/tables'), the datatype of a column can be expressed using 
one of two kinds of type:
         - VOTableType
           (e.g. <dataType xsi:type="vs:VOTableType" arraysize="*"> char </dataType>)
         - TAPType
           (e.g. <dataType xsi:type="vs:TAPType" size="8" > CHAR </dataType>)
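
As an illustration of how a client might read these two forms, here is a small Python sketch using the standard library's ElementTree. The `<examples>` wrapper element and the function name `describe_datatypes` are hypothetical and exist only so that both `dataType` elements can be parsed together; a real '/tables' document has a different structure around them.

```python
import xml.etree.ElementTree as ET

# Namespace of the xsi:type attribute.
XSI = "http://www.w3.org/2001/XMLSchema-instance"

# Hypothetical wrapper holding the two example elements from the text.
SNIPPET = """
<examples xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <dataType xsi:type="vs:VOTableType" arraysize="*"> char </dataType>
  <dataType xsi:type="vs:TAPType" size="8" > CHAR </dataType>
</examples>
"""


def describe_datatypes(xml_text):
    """Return one dict per <dataType> element found in xml_text."""
    root = ET.fromstring(xml_text)
    described = []
    for elem in root.iter("dataType"):
        described.append({
            "kind": elem.get(f"{{{XSI}}}type"),   # e.g. "vs:TAPType"
            "name": (elem.text or "").strip(),    # e.g. "CHAR"
            "size": elem.get("size"),             # None when absent
            "arraysize": elem.get("arraysize"),   # None when absent
        })
    return described
```

Calling `describe_datatypes(SNIPPET)` yields one entry per element, with `size` set only for the TAPType example and `arraysize` set only for the VOTableType example.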

According to the XML schema of VODataService-1.1, TAPType is the only 
one that can have a "size" attribute defined as described in TAP 1.0 
(i.e. "The length of the variable-length data type."). Ok, that makes 
sense since it is only something coming from TAP.

==> By the way, is it also planned to deprecate "size" from 
VODataService as in TAP-1.1?

However, both VOTableType and TAPType can have an "arraysize" attribute 
defined as described in VOTable (i.e. an ArrayShape = " An expression of 
a the shape of a multi-dimensional array of the form LxNxM... where each 
value between gives the integer length of the array along a dimension. 
An asterisk (*) as the last dimension of the shape indicates that the 
length of the last axis is variable or undetermined.").
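
To show what this definition implies, here is a rough Python sketch that reads such a shape expression. It only covers the form quoted above (integer lengths separated by "x", with an optional "*" as the last dimension); the function name `parse_arraysize` and the choice of `None` for the variable axis are my own assumptions.

```python
def parse_arraysize(arraysize):
    """Parse a shape such as "3x2x*" into a list of axis lengths.

    A variable or undetermined last axis is reported as None.
    Hypothetical helper covering only the LxNxM... form quoted above.
    """
    parts = arraysize.strip().split("x")
    dims = []
    for index, part in enumerate(parts):
        is_last = index == len(parts) - 1
        if part == "*":
            # The asterisk is only meaningful as the last dimension.
            if not is_last:
                raise ValueError(f"'*' must be the last dimension: {arraysize!r}")
            dims.append(None)
        elif part.isascii() and part.isdigit() and int(part) > 0:
            dims.append(int(part))
        else:
            raise ValueError(f"invalid dimension {part!r} in {arraysize!r}")
    return dims
```

For example, "3x2" describes a fixed 3-by-2 array, and "*" alone describes a one-dimensional array of variable length. Neither can be expressed with the "length of variable length datatypes" meaning.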

So, here, we have a completely different definition of "arraysize" than 
in WD-TAP-1.1.

==> Is there a mistake here? If yes, which standard has to be updated: 
VODataService or TAP? And in which direction?


## To conclude,

==> considering these three documents and knowing that TAP-1.1 is still 
in WD, how can we declare arrays in TAP_SCHEMA (and /tables result)?

I personally prefer to have something consistent, so I would go for 
redefining the new "arraysize" column as in VODataService and VOTable.

==> But does it make sense to combine this VOTable piece of information 
with the datatypes of TAP (i.e. the so-called TAPType like VARCHAR, 
BIGINT, BLOB, ...)? If not, what other alternative(s) do we have?

Cheers,
Grégory


