Arrays in TAP_SCHEMA
Grégory Mantelet
gmantele at ari.uni-heidelberg.de
Wed May 31 16:34:22 CEST 2017
Dear DAL members,
Sorry to come back again to the "array" topic, but I am getting more and
more requests for arrays in my TAP-Library (and, personally, I will also
need them quite soon), and I do not know how to proceed. Even though
nothing formally forbids it, there is currently no way to declare arrays
in TAP_SCHEMA. In practice this prevents the use of arrays, since nobody
can know that a column is an array.
I have searched in TAP-1.0, the coming TAP-1.1 and VODataService in the
hope of finding something leading us toward a solution. Here is what I
found, along with my related questions:
## In TAP 1.0
In REC-TAP-1.0, two columns of TAP_SCHEMA.columns specify the type of a
published column. They are defined as follows:
- datatype - "ADQL datatype as in section 2.5"
- size - "length of variable length datatypes"
With the following additional description:
"Data types and how they map to VOTable datatypes are described in
section 2.5 above. The “size” gives the length of variable length
datatypes, for example varchar(256); this size does not map to the
VOTable arraysize attribute when the latter specifies the size and
shape of a multi-dimensional array."
As written here, "size" is not meant to tell whether the value is a
scalar or an array; it is just the N in CHAR(N), VARCHAR(N), BINARY(N)
and VARBINARY(N).
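To make this reading concrete, here is a minimal Python sketch of how a (datatype, size) pair from TAP_SCHEMA.columns could be rendered. It is my own illustration and not taken from the standard: the function name `adql_type` and the `SIZED_TYPES` set are assumptions built from the four types listed above.

```python
# Hypothetical helper: not part of TAP or of any library.
# The four types below are the ones for which "size" is the N in TYPE(N).
SIZED_TYPES = {"CHAR", "VARCHAR", "BINARY", "VARBINARY"}


def adql_type(datatype, size=None):
    """Render a TAP_SCHEMA.columns (datatype, size) pair as a type declaration."""
    name = datatype.upper()
    if size is not None and name in SIZED_TYPES:
        return f"{name}({size})"
    # For every other type "size" carries no meaning, so it is ignored.
    return name


print(adql_type("varchar", 256))
print(adql_type("DOUBLE", 3))
```

Under this reading a size of 3 on a DOUBLE column is simply dropped, so no combination of the two columns can express "an array of 3 doubles".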
## In TAP 1.1
In WD-TAP-1.1, in addition to the above two columns, "arraysize" has
been added. So the columns describing the datatype are now:
- datatype - ?? (the description disappeared in this WD)
- "size" - ?? (same)
- arraysize - ?? (same)
With the following additional description:
"The arraysize column gives the length of variable length datatypes,
for example varchar(256); this arraysize does not map exactly to the
VOTable arraysize attribute because the latter can specify the size
and shape of a multi-dimensional array as well as the variable size.
[...]
In the next major version of TAP, the "size" column will be removed."
So, even in TAP-1.1 there will be no way to add information about arrays.
==> Furthermore, though I can understand the reason why "size" should be
deprecated (collision with an ADQL reserved keyword... by the way, will
we still have reserved keywords with the PEG grammar for ADQL?), is it
really a good idea to call a column "arraysize" if it is not about an array?
==> And then, why give it the same name as in VOTable if it does not mean
the same thing?
## In VODataService 1.1
In REC-VODataService-1.1 (used to describe published columns in TAP's
entry point '/tables'), the datatype of a column can be expressed using
two kinds of type:
- VOTableType (e.g. <dataType xsi:type="vs:VOTableType" arraysize="*">char</dataType>)
- TAPType (e.g. <dataType xsi:type="vs:TAPType" size="8">CHAR</dataType>)
According to the XML schema of VODataService-1.1, TAPType is the only
one that can have a "size" attribute defined as described in TAP 1.0
(i.e. "The length of the variable-length data type."). Ok, that makes
sense since it is only something coming from TAP.
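As an illustration of the two forms, the following Python sketch reads both example elements with the standard library and lists which attributes each one carries. The `<column>` wrapper holding two `dataType` elements side by side and the `describe` function are hypothetical, used only to parse both examples in one go; a real '/tables' document declares one dataType per column.

```python
import xml.etree.ElementTree as ET

XSI = "http://www.w3.org/2001/XMLSchema-instance"

# Hypothetical fragment: the two dataType examples quoted above,
# wrapped in a single element so that they can be parsed together.
SAMPLE = f"""
<column xmlns:xsi="{XSI}">
  <dataType xsi:type="vs:VOTableType" arraysize="*">char</dataType>
  <dataType xsi:type="vs:TAPType" size="8">CHAR</dataType>
</column>
"""


def describe(xml_text):
    """Return one dict per dataType element with its kind and size attributes."""
    root = ET.fromstring(xml_text)
    result = []
    for dt in root.iter("dataType"):
        result.append({
            "kind": dt.get(f"{{{XSI}}}type"),  # value of xsi:type
            "name": (dt.text or "").strip(),
            "size": dt.get("size"),            # None when absent
            "arraysize": dt.get("arraysize"),  # None when absent
        })
    return result


for entry in describe(SAMPLE):
    print(entry)
```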
==> By the way, is it also planned to deprecate "size" from
VODataService as in TAP-1.1?
However, both VOTableType and TAPType can have an "arraysize" attribute
defined as described in VOTable (i.e. an ArrayShape = " An expression of
a the shape of a multi-dimensional array of the form LxNxM... where each
value between gives the integer length of the array along a dimension.
An asterisk (*) as the last dimension of the shape indicates that the
length of the last axis is variable or undetermined.").
So, here, we have a completely different definition of "arraysize" than
in WD-TAP-1.1.
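To show what the VODataService/VOTable definition implies, here is a small Python sketch that splits an arraysize string into its dimensions. It only handles the form quoted above (integer lengths separated by "x", with an optional "*" as the last dimension); the function name `parse_arraysize` and its return shape are my own assumptions.

```python
def parse_arraysize(arraysize):
    """Split an ArrayShape string such as "3x2x*" into (fixed_dims, variable).

    fixed_dims: list of the integer lengths, in the order written.
    variable:   True when the last dimension is "*".
    Only the form quoted from VODataService is handled here.
    """
    parts = arraysize.strip().split("x")
    variable = parts[-1] == "*"
    if variable:
        parts = parts[:-1]  # drop the "*" and keep only the fixed lengths
    dims = []
    for part in parts:
        if not part.isdigit():
            raise ValueError(f"invalid arraysize: {arraysize!r}")
        dims.append(int(part))
    return dims, variable


print(parse_arraysize("*"))     # one axis of variable length
print(parse_arraysize("3x2"))   # fixed 3 by 2 array
print(parse_arraysize("3x*"))   # 3 along the first axis, last axis variable
print(parse_arraysize("256"))   # one axis of fixed length 256
```

With this definition "256" describes a shape (one axis of length 256), whereas the WD-TAP-1.1 text uses the same column for the N of varchar(N) and explicitly excludes multi-dimensional shapes.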
==> Is there a mistake here? If yes, which standard has to be updated:
VODataService or TAP? And in which direction?
## To conclude,
==> considering these three documents and knowing that TAP-1.1 is still
in WD, how can we declare arrays in TAP_SCHEMA (and /tables result)?
Personally, I would like something consistent, so I would go for
redefining the new column "arraysize" as in VODataService and VOTable.
==> But does it make sense to combine this VOTable piece of information
with the datatypes of TAP (i.e. the so-called TAPType like VARCHAR,
BIGINT, BLOB, ...)? If not, what other alternative(s) do we have?
Cheers,
Grégory