[TAP] Summary: data type for column metadata

Patrick Dowler patrick.dowler at nrc-cnrc.gc.ca
Thu Apr 16 10:02:12 PDT 2009


On Thursday 16 April 2009 08:21:19 Ray Plante wrote:
> Hi type enthusiasts,
>
> Thanks, Pat, for the summary and calling for questions.  My interest, of
> course, is in adapting VODataService accordingly.
>
> On Wed, 15 Apr 2009, Patrick Dowler wrote:
> > 1. use ADQL datatypes rather than VOTable datatypes:
> >
> > BOOLEAN, SMALLINT, INTEGER, BIGINT,
> > REAL, DOUBLE,
> > TIMESTAMP,
> > VARCHAR, VARBINARY,
> > POINT, REGION
>
> I like this list.  I think we also need to include a recommendation (at
> least) for how these should be mapped to VOTable types.  This is not only
> for consistency in TAP responses, but also for describing a table
> in the registry outside the context of TAP (e.g. describing the table
> returned from an SIA query).  Does this seem reasonable?
>     ADQL type      VOTable
>     BOOLEAN        boolean
>     SMALLINT       short
>     REAL           float
>     DOUBLE         double
>     TIMESTAMP      char arraysize="*", (format?)
>                     (or is it numeric?)

As mentioned by others, the format would be ISO8601; as Gerard mentioned a
while back, we probably need to be specific about which variant of ISO8601 to
use, and it would be nice (although not required) to find a format that most
RDBMSs accept directly.

In my experience with RDBMSs (sybase, DB2, postgresql) the following is 
correctly interpreted by all:

yyyy-MM-dd HH:mm:ss.SSS

Sybase and postgresql are OK with the T separator between the date and the 
time, but DB2 does not like it.
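
To make the format above concrete, here is a minimal Python sketch that writes
and reads timestamps in that form. The function names are made up for
illustration, and it assumes naive datetimes (no time zone) truncated to
milliseconds; it is not proposed text for the spec.

```python
from datetime import datetime

def to_tap_timestamp(dt: datetime) -> str:
    """Format dt as yyyy-MM-dd HH:mm:ss.SSS (space separator, milliseconds)."""
    # timespec="milliseconds" truncates (does not round) the microseconds.
    # Assumes dt is naive; an aware datetime would get a UTC offset appended.
    return dt.isoformat(sep=" ", timespec="milliseconds")

def from_tap_timestamp(text: str) -> datetime:
    """Parse a yyyy-MM-dd HH:mm:ss.SSS string back into a naive datetime."""
    return datetime.strptime(text, "%Y-%m-%d %H:%M:%S.%f")

stamp = to_tap_timestamp(datetime(2009, 4, 16, 10, 2, 12, 345000))
print(stamp)
```

Swapping sep=" " for sep="T" gives the variant with the T separator, which is
the one DB2 rejected in my tests.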

>     VARCHAR        char arraysize="*"
>     VARBINARY      char arraysize="*", format not specified

Format for VARBINARY could be hexadecimal (2 chars per byte), which avoids the
need for terminators or escaping (octal.... ugh).
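
As a rough illustration of the hexadecimal idea, a Python sketch of the round
trip follows. The function names are hypothetical, and the choice of lowercase
digits is an assumption; nothing here has been agreed for TAP.

```python
def varbinary_to_hex(data: bytes) -> str:
    """Encode bytes as hexadecimal text, exactly two characters per byte."""
    return data.hex()

def hex_to_varbinary(text: str) -> bytes:
    """Decode hexadecimal text back into the original bytes."""
    return bytes.fromhex(text)

# Bytes that would need escaping in other encodings: NUL, 0xFF, newline, quote.
raw = bytes([0x00, 0xFF, 0x0A, 0x22])
encoded = varbinary_to_hex(raw)
print(encoded)
```

Because every byte maps to exactly two characters from [0-9a-f], the value
length is always even and no character in the output needs escaping inside a
VOTable cell.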

>     POINT          char arraysize="*", STC/s format
>     REGION         char arraysize="*", STC/s format

Yes, although one can (should? must?) put much of the coordinate system 
content into a separate param (or group), use the ref= attribute of the FIELD 
describing the point/region column, and not repeat it in every row. 

> We could get away with making this mapping only a recommendation if the
> SCHEMA and/or VODataService description indicated the mapping.
>
> > 2. add single additional optional metadata coordinate system spec
>
> I like the idea of relying on the existing solution available in VOTable
> using GROUPs and ucds/utypes.  Of course, it would be good to capture this
> in the registry description via VODataService.  May I propose adapting the
> GROUPs model into VODataService.  Flatter structures makes searching
> easier so it probably would look exactly like it.

It is true that the VOTable needs as much metadata as possible, and in that
sense I think it takes priority over also including that metadata in
TAP_SCHEMA and/or VODataService. If we do not add this extra metadata to
TAP_SCHEMA (and it could be considerable work to define the list of values
and/or the rules for making up your own), then a client application could
still resort to a query with MAXREC=0 to see what the output metadata would
look like. This would be less work now and would not preclude extending
TAP_SCHEMA later on.
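
For illustration, a client could build such a metadata-only request along
these lines. This is only a sketch: the base URL and table name are
hypothetical, and the REQUEST, LANG and QUERY parameter names are assumptions
about the final TAP interface; only MAXREC=0 comes from the discussion here.

```python
from urllib.parse import urlencode

def metadata_only_query_url(base_url: str, adql: str) -> str:
    """Build a query URL that asks for zero rows, so that only the
    VOTable header (FIELDs, PARAMs, GROUPs) comes back."""
    params = {
        "REQUEST": "doQuery",  # assumed parameter name and value
        "LANG": "ADQL",        # assumed parameter name and value
        "QUERY": adql,         # assumed parameter name
        "MAXREC": 0,           # zero rows: output metadata only
    }
    return base_url + "?" + urlencode(params)

# Hypothetical service endpoint and table name.
url = metadata_only_query_url(
    "http://example.org/tap/sync",
    "SELECT * FROM my_schema.my_table",
)
print(url)
```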

This does mean that TAP should specify how the column names in TAP_SCHEMA (and
the query) relate to the FIELDs in the VOTable. Specifically, which attribute
of FIELD holds the (fully-qualified) column name? Otherwise, people could not
put the two together and understand the VOTable fully.

It also means (as mentioned above) that the VOTable should extract the
common coordinate system content from STC/S formatted values, put it in a
PARAM (or GROUP), and refer to it from the FIELD via the ref attribute, as
described in the http://www.ivoa.net/Documents/latest/VOTableSTC.html note.
This is pretty straightforward and seems like the "right" way to do it
anyway.
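
A minimal sketch of that structure, built with Python's standard XML library,
might look like the following. The ID "coosys1", the PARAM and FIELD names,
and the ICRS value are all made up for illustration, and the utype attributes
that the VOTableSTC note defines are deliberately left out rather than guessed.

```python
import xml.etree.ElementTree as ET

table = ET.Element("TABLE")

# Shared coordinate system content, stated once instead of in every row.
# "coosys1" and the PARAM name/value are hypothetical.
group = ET.SubElement(table, "GROUP", {"ID": "coosys1"})
ET.SubElement(group, "PARAM", {
    "name": "frame",
    "datatype": "char",
    "arraysize": "*",
    "value": "ICRS",
})

# The POINT column is a variable-length char FIELD that points at the GROUP
# through its ref attribute.
ET.SubElement(table, "FIELD", {
    "name": "pos",
    "datatype": "char",
    "arraysize": "*",
    "ref": "coosys1",
})

xml_text = ET.tostring(table, encoding="unicode")
print(xml_text)
```

A reader of the table resolves the FIELD's ref to the GROUP with the matching
ID and applies that coordinate system to every value in the column.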

So, we agree that the extra coordinate system metadata is out (at least for 
TAP 1.0) and the caller will have to do a MAXREC=0 query to get this 
information, if it is applicable and available?

Side note: Mark Taylor proposed a format attribute for FIELD where one could 
specify the encoding (ISO8601, hex, STC-S, etc). That would be a nice 
addition...

> I sense some consensus on these two questions so far.  I would like to
> immediately turn this around into a VODataService proposal.  May I enlist
> the respondants to Pat's summary for consultation on this proposal?  (So
> far, this has been Pat, Gerard, Markus, and Mark.)

PS-Sure, I can help; these have to be consistent even if not equivalent.

-- 

Patrick Dowler
Tel/Tél: (250) 363-0044
Canadian Astronomy Data Centre
National Research Council Canada
5071 West Saanich Road
Victoria, BC V9E 2M7

Centre canadien de donnees astronomiques
Conseil national de recherches Canada
5071, chemin West Saanich
Victoria (C.-B.) V9E 2M7


