[TAP] Summary: data type for column metadata

Patrick Dowler patrick.dowler at nrc-cnrc.gc.ca
Wed Apr 15 10:16:26 PDT 2009


I want to summarise where I think we are on the topic of column metadata. 

Column metadata appears in several places: in TAP_SCHEMA and VODataService, 
which provide the metadata needed to understand the content and write queries, 
and in VOTable, where it is needed to understand the results. We can't 
plausibly capture everything in any of these locations.

TAP should not try to capture physics as this is way out of scope. While TAP 
(like VOTable) includes places to put this kind of metadata (utype and ucd) it 
does not specify any usage of those attributes: services or higher level 
service specifications (e.g. the mythical Source Catalogue Service or a TAP-
based SimDB service) would specify how to use/set these attributes.

For all columns, you need to know the name, datatype, and units to write a 
query. This is the basic level of metadata that would enable client tools to 
help users write queries. The use of either numeric or SQL timestamp values 
in RDBMSs means that datatype here has to be the SQL/ADQL datatype, not the 
VOTable datatype. In addition to this, some columns will contain coordinate 
values (position, energy, time, redshift, velocity) and the query writer 
needs to know a bit more about the system to formulate the query. In order to 
have tools be able to help (do the transforms) this needs to be exposed in a 
standard way, rather than only via documentation. The simplest approach we 
came up with is a single string like those in STC Lib. The catch is that STC 
Lib lacks some useful values, such as spectral coordinate type (as in FITS 
WCS) and a simple way to differentiate JD and MJD.
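
As a rough illustration of why a client tool needs this basic level of 
metadata, here is a minimal sketch in Python. The ColumnMeta record, its field 
names, the to_mjd helper, and the use of the plain labels "JD" and "MJD" as 
coordinate system strings are all assumptions made for the example; none of 
them comes from the TAP draft. The only fixed fact used is the standard 
offset MJD = JD - 2400000.5.

```python
from dataclasses import dataclass
from typing import Optional

# Standard offset between Julian Date and Modified Julian Date.
JD_MINUS_MJD = 2400000.5


@dataclass(frozen=True)
class ColumnMeta:
    """Hypothetical per-column record; field names are illustrative only."""
    name: str
    datatype: str                   # SQL/ADQL datatype, e.g. "DOUBLE"
    unit: Optional[str] = None
    coordsys: Optional[str] = None  # single string; "JD"/"MJD" are assumed labels


def to_mjd(value: float, column: ColumnMeta) -> float:
    """Convert a numeric time value to MJD based on the column's declared label."""
    if column.coordsys == "MJD":
        return value
    if column.coordsys == "JD":
        return value - JD_MINUS_MJD
    raise ValueError(f"column {column.name!r} does not declare JD or MJD")


# Hypothetical column holding Julian Dates in days.
obs_start = ColumnMeta(name="obs_start", datatype="DOUBLE", unit="d", coordsys="JD")
print(to_mjd(2451545.0, obs_start))  # 51544.5
```

Without the coordsys string, a tool looking at a DOUBLE column in days could 
not tell which of the two conversions to apply.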

Changes from the current draft: this proposal differs in two ways:

1. use ADQL datatypes rather than VOTable datatypes:

BOOLEAN, SMALLINT, INTEGER, BIGINT, 
REAL, DOUBLE, 
TIMESTAMP,
VARCHAR, VARBINARY, 
POINT, REGION

It is not clear if we need the fixed-size types (char, binary) and to know the 
sizes. It was also not clear if we need to include CLOB and BLOB now. No one 
mentioned any real use/desire to have DATE or TIME.
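
For what it is worth, a service or client could check declared datatypes 
against the list above with something like the following sketch. The names 
PROPOSED_DATATYPES and is_proposed_datatype are made up for the example, and 
the case-insensitive comparison is an assumption, not something the draft 
says.

```python
# Datatype names exactly as listed in the proposal above. The fixed-size
# types, CLOB/BLOB, DATE and TIME are left out because they are undecided.
PROPOSED_DATATYPES = frozenset({
    "BOOLEAN", "SMALLINT", "INTEGER", "BIGINT",
    "REAL", "DOUBLE",
    "TIMESTAMP",
    "VARCHAR", "VARBINARY",
    "POINT", "REGION",
})


def is_proposed_datatype(name: str) -> bool:
    """Return True if name is in the proposed list (case-insensitive by assumption)."""
    return name.strip().upper() in PROPOSED_DATATYPES


print(is_proposed_datatype("double"))  # True
print(is_proposed_datatype("DATE"))    # False
```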
 
2. add a single optional piece of coordinate system metadata, with values 
coming from tables in STC, FITS, and a few defined in TAP directly to fill 
gaps (with the intention to deprecate them when a definitive source is 
written, e.g. the Time WCS paper and/or another revision of STC). Several 
values could be joined with dashes (as in STC Lib); someone could plausibly 
(outside the TAP spec) extend STC Lib with more useful constants so there is a 
nice picklist. The same values would be used in the VOTable output, so this 
approach largely solves result metadata as well.
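
To make the dash-joined idea concrete, here is a small sketch of how a client 
might take such a string apart. The vocabulary table, its category names, and 
the parse_coordsys function are assumptions for illustration only; the real 
picklist would have to come from STC, FITS, and whatever gap fillers TAP ends 
up defining.

```python
# Illustrative vocabulary only; these tokens and categories are assumptions,
# not a definitive list from STC, FITS, or the TAP draft.
VOCAB = {
    "TT": "time scale", "UTC": "time scale",
    "ICRS": "frame", "FK5": "frame",
    "TOPO": "reference position", "GEO": "reference position",
    "JD": "time format", "MJD": "time format",
}


def parse_coordsys(spec: str) -> dict:
    """Split a dash-joined spec and group the known tokens by category."""
    parsed = {}
    for token in spec.split("-"):
        category = VOCAB.get(token)
        if category is None:
            raise ValueError(f"unknown token {token!r} in {spec!r}")
        if category in parsed:
            raise ValueError(f"duplicate {category} in {spec!r}")
        parsed[category] = token
    return parsed


print(parse_coordsys("TT-ICRS-TOPO"))
# {'time scale': 'TT', 'frame': 'ICRS', 'reference position': 'TOPO'}
```

Rejecting unknown tokens keeps the string a picklist rather than free text, 
which is what lets tools do the transforms on the user's behalf.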

PROPOSAL: I would like to put these changes into the current TAP draft, 
obviously still subject to tweaking and fixing. 

Comments? Time is running short... asap please :)

PS: If everyone thought this was fun, the next email is on the more complex 
metadata issues :-)


-- 

Patrick Dowler
Tel/Tél: (250) 363-0044
Canadian Astronomy Data Centre
National Research Council Canada
5071 West Saanich Road
Victoria, BC V9E 2M7

Centre canadien de donnees astronomiques
Conseil national de recherches Canada
5071, chemin West Saanich
Victoria (C.-B.) V9E 2M7


