[TAP] Summary: data type for column metadata
Patrick Dowler
patrick.dowler at nrc-cnrc.gc.ca
Wed Apr 15 10:16:26 PDT 2009
I want to summarise where I think we are on the topic of column metadata.
Column metadata appears in several places: in TAP_SCHEMA and VODataService,
which provide the metadata needed to understand the content and write queries,
and in VOTable, where it is needed to understand the results. We can't
plausibly capture everything in any of these locations.
TAP should not try to capture physics, as this is way out of scope. While TAP
(like VOTable) includes places to put this kind of metadata (utype and ucd), it
does not specify any usage of those attributes: services or higher-level
service specifications (e.g. the mythical Source Catalogue Service or a
TAP-based SimDB service) would specify how to use/set these attributes.
For all columns, you need to know the name, datatype, and units to write a
query. This is the basic level of metadata that would enable client tools to
help users write queries. Because RDBMSs may hold time values as either numeric
or SQL timestamp values, the datatype here has to be the SQL/ADQL datatype, not
the VOTable datatype. In addition to this, some columns will contain coordinate
values (position, energy, time, redshift, velocity), and the query writer
needs to know a bit more about the coordinate system to formulate the query.
For tools to be able to help (do the transforms), this needs to be exposed in a
standard way, rather than only via documentation. The simplest approach we came
up with is a single string like those in STC Lib, except that STC Lib lacks
some useful values, such as spectral coordinate type (as in FITS WCS) and a
simple way to differentiate JD and MJD.
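
To make the "tools do the transforms" idea concrete, here is a minimal Python
sketch. It assumes a hypothetical ColumnMeta record and assumes that "JD" and
"MJD" are among the dash-separated labels of the coordinate system string;
neither the record nor those label names are defined by TAP or STC Lib. The
only fixed fact used is the definition MJD = JD - 2400000.5.

```python
from dataclasses import dataclass
from typing import Optional

# By definition, MJD = JD - 2400000.5.
MJD_OFFSET = 2400000.5


@dataclass(frozen=True)
class ColumnMeta:
    """Hypothetical per-column metadata record (not part of any TAP draft)."""
    name: str
    datatype: str                   # SQL/ADQL datatype, e.g. "DOUBLE"
    unit: Optional[str] = None
    coordsys: Optional[str] = None  # assumed single dash-separated string


def to_mjd(value: float, column: ColumnMeta) -> float:
    """Convert a numeric time value to MJD using the column's labels."""
    labels = (column.coordsys or "").split("-")
    if "MJD" in labels:
        return value
    if "JD" in labels:
        return value - MJD_OFFSET
    raise ValueError(f"column {column.name!r} has no JD/MJD label")


obs_time = ColumnMeta(name="obs_time", datatype="DOUBLE", unit="d", coordsys="JD")
print(to_mjd(2451545.0, obs_time))
```

Without the label, a client looking only at a DOUBLE column with unit "d" has
no way to tell which of the two conventions the values follow.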
Changes from the current draft: this proposal differs from it in two ways:
1. use ADQL datatypes rather than VOTable datatypes:
BOOLEAN, SMALLINT, INTEGER, BIGINT,
REAL, DOUBLE,
TIMESTAMP,
VARCHAR, VARBINARY,
POINT, REGION
It is not clear whether we need the fixed-size types (char, binary) or need to
know the sizes. It is also not clear whether we need to include CLOB and BLOB
now. No one mentioned any real use of, or desire for, DATE or TIME.
2. add a single additional, optional piece of metadata: a coordinate system
spec, with values coming from tables in STC and FITS, plus a few defined in TAP
directly to fill gaps (with the intention to deprecate them when a definitive
source is written, e.g. the Time WCS paper and/or another revision of STC).
Several values could be put together with dashes (as in STC Lib); someone could
plausibly (outside the TAP spec) extend STC Lib with more useful constants so
there is a nice picklist. The same string would also be used in the VOTable
output, so this approach largely solves result metadata as well.
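
As a rough illustration of the two changes, the Python sketch below checks a
declared datatype against the list in item 1 and splits a dash-joined
coordinate system string into its parts as in item 2. The function names are
hypothetical, and the example value "UTC-ICRS-TOPO" is only meant to be in the
style of STC Lib identifiers; no picklist is implied.

```python
from typing import List

# The ADQL datatypes listed in item 1 of this proposal.
ADQL_DATATYPES = frozenset({
    "BOOLEAN", "SMALLINT", "INTEGER", "BIGINT",
    "REAL", "DOUBLE",
    "TIMESTAMP",
    "VARCHAR", "VARBINARY",
    "POINT", "REGION",
})


def check_datatype(datatype: str) -> str:
    """Return the upper-cased datatype, or raise if it is not in the list."""
    normalized = datatype.strip().upper()
    if normalized not in ADQL_DATATYPES:
        raise ValueError(f"not a proposed ADQL datatype: {datatype!r}")
    return normalized


def split_coordsys(spec: str) -> List[str]:
    """Split a dash-joined coordinate system string into its values."""
    parts = spec.strip().split("-")
    if any(part == "" for part in parts):
        raise ValueError(f"empty component in coordinate system spec: {spec!r}")
    return parts


print(check_datatype("varchar"))
print(split_coordsys("UTC-ICRS-TOPO"))
```

A real implementation would also check each part against whatever value tables
(STC, FITS, TAP gap-fillers) end up being agreed on; that step is left out here
because those tables are not fixed yet.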
PROPOSAL: I would like to put these changes into the current TAP draft,
obviously still subject to tweaking and fixing.
Comments? Time is running short... asap please :)
PS-If everyone thought this was fun, the next email is on the more complex
metadata issues :-)
--
Patrick Dowler
Tel/Tél: (250) 363-0044
Canadian Astronomy Data Centre
National Research Council Canada
5071 West Saanich Road
Victoria, BC V9E 2M7
Centre canadien de données astronomiques
Conseil national de recherches Canada
5071, chemin West Saanich
Victoria (C.-B.) V9E 2M7