[TAP] data type for column metadata

Douglas Tody dtody at nrao.edu
Tue Mar 31 16:37:08 PDT 2009


Hi -

This looks promising; however, I worry about the proliferation
of metadata items for defining a column.  We are focusing on
coordinate values here, and in that context the proposal looks
reasonable; it could be handy if we were only describing coordinate
items.  But refsystem/refposition/flavor ("flavor" is awful) would only
be used by coordinate-type columns; for many other table columns they
would not be used at all.  It might be preferable to compress all
this information into a single string attribute, as we do with unit,
ucd, utype, etc., perhaps named something like "wcs", "system", "frame",
or whatever.  This approach would also be more easily extensible
without having to change the schema.  It might be possible to go one
step further and have the meaning of this attribute differ depending
upon what type of datum is being described.

Normally I would argue for decomposing such data into parameters, but
in this case we are perhaps building too much detailed data model
stuff into what is a generic container mechanism.  The schema has to
concisely describe any type of column.
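
To make the single-attribute idea concrete, here is a minimal sketch in
Python.  Everything in it is an assumption for illustration only: the
space-separated layout, the "-" placeholder for an item that does not
apply, and the names CoordInfo, pack and unpack are hypothetical and
are not part of any TAP or STC draft.

```python
from typing import NamedTuple, Optional


class CoordInfo(NamedTuple):
    """The three proposed items, any of which may be absent (None)."""
    refsystem: Optional[str]
    refposition: Optional[str]
    flavor: Optional[str]


def pack(info: CoordInfo) -> str:
    """Join the items into one string; '-' marks an item that does not apply."""
    return " ".join(v if v is not None else "-" for v in info)


def unpack(text: str) -> CoordInfo:
    """Split the string back into items, padding missing trailing items with None."""
    parts = [None if p == "-" else p for p in text.split()]
    parts += [None] * (3 - len(parts))
    return CoordInfo(*parts[:3])


# Values taken from the examples in the quoted message below.
print(pack(CoordInfo("ICRS", "TOPOCENTER", "UNITSPHERE")))
print(unpack("UTC - MJD"))
```

With something like this the schema carries one optional string per
column, and a non-coordinate column simply omits it.  The cost is that
clients have to agree on how to parse the string.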

> PS-Of course, most of this metadata applies to ADQL query writing; in
> PQL the param specification imposes limitations on values and systems
> and that does require that services implement some transformations
> (if their content is not in the standard system).

The point here is to describe the table columns so that the client
or user knows at least the basics of what they are dealing with.
As you say, for the POS and REGION constructs in parameter queries
it does not matter, since they hide this information from the client
and either constrain the values or allow the client to specify the
reference frame.  However, it still matters for the WHERE clause in a
parameter query, much as for ADQL, and in both cases the client or user
needs to know how to interpret what comes back.  It would be nice if
we could provide basic information without the user having to go read
a paper.
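
As a rough illustration of what "the basics" could look like on the
client side, here is a small Python sketch that turns column metadata
into a one-line summary.  The Column class and the describe function
are hypothetical names of my own; the field names and example values
follow the quoted proposal below, with only name and datatype required.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Column:
    """Column metadata; only name and datatype are mandatory."""
    name: str
    datatype: str
    units: Optional[str] = None
    refsystem: Optional[str] = None
    refposition: Optional[str] = None
    flavor: Optional[str] = None


def describe(col: Column) -> str:
    """Build a short summary, skipping any item the service did not supply."""
    parts = [f"{col.name} ({col.datatype})"]
    for label in ("units", "refsystem", "refposition", "flavor"):
        value = getattr(col, label)
        if value is not None:
            parts.append(f"{label}={value}")
    return ", ".join(parts)


print(describe(Column("ra", "DOUBLE", "deg", "ICRS", "TOPOCENTER", "UNITSPHERE")))
print(describe(Column("lastModified", "TIMESTAMP", refsystem="UTC")))
```

A column with no coordinate metadata at all reduces to just its name
and datatype, which is the common case for most table columns.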

 	- Doug

On Tue, 31 Mar 2009, Patrick Dowler wrote:

>
> I have been mulling things over and also talked with Arnold a bit, and this
> was the result:
>
> I think a simple extension to the column metadata would be as follows. The
> name and datatype would be mandatory and all others optional (as they don't
> apply in many cases, or are simply not important; if a service doesn't
> specify them, that should be OK).
>
> http://www.ivoa.net/Documents/latest/STC.html
> http://tycho.usno.navy.mil/systime.html
>
> name: the column name
> datatype: the SQL-ish datatype as discussed last week
> units: units the column values are in
> refsystem: reference system or scale
> - values taken from STC, Table 2 (time), Table 3 (space),
> - values from FITS WCS Paper 3 Table 1 (energy): spectral CTYPE?
> refposition: reference position
> - values taken from STC, Table 1
> flavor: what kind of a value is it?
> - could be taken from STC, section 4.4.1.2.2 (CARTESIAN, SPHERICAL, etc)
> - could be taken from STC section 4.4.1.4.2 (for doppler/velocity)?
> - includes differentiating JD vs MJD
> - could be taken from FITS WCS style generally
>
> Note: In STC one differentiates energy vs frequency vs wavelength somewhat
> implicitly via the units. In FITS WCS Paper 3 Table 1 there are explicit
> CTYPE values for these as well as the different velocity definitions. The
> text in STC section 4.4.1.4.2 says that a future version will be compliant
> with the FITS WCS std.
>
> Examples: name, datatype, units, refsystem, refposition, flavor
>
> * a column containing spatial positions:
>  pos, POINT, deg, ICRS, TOPOCENTER, UNITSPHERE
>
> * separate columns containing spatial coordinates:
>  ra, DOUBLE, deg, ICRS, TOPOCENTER, UNITSPHERE
>  dec, DOUBLE, deg, ICRS, TOPOCENTER, UNITSPHERE
> note: nothing really says which is LON and which is LAT
>
> * a column containing an instant of time
>  some_date, DOUBLE, d, UTC, TOPOCENTER, MJD
>
> * another column containing an instant in time
>  lastModified, TIMESTAMP, <null>, UTC, <null>, <null>
> note: reference position not important for this sort of time value
>
> * a pair of columns containing energy bounds of an observation
>  energy1, DOUBLE, m, WAVE, BARYCENTER, <null>
>  energy2, DOUBLE, m, WAVE, BARYCENTER, <null>
> note: using FITS WCS CTYPE for refsystem, no flavor needed for 1D
>
> * a column containing a measured redshift
>  z, DOUBLE, <null>, ZOPT, BARYCENTER, <null>
> note: using FITS WCS CTYPE for refsystem
>
> * a column containing a measured velocity
>  v, DOUBLE, m s-1, VOPT, BARYCENTER, <null>
> note: using FITS WCS CTYPE for refsystem
>
> For many columns in a table, the values for refsystem, refposition, and flavor
> will be null (not applicable), but for coordinate values they are necessary.
>
> Comments on this?
>
> -- 
>
> Patrick Dowler
> Tel/Tél: (250) 363-0044
> Canadian Astronomy Data Centre
> National Research Council Canada
> 5071 West Saanich Road
> Victoria, BC V9E 2M7
>
> Centre canadien de donnees astronomiques
> Conseil national de recherches Canada
> 5071, chemin West Saanich
> Victoria (C.-B.) V9E 2M7
>
>

