[TAP] data type for column metadata
Patrick Dowler
patrick.dowler at nrc-cnrc.gc.ca
Tue Mar 24 17:02:59 PDT 2009
I will take a stab at these, as it is an extensive list of the issues involved.
This is more "how to deal with these in an RDBMS" than "the right way to deal
with them", but I think that is the issue here :-)
On 2009-03-24 13:27:31 Arnold Rots wrote:
> For a Timestamp:
> What is the data type and precision?
> Could be datetime
> Or could be a floating point number
True. We use timestamp (aka datetime) for lastModified values and release
dates; we use double for observation start and end times. But see below for
the rest of the story.
So from another post, the datatype would be TIMESTAMP or DOUBLE.
> What kind of parameter does it represent?
> OK, that would be time instant, if we are talking about timestamps
Yes. In response to Gerard's list of data types in SQL, I mentioned that I
have never found a good use for either DATE or TIME types - only TIMESTAMP.
Maybe other people have a different experience?
> If it is a coordinate value, what coordinate system does it refer to?
> The Time Scale (TT, UTC, TAI, GPS, TCG, TDB, TCB, ...)
It turns out that RDBMSs vary in their treatment of time zones. In practice we
use local time (for lastModified timestamps), UTC (for data release dates),
or MJD (for observation start and end times). For the timestamp values, the
application has to "know" the timezone in order to extract the value from the
DB correctly. Even for the MJD values, we just read the double from the DB,
but the application still "knows" it is an MJD at some level.
Now, we chose MJD for "astronomical times" because it is much easier to
compute things (histograms, statistics, etc) when the number is directly
accessible to SQL.
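
To illustrate the point about MJD being directly accessible to SQL, here is a
minimal sketch using Python's sqlite3 module. The table name "observation",
the column name "t_start", and the sample values are all made up for the
example; they are not taken from any real schema.

```python
import sqlite3

# Hypothetical table and column names, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE observation (obs_id TEXT, t_start REAL)")
conn.executemany(
    "INSERT INTO observation VALUES (?, ?)",
    [("a", 54913.10), ("b", 54913.75), ("c", 54914.20), ("d", 54916.90)],
)

# One-day bins: truncating the MJD to an integer gives the day number,
# so a histogram is a plain GROUP BY with no date functions involved.
rows = conn.execute(
    "SELECT CAST(t_start AS INTEGER) AS mjd_day, COUNT(*) "
    "FROM observation GROUP BY mjd_day ORDER BY mjd_day"
).fetchall()
histogram = dict(rows)
print(histogram)
```

With a TIMESTAMP column the same binning would need vendor-specific date
functions, which is where the RDBMS differences start to matter.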
> The Reference Position
> If it is relative (elapsed) time, the time zero point
If by this you mean something like exposure time, you just need a numeric
value and have to know the units. The rest depends on the data model (see
below). You do not need to know the zero point to express an amount of time.
If you want to express a time interval, that could be done with a start,end
or a start,duration -- in which case you have the zero-point as fully
specified as you can (given the other points).
(Note: I was not successful in getting an interval type into ADQL; that means
TAP services would have to expose separate columns for start and end or start
and duration and the user would have to use the two together).
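
As a sketch of what "using the two together" could look like, the query below
selects rows whose [start, end] interval overlaps a search interval. It uses
Python's sqlite3 module with a made-up table "exposure" and made-up columns
"t_start" and "t_end" (MJD values); a real TAP service would expose its own
names and the query would be written in ADQL rather than SQLite SQL.

```python
import sqlite3

# Hypothetical table and column names, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE exposure (obs_id TEXT, t_start REAL, t_end REAL)")
conn.executemany(
    "INSERT INTO exposure VALUES (?, ?, ?)",
    [
        ("a", 54913.10, 54913.20),
        ("b", 54913.90, 54914.10),
        ("c", 54914.50, 54914.60),
    ],
)

# Search interval, also in MJD.
lo, hi = 54914.0, 54914.55

# Two closed intervals overlap when each one starts before the other ends.
overlapping = [
    row[0]
    for row in conn.execute(
        "SELECT obs_id FROM exposure "
        "WHERE t_start <= ? AND t_end >= ? ORDER BY obs_id",
        (hi, lo),
    )
]
print(overlapping)
```

With a start,duration pair the second condition would become
t_start + duration >= lo, assuming both columns use the same units.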
> How is it represented?
> ISO-8601 (with the CCYY-MM-DD[Thh:mm:ss[.s...]] restriction)
I am assuming from context that you mean this in the sense of how are values
exchanged between client and service. This is important when you go to
serialise a value, presumably to give it to someone else (eg some other piece
of software). I would argue that for timestamps you have to include the
timezone in that ISO-8601 variant above, eg: CCYY-MM-DDThh:mm:ss.sZ in order
to carry all the necessary information. Otherwise, the recipient has to
assume the timezone in order to parse into the numeric date value (that most
software actually uses). Most software libraries will happily parse and
assume "local" timezone, which in the VO will mostly be wrong :-)
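
A minimal Python sketch of the difference: the helper parse_utc below (a name
made up for this example) insists on the trailing Z and returns a
timezone-aware value, while a plain parse of the same string without the Z
gives a "naive" value whose timezone the recipient has to guess.

```python
from datetime import datetime, timezone


def parse_utc(text: str) -> datetime:
    """Parse CCYY-MM-DDThh:mm:ss[.s...]Z into a timezone-aware UTC datetime."""
    if not text.endswith("Z"):
        raise ValueError("timestamp has no explicit UTC designator: " + text)
    # %f accepts one to six fractional digits.
    fmt = "%Y-%m-%dT%H:%M:%S.%f" if "." in text else "%Y-%m-%dT%H:%M:%S"
    return datetime.strptime(text[:-1], fmt).replace(tzinfo=timezone.utc)


# With the Z, the value is unambiguous.
aware = parse_utc("2009-03-24T17:02:59.5Z")

# Without it, the library parses happily but attaches no timezone;
# converting this to epoch seconds would silently assume local time.
naive = datetime.strptime("2009-03-24T17:02:59", "%Y-%m-%dT%H:%M:%S")

print(aware.isoformat(), aware.tzinfo)
print(naive.isoformat(), naive.tzinfo)
```

Rejecting input without a timezone designator is a design choice for this
sketch, not something any standard requires.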
> JD
> MJD
Since they are numbers (probably double), these are expressed with the usual
Arabic numerals. That does mean they are not as self-contained as the
ISO-8601 format with timezone above. I do not know where one would say a
column is JD vs MJD... hopefully someplace more machine-usable than the
documentation or comments :)
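
For reference, the two differ only by a constant (MJD = JD - 2400000.5), which
is why a bare double cannot tell you which one it is. The sketch below shows
the conversion and, as an assumption for the example only, maps an MJD onto a
UTC calendar date while ignoring leap seconds and any distinction between
time scales. The function names are made up.

```python
from datetime import datetime, timedelta, timezone

MJD_OFFSET = 2400000.5  # JD = MJD + 2400000.5 by definition
MJD_EPOCH = datetime(1858, 11, 17, tzinfo=timezone.utc)  # MJD 0.0


def jd_to_mjd(jd: float) -> float:
    return jd - MJD_OFFSET


def mjd_to_jd(mjd: float) -> float:
    return mjd + MJD_OFFSET


def mjd_to_datetime(mjd: float) -> datetime:
    # Assumption: treat the MJD as a count of 86400 s days on UTC,
    # ignoring leap seconds and the time scale (TT, TAI, ...).
    return MJD_EPOCH + timedelta(days=mjd)


print(jd_to_mjd(2454914.5))
print(mjd_to_datetime(54914.0).isoformat())
```

The application still has to be told, out of band, which of the two a column
holds and on which time scale.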
> Where does it fit into the information object?
> E.g., the time a photon was received
> or the time the record was recorded
> or the time this particular file was written
These do not have anything to do with TAP per se. The TAP metadata and the
VOTable output format allow for utypes to be attached to columns and if there
is a data model then that mechanism could be used... but there need not be a
data model at all, in which case users just have to "know" (eg learn
out-of-band) what the content means. That is the necessary nature of a
low-level protocol, IMO.
I hope this helps clarify; it is long enough that it may well not :-(
--
Patrick Dowler
Tel/Tél: (250) 363-0044
Canadian Astronomy Data Centre
National Research Council Canada
5071 West Saanich Road
Victoria, BC V9E 2M7
Centre canadien de donnees astronomiques
Conseil national de recherches Canada
5071, chemin West Saanich
Victoria (C.-B.) V9E 2M7