[TAP] data type for column metadata
Rob Seaman
seaman at noao.edu
Wed Mar 25 11:21:46 PDT 2009
Patrick Dowler wrote:
> This is more "how to deal with these in an RDBMS" rather than "the
> right way
> to deal with them", but I think that is the issue here :-)
The engineering question is how to make the best trade-offs.
Sometimes a "pragmatic" approach is best. But sometimes a pedantic
approach is really the more pragmatic one. Timekeeping is always
more subtle than it seems. Sweep an issue under the rug in one place
and the rug bunches up in another.
> We use timestamp (aka datetime) for lastModified values and release
> dates; we use double for observation start and end times.
These are all expressions of epochs. A count of seconds (or days or
years) from some reference, whether 1970 or the birth of ones
favorite deity. If the reference is not expressed explicitly, it
must be implicit in the definition of the field. The precise
definitions (of units, etc.) also change with timescale.
> In response to Gerard's list of data types in SQL, I mentioned that
> I have
> never found a good use for either DATE or TIME types - only TIMESTAMP.
> Maybe other people have a different experience?
Time often represents angles. RA, sidereal time, hour angle.
Dates reference into schedules. (In a previous message I
distinguished a date and an epoch.)
Arnold Rots wrote:
> Sorry, here you fall off the cliff: MJD DOES NOT IMPLY A TIME SCALE!
Which is to say that timescale metadata must be provided explicitly
(fields or table headers), or implicitly in table definitions.
MJD also just happens to express a format like those provided by ISO
8601. Converting between representations is sometimes, but not
often, trivial. VO standards must provide enough context -
appropriate metadata - to fully characterize astronomical and civil
quantities, including those relating to time.
> All other time stamps, such as modification times, should ideally be
> in UTC on the surface of the earth: i.e., convert your timezone stamps
> to UTC (which is trivial if you know your timezone, of course)
Not very trivial - consider daylight saving rules or systems that
must function between timezones. Specific solutions may be
straightforward. General solutions are not. This will mostly be an
application level issue.
UTC is often the right choice of timescale (whatever the
representation). But just like STC is careful to capture expressions
of spatial locality - such as distinguishing between topocentric and
barycentric coordinates - so we will sometimes have to capture
expressions of temporal locality. This is commonplace in data
acquisition systems, for instance. UTC is a good (though endangered)
standard, but it does not replace all local apparent, mean, standard
(or daylight saving) timescales.
Markus Demleitner wrote:
> I would also maintain that it doesn't matter for writing queries,
> since, as
> pointed out above, I have strong doubts that we want to build so much
> physics into your protocol that queries "violating physics" become
> some
> kind of error.
For a more "semantic" perspective see my presentation from the joint
Semantics/VOEvent session at the Baltimore IVOA:
http://www.ivoa.net/internal/IVOA/InterOpOct2008VOEvent/
semantic_voevent.pdf
Physical reality (that is, "domain knowledge") does matter in the
sense of a higher order normalization of DB semantics. Utypes and
UCDs provide pragma for tagging fields with semantic hints. But some
semantics will always require interpreting fields jointly. With time
comes default or explicit metadata needed to interpret epochs,
intervals, calendar dates and clock times. Formats for timestamps is
only one of the issues.
I see others have just replied. I'll second Arnold's latest
message. Note also that "time" can mean "phase".
Rob
More information about the dal
mailing list