[TAP] data type for column metadata

Wed Mar 25 11:21:46 PDT 2009

Patrick Dowler wrote:

> This is more "how to deal with these in an RDBMS" rather than "the  
> right way
> to deal with them", but I think that is the issue here :-)

The engineering question is how to make the best trade-offs.   
Sometimes a "pragmatic" approach is best.  But sometimes a pedantic  
approach is really the more pragmatic one.  Timekeeping is always  
more subtle than it seems.  Sweep an issue under the rug in one place  
and the rug bunches up in another.

> We use timestamp (aka datetime) for lastModified values and release
> dates; we use double for observation start and end times.

These are all expressions of epochs.  A count of seconds (or days or  
years) from some reference, whether 1970 or the birth of ones  
favorite deity.  If the reference is not expressed explicitly, it  
must be implicit in the definition of the field.  The precise  
definitions (of units, etc.) also change with timescale.

> In response to Gerard's list of data types in SQL, I mentioned that  
> I have
> never found a good use for either DATE or TIME types - only TIMESTAMP.
> Maybe other people have a different experience?

Time often represents angles.  RA, sidereal time, hour angle.

Dates reference into schedules.  (In a previous message I  
distinguished a date and an epoch.)

Arnold Rots wrote:

> Sorry, here you fall off the cliff: MJD DOES NOT IMPLY A TIME SCALE!

Which is to say that timescale metadata must be provided explicitly  
(fields or table headers), or implicitly in table definitions.

MJD also just happens to express a format like those provided by ISO  
8601.  Converting between representations is sometimes, but not  
often, trivial.  VO standards must provide enough context -  
appropriate metadata - to fully characterize astronomical and civil  
quantities, including those relating to time.

> All other time stamps, such as modification times, should ideally be
> in UTC on the surface of the earth: i.e., convert your timezone stamps
> to UTC (which is trivial if you know your timezone, of course)

Not very trivial - consider daylight saving rules or systems that  
must function between timezones.  Specific solutions may be  
straightforward.  General solutions are not.  This will mostly be an  
application level issue.

UTC is often the right choice of timescale (whatever the  
representation).  But just like STC is careful to capture expressions  
of spatial locality - such as distinguishing between topocentric and  
barycentric coordinates - so we will sometimes have to capture  
expressions of temporal locality.  This is commonplace in data  
acquisition systems, for instance.  UTC is a good (though endangered)  
standard, but it does not replace all local apparent, mean, standard  
(or daylight saving) timescales.

Markus Demleitner wrote:

> I would also maintain that it doesn't matter for writing queries,  
> since, as
> pointed out above, I have strong doubts that we want to build so much
> physics into your protocol that queries "violating physics" become  
> some
> kind of error.

For a more "semantic" perspective see my presentation from the joint  
Semantics/VOEvent session at the Baltimore IVOA:

	http://www.ivoa.net/internal/IVOA/InterOpOct2008VOEvent/ 
semantic_voevent.pdf

Physical reality (that is, "domain knowledge") does matter in the  
sense of a higher order normalization of DB semantics.  Utypes and  
UCDs provide pragma for tagging fields with semantic hints.  But some  
semantics will always require interpreting fields jointly.  With time  
comes default or explicit metadata needed to interpret epochs,  
intervals, calendar dates and clock times.  Formats for timestamps is  
only one of the issues.

I see others have just replied.  I'll second Arnold's latest  
message.  Note also that "time" can mean "phase".

Rob