content, format, ctype, or xtype ?

Markus Demleitner msdemlei at ari.uni-heidelberg.de
Thu May 14 01:24:21 PDT 2009


On Wed, May 13, 2009 at 06:16:15PM +0200, Francois Ochsenbein wrote:
> (a) Do we need a new attribute to specify an ISO-8601-formatted time ?
>     The possible answers are:
>     (a0) No, restrict the expression of time to JD or MJD or ... number
>     (a1) Yes, a new attribute is required (which name ? which content?)
>     (a2) No, use a new datatype (equivalent to 'char' as far as the
>              data storage is concerned, with an arraysize reflecting
> 	     the accuracy, i.e. '19' for full date YYYY-MM-DDThh:mm:ss)
>     (a3) No, use a utype related to the STC data model
>              (e.g. utype='stc:AstroCoords.Time.TimeInstant.ISOTime')
>     (a4) No, use a UCD (e.g. ucd='time.iso8601'; could be secondary
>              e.g.  ucd='time.release;time.iso8601')
>     (a5) No, use a special unit (e.g. unit='"iso8601"')
In my order of preference: (a4), (a1), (a5).  (a0) is probably out
since it's too restrictive for something striving to expose
pre-existing data of all kinds; (a3) is probably out because data
models may have various "items" that are times and thus need the
utype to tell those apart.
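
To illustrate what option (a4) could look like on the client side, here
is a minimal sketch.  It assumes the UCD word "time.iso8601" from the
proposal above (it is not an established UCD word), a char FIELD with
arraysize 19, and a hypothetical FIELD called "release_date"; the helper
names are made up for the example.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# Hypothetical FIELD declaration; "time.iso8601" is only the UCD word
# proposed in option (a4), used here as a secondary word.
FIELD_XML = """<FIELD name="release_date" datatype="char" arraysize="19"
    ucd="time.release;time.iso8601"/>"""


def is_iso_time_field(field):
    """Return True if one of the FIELD's UCD words is 'time.iso8601'."""
    words = field.get("ucd", "").lower().split(";")
    return "time.iso8601" in (w.strip() for w in words)


def parse_cell(field, raw):
    """Convert a cell to a datetime if the FIELD is flagged, else pass it through."""
    if is_iso_time_field(field):
        return datetime.strptime(raw, "%Y-%m-%dT%H:%M:%S")
    return raw


field = ET.fromstring(FIELD_XML)
value = parse_cell(field, "2009-05-13T19:09:30")
```

The primary UCD word still says what the time describes; only the
secondary word tells the client how the value is serialized.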

>          
> (b) Except if (a0) is decided, how should the other field attributes 
>     be filled:
>     -- which datatype: only char 
>        (remember, VOTable may convey FITS or binary data...)
>     -- which units ? must be empty ?
>     -- ucd / utype ?
>     -- precision ?
With a datetime type, units for such FIELDs would be ignored, and ucd
and utype would depend on what the time describes and, if defined,
what role the value fills.

Let me also comment on Arnold Rots' message
(200905132003.n4DK3qgf013010 at xebec.cfa.harvard.edu):

> There is one more option:
>      (a6) No, use an STC-S string: "Time TT GEOCENTER 2009-05-13T19:09:30"
>           or: "Time TT GEOCENTER MJD 54964.798264"
No, let's not do this.  In at least 99% of the tables out there, the
time scale and the refpos are *meta*data, i.e., the same for all
values in a column.  It should therefore, in general, be represented
in the metadata section (i.e., FIELD and GROUP) rather than in the
data itself.

Apart from philosophical reasons, the STC data model is definitely
not optimized for easy and fast processing.  Having to parse, say,
10000 datetime specs like this with a generic STC-S library[1] will
have a severe impact on performance, probably by far dominating the
total runtime for reading the VOTable.

Also, I'm pretty sure nobody will actually store STC-S in their
databases, if only because no database will be able to compare
Time TT GEOCENTER 2009-05-13T19:09:30 with
Time TDT BARYCENTER 2009-05-13T19:09:28.8
(and rightly so), so when writing such tables servers would have to
combine data and metadata needlessly.
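
A small sketch of the comparison problem, under the assumption that
time scale and refpos are kept once per column (the COLUMN_META dict is
a hypothetical stand-in for FIELD/GROUP metadata).  Bare ISO strings of
equal layout sort chronologically with ordinary string comparison;
cells carrying an STC-S prefix do not, because the prefix is compared
first.

```python
# Hypothetical per-column metadata, stated once instead of in every cell.
COLUMN_META = {"timescale": "TT", "refpos": "GEOCENTER"}

iso_column = [
    "2009-05-13T19:09:30",
    "2008-12-31T23:59:59",
    "2009-05-13T19:09:28",
]
# Fixed-width ISO-8601 strings: lexicographic order is chronological order.
chronological = sorted(iso_column)

stcs_cells = [
    "Time TT GEOCENTER 2008-12-31T23:59:59",
    "Time TDT BARYCENTER 2009-05-13T19:09:28",
]
# Here the scale name decides the order ("TDT" sorts before "TT"),
# so the 2009 value ends up in front of the 2008 one.
stcs_sorted = sorted(stcs_cells)
```

A meaningful comparison of the second list would need actual time scale
and reference position conversions, which is not something a database
does on strings.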

I do agree with Arnold that having artificial units ("iso8601",
"H:M:S") for datetimes is unattractive; however, at least there's
code out there handling this mess...


Cheers,

             Markus

[1] So, matching an RE for an ISO string and forgetting about the rest
doesn't count; of course you could hand-craft a quick hack that just
matches "Time <scale> <refpos>? <ISO time>", but then what's the
point of calling it STC-S?
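
For concreteness, the kind of quick hack meant in [1] might look like
the sketch below.  The pattern and the function name quick_parse are
made up for illustration; it handles only "Time <scale> <refpos>? <ISO
time>" and rejects everything else STC-S allows (MJD or JD values,
units, errors, and so on), which is exactly why it is not an STC-S
parser.

```python
import re
from datetime import datetime

# Hand-crafted pattern for "Time <scale> <refpos>? <ISO time>" only.
QUICK_HACK = re.compile(
    r"^Time\s+(?P<scale>[A-Za-z]+)"
    r"(?:\s+(?P<refpos>[A-Za-z_]+))?"
    r"\s+(?P<iso>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(?:\.\d+)?)$"
)


def quick_parse(cell):
    """Return (scale, refpos, datetime), or None if the cell does not match."""
    match = QUICK_HACK.match(cell.strip())
    if match is None:
        return None
    iso = match.group("iso")
    # Fractional seconds need a different strptime format.
    fmt = "%Y-%m-%dT%H:%M:%S.%f" if "." in iso else "%Y-%m-%dT%H:%M:%S"
    return match.group("scale"), match.group("refpos"), datetime.strptime(iso, fmt)


parsed = quick_parse("Time TT GEOCENTER 2009-05-13T19:09:30")
rejected = quick_parse("Time TT GEOCENTER MJD 54964.798264")  # MJD form: no match
```

Note that the scale and refpos come back identical for every row of a
typical column, which is the point made above about them being
metadata.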


