content, format, ctype, or xtype ?

Francois Ochsenbein francois at vizier.u-strasbg.fr
Wed May 13 09:16:15 PDT 2009


Well, I feel it's important to summarize what we are talking about
and try to reach a consensus, if possible... Sorry for the length
of this message, but I hope it will help us reach a final decision.

This discussion, which involves the DAL (TAP) and VOTable groups, started
essentially because a requirement to deal with time and STC-regions
was raised in TAP. The question then is: how do we characterize these
entities as "time" and "STC-regions" in a VOTable ? Can they
be accommodated by the existing VOTable FIELD attributes, or do we
need yet another attribute for this ?

Let me summarize the existing attributes that characterize the
columns in a VOTable:

-- datatype : integer / float / char / boolean
              represents the hardware storage 

-- unit :     examples are:  deg  arcsec/yr km/s  AU ...
              (specifies also the physical dimension)

-- ucd :      examples are  pos.eq.ra  pos.pm;pos.eq.ra
              (specifies semantics with a restricted controlled vocabulary)

-- utype :    example "stc:AstroCoords.Position2D.Value2.C1"
              (specifies the exact parameter role in a data model)

-- width :    number of characters required for the string representation
              of the value
              (e.g. '3' for an integer implies a value between -99 and 999)

-- precision: number of decimals or significant digits
              (e.g. 'F2' for a representation with 2 decimals)
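
As an illustration only, here is a minimal Python sketch of how these
attributes sit together on one FIELD. The column name "RAJ2000", the
datatype "double" and the width/precision values are assumptions chosen
for the example; the ucd and utype values are the ones quoted above.
The small helper reproduces the "-99 to 999" reasoning for width.

```python
import xml.etree.ElementTree as ET

# Hypothetical column: a right ascension in degrees.
# name, datatype, width and precision are illustrative assumptions.
field = ET.Element("FIELD", {
    "name": "RAJ2000",
    "datatype": "double",
    "unit": "deg",
    "ucd": "pos.eq.ra",
    "utype": "stc:AstroCoords.Position2D.Value2.C1",
    "width": "10",
    "precision": "F6",
})
field_xml = ET.tostring(field, encoding="unicode")
print(field_xml)


def integer_range_for_width(width):
    """Smallest and largest integers printable in `width` characters.

    One character is reserved for the minus sign of negative values.
    """
    return -(10 ** (width - 1)) + 1, 10 ** width - 1


print(integer_range_for_width(3))
```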

Norman's terms could be mapped into
'value space'    --> datatype (all possible representations of a value 
                     which can be stored in our computers)
'lexical space'  --> width + precision (the String representation of
                     the value) + unit

I would like to add that, from a physicist's point of view, the 
'frame' or 'system' in which the value is expressed is also
fundamental (e.g. a velocity in a frame tied to Earth can't be
compared directly with a heliocentric velocity); this latter 
knowledge is covered essentially by the 'utype' attribute.

Now the question about the 2 entities required by TAP
(notice they both refer to space/time):

1. STC-string characterising regions:
   -- is obviously of datatype = string (= array of char's)
   -- is an expression originating in the STC data model, and therefore
      is clearly related to a data model
   ==> therefore would logically be described by a utype
       (as pointed out several times)

2. Time is more tricky: 
   -- there is no "time" datatype (from the database point of view,
      there is no unique way of storing a time)
   -- units could be, as already pointed out, seconds, days, weeks,
      years, ... all these represent a time (and, as Alberto pointed
      out, why do we require sexagesimal here when we agreed to
      remove it from angular quantities ?)
   -- ISO-8601, as already pointed out, is not a unit, not a datatype,
      and not even a system (from the database point of view, the
      TIMESTAMP represents local time, and TIMESTAMPs from different
      databases can't be compared directly)

Therefore the new attribute ('ctype' or 'xtype' or ..?) would
essentially be added only to specify 'this string is a time
expressed in ISO-8601'. Clearly this would help with, e.g.,
interpreting this string as a number to be used along the
x-axis of a plot (a nice topcat plot :-) but does not solve
all the other problems (which time, how to compare times delivered 
from different databases, ...)
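
To make the plotting case concrete, here is a rough Python sketch that
turns a full ISO-8601 string into a number (MJD) usable on an axis. The
function name iso8601_to_mjd is hypothetical, and the code ASSUMES the
string is in UTC, which is exactly the information the string itself
does not carry; it therefore illustrates the limitation as much as the
convenience.

```python
from datetime import datetime, timezone

# MJD 0 corresponds to 1858-11-17T00:00:00.
MJD_EPOCH = datetime(1858, 11, 17, tzinfo=timezone.utc)


def iso8601_to_mjd(text):
    """Convert 'YYYY-MM-DDThh:mm:ss' to a Modified Julian Date.

    ASSUMPTION: the string is expressed in UTC. Leap seconds and other
    time scales are ignored in this sketch.
    """
    instant = datetime.strptime(text, "%Y-%m-%dT%H:%M:%S")
    instant = instant.replace(tzinfo=timezone.utc)
    return (instant - MJD_EPOCH).total_seconds() / 86400.0


print(iso8601_to_mjd("2009-05-13T12:00:00"))
```

Two databases whose strings are in different local times would give
values that look comparable here but are not.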

Finally, from the VOTable point of view, the questions I would
like to get a definitive answer to are:

(a) Do we need a new attribute to specify an ISO-8601-formatted time ?
    The possible answers are:
    (a0) No, restrict the expression of time to JD or MJD or ... number
    (a1) Yes, a new attribute is required (which name ? which content?)
    (a2) No, use a new datatype (equivalent to 'char' as far as the
             data storage is concerned, with an arraysize reflecting
             the accuracy, e.g. '19' for a full date YYYY-MM-DDThh:mm:ss)
    (a3) No, use a utype related to the STC data model
             (e.g. utype='stc:AstroCoords.Time.TimeInstant.ISOTime')
    (a4) No, use a UCD (e.g. ucd='time.iso8601'; could be secondary
             e.g.  ucd='time.release;time.iso8601')
    (a5) No, use a special unit (e.g. unit='"iso8601"')
         
(b) Unless (a0) is chosen, how should the other FIELD attributes
    be filled in:
    -- which datatype: only char 
       (remember, VOTable may convey FITS or binary data...)
    -- which units ? must be empty ?
    -- ucd / utype ?
    -- precision ?
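
For comparison, the candidate declarations can be written side by side.
This Python sketch is only an aid to the discussion: the column name
"obs_time" is hypothetical, and so are the attribute name and value
shown for (a1), since neither has been decided. (a2) is left out
because no name for a new datatype has been proposed. The arraysize is
simply the length of the full-date pattern.

```python
import xml.etree.ElementTree as ET

ISO_FULL = "YYYY-MM-DDThh:mm:ss"

# Attributes shared by every variant: a fixed-length character string.
# "obs_time" is a hypothetical column name.
base = {
    "name": "obs_time",
    "datatype": "char",
    "arraysize": str(len(ISO_FULL)),
}

variants = {
    "a1": {"xtype": "iso8601"},  # hypothetical attribute name and value
    "a3": {"utype": "stc:AstroCoords.Time.TimeInstant.ISOTime"},
    "a4": {"ucd": "time.release;time.iso8601"},
    "a5": {"unit": '"iso8601"'},
}


def render_field(extra):
    """Serialize one FIELD carrying the base attributes plus `extra`."""
    element = ET.Element("FIELD", {**base, **extra})
    return ET.tostring(element, encoding="unicode")


for option, extra in variants.items():
    print(option, render_field(extra))
```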

Should we consider a vote to reach a final decision ?

--Francois
=======================================================================
Francois Ochsenbein    ------   Observatoire Astronomique de Strasbourg
   11, rue de l'Universite 67000 STRASBOURG  Phone: +33-(0)390 24 24 29
Email: francois at astro.u-strasbg.fr (France)    Fax: +33-(0)390 24 24 17
=======================================================================


