content, format, ctype, or xtype ?

Douglas Tody dtody at nrao.edu
Thu May 14 05:42:46 PDT 2009


Hi Alberto -

Yes, this is exactly the issue I was alluding to.  The creators of
these tables often go to a lot of trouble to get them just the way they
are, and the community is used to seeing the data that way.  Plus one
would like to have table presentation automated and straightforward.
These considerations lead one to try to change the table as little
as possible.

In the past here we have usually dealt with this by adding additional,
more standard columns for key items such as RA, DEC in standard units.
Evidently Vizier does the same.  Perhaps something along these lines
is the solution: if we have metadata such as the ID main or primary
RA, DEC (used by default for spatial searching) these could have
standard UCDs and units and probably be indexed.  There are only a
few of these needed, as you say.  When it comes to arbitrary table
columns and the complexities of reference frames I do not think we
want to change the external data.
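
As a rough illustration of the "additional standard columns" idea, here
is a minimal sketch in Python. It assumes the original table stores
positions as sexagesimal strings; the names ra_main, dec_main,
STANDARD_COLUMNS and the example column keys are hypothetical and not
taken from any existing service. The original values are left untouched
and no reference-frame conversion is attempted.

```python
# Sketch only: column names and metadata layout are illustrative assumptions.

# Metadata for the added columns: standard units plus a "main" UCD flag.
STANDARD_COLUMNS = {
    "ra_main": {"datatype": "double", "unit": "deg", "ucd": "pos.eq.ra;meta.main"},
    "dec_main": {"datatype": "double", "unit": "deg", "ucd": "pos.eq.dec;meta.main"},
}


def sexagesimal_to_degrees(text, hours=False):
    """Convert 'dd mm ss.s' or 'dd:mm:ss.s' to decimal degrees.

    With hours=True the input is read as 'hh mm ss.s' and scaled by 15.
    """
    text = text.strip()
    # Take the sign from the string so that values like '-00 30 00' work.
    sign = -1.0 if text.startswith("-") else 1.0
    parts = [float(p) for p in text.lstrip("+-").replace(":", " ").split()]
    parts += [0.0] * (3 - len(parts))  # allow missing minutes/seconds
    whole, minutes, seconds = parts[:3]
    value = whole + minutes / 60.0 + seconds / 3600.0
    if hours:
        value *= 15.0
    return sign * value


def add_standard_position(row, ra_key, dec_key):
    """Return a copy of row with ra_main/dec_main added in decimal degrees."""
    out = dict(row)  # the original columns are passed through unchanged
    out["ra_main"] = sexagesimal_to_degrees(row[ra_key], hours=True)
    out["dec_main"] = sexagesimal_to_degrees(row[dec_key])
    return out
```

A row such as {"RAJ2000": "12 30 00.0", "DEJ2000": "-00 30 00"} keeps both
original strings and gains the two numeric columns, which are the only
ones a generic client would need to understand for spatial searching.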

I agree that some standards or recommendations along these lines would
be a good thing to have.

 	- Doug


On Thu, 14 May 2009, Alberto Micol wrote:

>> On 13 May 2009, at 21:37, Doug Tody wrote:
>>
>>     On Wed, 13 May 2009, Patrick Dowler wrote:
>>
>>     So.... UCD?
>>
>>     It looks like I object to everything but ucd: it allows one to say
>>     "this is a time". Maybe restriction is enough:
>>
>>     MJD: DOUBLE <->  datatype="double" ucd="time"
>>     ISO8601: TIMESTAMP  <-> datatype="char" ucd="time"
>>     STC-S: REGION or POINT <-> datatype="char" ucd="pos" ?
>> 
>> 
>> While I agree with Alberto that in "strong" interfaces with formal
>> parameters and data models we often want to constrain the units and
>> representation, I question whether this is a good idea when we merely
>> want to expose arbitrary external data (such as tables).  In this case
>> it is probably best, as well as simplest, to make as few changes to the
>> external data as possible.  Hence unless we are populating the fields
>> of a well defined VO data model we should pass through whatever is
>> in the external data table, and merely aim to describe it accurately.
>
> That would mean that ALL clients would have to know how to
> interpret/translate/parse ALL possible combinations of
> units/ucds/utypes. I would not call this "best and simplest"
> from a client point of view.
>
> And if we consider the non-negligible fact, very well presented by
> Fabien Chereau at the last Interop in Baltimore, that coding a client
> that can make sense of the different VOTables returned by various data
> providers is a nightmare, then I think that the solution of exposing
> external data to the VO as they are is actually (from a pragmatic
> point of view) a vo-stopper.
> /* Fabien showed how Virgo handles SIA/SSA responses: a mixture of
> guesses translated into complicated if/then/else logic playing with
> field names, ucds, utypes, and units, and most of the time even that
> fails.
> */
>
>
> If, on the other hand, the server already implements the translation
> of a time field internally stored as "number of seconds since
> 1-JAN-19xx" into the VO standard (e.g. iso8601), all clients will know
> what to expect, without having to make any assumptions. The knowledge
> needed to translate the internal values correctly to the VO standard
> might be available only on the server side. A client, instead, might
> make wrong assumptions.
>
> Notice that I am only asking to standardise units and representations,
> not reference frames, and only for a few well-identified cases.
>
> So far we have spoken of:
>
> - angles (decimal degrees) [RA is an angle]
> - timestamps/datetimes (iso8601?)
>
> and I would add
>
> - time intervals (like exptime, period; in seconds?),
> - wavelengths (meters?),
> - frequencies (GHz?),
> - energies (keV?),
> - magnitudes (in mag, and not in e.g. dmag).
>
> more?
>
> It should not be a big deal to standardise those most used concepts.
> The rest would remain untouched.
>
> Alberto
> I appreciate the problem of exposing the original data as they were
> expressed by the authors of a catalog. That is indeed important, and I
> am even wondering whether we should pass the original data untouched
> (as a display-string attribute?) along with their standard
> representation. Again, we are only talking of a few quantities here...
> Vizier already took this kind of approach when it was decided that the
> original catalogs are stored and served as they were conceived by the
> authors, but extra columns (e.g. RA,DEC FK5 J2000) are computed,
> stored, and served to allow homogeneous access throughout the entire
> collection (the MAIN ucd was invented, I think, for this).
>
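
For reference, the server-side time translation Alberto describes could
look roughly like the Python sketch below. The function names are
hypothetical, and the epoch is deliberately left as an argument since
it depends on the archive; the 1970 epoch in the usage note is only an
example. Leap seconds and time scales other than UTC are ignored here.

```python
from datetime import datetime, timedelta, timezone

# MJD 0 corresponds to 1858-11-17T00:00:00 UTC.
MJD_EPOCH = datetime(1858, 11, 17, tzinfo=timezone.utc)


def to_iso8601(seconds, epoch):
    """Render 'seconds since epoch' as an ISO 8601 UTC timestamp string."""
    moment = epoch + timedelta(seconds=seconds)
    return moment.strftime("%Y-%m-%dT%H:%M:%SZ")


def to_mjd(seconds, epoch):
    """Render 'seconds since epoch' as a Modified Julian Date (float)."""
    moment = epoch + timedelta(seconds=seconds)
    return (moment - MJD_EPOCH) / timedelta(days=1)
```

With epoch = datetime(1970, 1, 1, tzinfo=timezone.utc), the call
to_iso8601(86400, epoch) yields "1970-01-02T00:00:00Z" and
to_mjd(0, epoch) yields 40587.0. Only the server knows which epoch and
time scale its internal values really use, which is the point of doing
the conversion there.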


