Time Series Cube DM - IVOA Note

Markus Demleitner msdemlei at ari.uni-heidelberg.de
Fri Mar 31 13:49:28 CEST 2017


Hi Omar,

One short point, one longer one:

On Tue, Mar 28, 2017 at 10:21:46AM -0400, Laurino, Omar wrote:
> In the time series example, more than a time series *data model* I think
> time series can just be seen as *instances* of a common, more generic data
> model, that is itself a lightweight one. A client could specialize into

Absolutely -- at least my goal in this is to have time series just be
an NDCube that happens to have just one non-degenerate independent
axis that furthermore happens to have time-like STC annotation; I
think our adopters would rightfully develop solid resentments against
us if we did something very different.
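
To make that concrete, here's roughly the kind of thing I have in
mind, written in the attribute-based mapping syntax from my earlier
example; the vodml-type and vodml-role names are mere placeholders,
so please don't take them literally:

  <TABLE>
    <FIELD ID="obs_time" name="obs_time" datatype="double"
      unit="d" ucd="time.epoch"/>
    <FIELD ID="flux" name="flux" datatype="float"
      unit="mJy" ucd="phot.flux"/>
    <GROUP vodml-type="cube:IndependentAxis">
      <!-- the one non-degenerate independent axis; its time-like
        nature would come from STC-style annotation on obs_time -->
      <FIELDref vodml-role="coordinate" ref="obs_time"/>
    </GROUP>
    <GROUP vodml-type="cube:DependentAxis">
      <FIELDref vodml-role="coordinate" ref="flux"/>
    </GROUP>
    <!-- DATA element with the actual values -->
  </TABLE>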


> >   <GROUP vodml-type="stc:Coordinate">
> >     <PARAMref vodml-role="value" vodml-type="Coord2" ref="pt"/>
> >   </GROUP>
> >   <PARAM ID="pt" xtype="POINT" datatype="real"
> >     arraysize="2" value="23.3 41"/>
> >   <GROUP vodml-type="stc:Coordinate">
> >     <GROUP vodml-role="value" vodml-type="Coord2">
> >       <PARAMref vodml-role="C1" ref="ra"/>
> >       <PARAMref vodml-role="C2" ref="dec"/>
> >     </GROUP>
> >   </GROUP>
> >   <PARAM ID="ra" value="23.3"/>
> >   <PARAM ID="dec" value="41"/>
> 
> 
> Would you have both annotations in the same file? How should a client
> (unaware of the enclosing model) know this is two different representations
> of the same coordinate rather than two distinct coordinates? I would rather
> be in favor of specific mapping rules for certain types, if that makes
> sense, which is what we already do for ivoa:Quantity. Coord2 would be
> serialized as a DALI POINT, if that makes sense. Admittedly, I haven't
> given this possibility enough thought, so I am not sure how convenient that
> would be or what repercussions it might have down the road.

I guess this is a good example of a distinction between two use
cases that, to our detriment, we perhaps haven't made sufficiently
clear in past DM work.  I think issues become a lot clearer if we
separate two related but actually distinct things:

(1) We want to define standard serialisations; that's stuff like an
obscore table, an SSA response, or whatever.  Here, we have to be
strict and precise on the serialisation details.  I think obscore
gets it right, simply saying "column/FIELD with name s_ra, floating
point type, in unit foo, UCD such-and-such, preferred description
this-and-that, reference frame ICRS".  Note how, this way, further
annotation is actually not necessary for anything in the core data
content, and that's how things must be if one wants to write
multi-service queries or join results from different services without
a lot of logic.  I'd say that's "baseline interoperability".
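
For instance, the entire spatial location in an obscore response
boils down to two FIELDs like these (names, types, units and UCDs
are the ones ObsCore prescribes; the description texts here are
paraphrased):

  <FIELD name="s_ra" datatype="double" unit="deg" ucd="pos.eq.ra">
    <DESCRIPTION>Central position, ICRS right ascension</DESCRIPTION>
  </FIELD>
  <FIELD name="s_dec" datatype="double" unit="deg" ucd="pos.eq.dec">
    <DESCRIPTION>Central position, ICRS declination</DESCRIPTION>
  </FIELD>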

Personally, I'm not even sure the notion of a data model is
terribly useful for these *as such*.  Grammars or, as in obscore, a
simple database schema seem more appropriate to me.  Be that as it
may, by now I'm convinced that even with VO-DML and the mapping
document, we'll still have to define concrete serialisation(s), for
me preferably in documents of type (1) themselves.  But that's, I'd
say, tangential for now.

Of course, once you add local extensions to such predefined
serialisations (e.g., extra columns in obscore, custom fields in DAL
responses), things are different, and then we have one example of
(2).

(2) We want complex metadata schemes for physical (or whatever)
entities which generically work wherever these entities turn up;
that could be filter names and zero points in photometry, time scales
and reference positions for times, or statistical properties, error
models, etc. for measurements of all kinds.  These *may* go on top of
the well-defined serialisations, but where they really are needed is
when you have "free" responses, e.g., in TAP, datalink/SODA parameter
declarations, custom extensions, etc.  I'd call this "spontaneous
interoperability", because client and server don't need to pre-arrange
anything above the transport and annotation layers.
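
To give a taste of what (2) should let us say about, e.g., a time
column coming back from a TAP query, in the attribute-based syntax
from above (type and role names again purely illustrative):

  <FIELD ID="obs_time" name="obs_time" datatype="double"
    unit="d" ucd="time.epoch"/>
  <GROUP vodml-type="stc:TimeCoordinate">
    <FIELDref vodml-role="value" ref="obs_time"/>
    <PARAM name="timescale" vodml-role="timescale"
      datatype="char" arraysize="*" value="TT"/>
    <PARAM name="refposition" vodml-role="refPosition"
      datatype="char" arraysize="*" value="BARYCENTER"/>
  </GROUP>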

That's complex in the general case, but it's not black magic.  Hence,
(not only) I still think it's a disgrace that 15 years into the VO
all we have is a deprecated (and fairly limited) way to say "this
pair of columns [this POINT, POLYGON...] is a position in ICRS
BARYCENTER for Epoch J2015.0".  At least this very basic annotation
simply must work for essentially all representations that sensible
and almost-sensible people (as well as data-writing astronomers) may
choose.
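
Staying with the syntax from the example further up, the sort of
annotation I'd like clients to be able to rely on is about this much
(type and role names illustrative, as before):

  <GROUP vodml-type="stc:Coordinate">
    <FIELDref vodml-role="C1" ref="ra"/>
    <FIELDref vodml-role="C2" ref="dec"/>
    <GROUP vodml-role="frame" vodml-type="stc:SpaceFrame">
      <PARAM name="frame" vodml-role="refFrame"
        datatype="char" arraysize="*" value="ICRS"/>
      <PARAM name="refposition" vodml-role="refPosition"
        datatype="char" arraysize="*" value="BARYCENTER"/>
      <PARAM name="epoch" vodml-role="epoch"
        datatype="char" arraysize="*" value="J2015.0"/>
    </GROUP>
  </GROUP>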

With that distinction: No, I do not believe we'll end up with a
useful standard if, in cases like (1), we leave open whether positions
are given as RA/DEC columns or as a POINT.  Such standards have to
pin that down, or you can never, say, write an obscore query that
works on more than one service.

But the data models and in particular the annotation scheme (make that
a plural once we tackle FITS or HDF5) must still be flexible enough to
cover (2).  Let's see that we can finally annotate VOTables in
sufficient detail that a client can reliably bring a catalog in ICRS at
Epoch J1992.25 to Galactic at J2015 (or notice when that's not possible
for lack of proper motions).  I'd say that's an achievable goal.
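
In terms of annotation, that scenario needs little more than the
position group sketched above, plus proper motion columns tied into
the same coordinate and the catalog's epoch; a client that finds no
pm FIELDrefs knows it cannot propagate the positions.  Names once
more illustrative:

  <FIELD ID="pmra" name="pmra" datatype="double"
    unit="mas/yr" ucd="pos.pm;pos.eq.ra"/>
  <FIELD ID="pmdec" name="pmdec" datatype="double"
    unit="mas/yr" ucd="pos.pm;pos.eq.dec"/>
  <GROUP vodml-type="stc:Coordinate">
    <FIELDref vodml-role="C1" ref="ra"/>
    <FIELDref vodml-role="C2" ref="dec"/>
    <FIELDref vodml-role="pmC1" ref="pmra"/>
    <FIELDref vodml-role="pmC2" ref="pmdec"/>
    <GROUP vodml-role="frame" vodml-type="stc:SpaceFrame">
      <PARAM name="frame" vodml-role="refFrame"
        datatype="char" arraysize="*" value="ICRS"/>
      <PARAM name="epoch" vodml-role="epoch"
        datatype="char" arraysize="*" value="J1992.25"/>
    </GROUP>
  </GROUP>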

And since I'd really like this to not share the fate of the
STC-in-VOTable Note, my feeling at this point is that proper error
treatment is for when we've gathered some experience; that would mean
that we can for now delay modelling correlated, non-Gaussian or
otherwise real-world errors (but keep Quantity open for adding that
later).  

A bird in the hand is worth two in the bush.  


      -- Markus

