Coordinates model - Working draft.

Markus Demleitner msdemlei at ari.uni-heidelberg.de
Tue Jan 8 13:18:26 CET 2019


Dear DM,

On Fri, Dec 21, 2018 at 04:41:09PM -0500, CresitelloDittmar, Mark wrote:
> > Similarly (and this is quite a bit more itchy to me right now), I'm
> > strongly advocating to make TimeInstant concrete and remove all its
> > derived types (ISOTime, JD, MJD, TimeOffset).  Instead, TimeFrame
> > should grow a timeorigin attribute.
> >
> 
> Full disclosure, I haven't read the apps thread re: TIMESYS, but as I
> recall, it works only because it requires that timeorigin is expressed in
> JD specifically.

Which is fine because it's metadata.  In annotation, flexibility is
what we need on the *data* side -- in general you don't want to force
data providers into rigid conventions because that would typically
force them to change their data in non-trivial ways.  That's why we
allow all kinds of time scales, reference positions, and (indirectly)
serialisation formats.

For *metadata*, that's a different thing -- this is part of the
annotation itself and is largely controlled by "us"; it is being
written at publication time by people looking at the VO.  Flexibility
there only complicates both model and implementations without buying
anything (except possibly a tiny bit of conversion work for very few
numbers external to the data itself).  

Hence, fixing time origin to a float (which certainly can be
represented in all conceivable serialisations) with a bespoke
interpretation (as a JD) doesn't hurt the generality of the model at
all but saves quite a bit of code (and hence bugs) in the clients.
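
To illustrate, here is a minimal sketch in Python of what a client
needs if timeorigin is a plain float interpreted as a JD.  The class
and function names are made up for this mail and are not taken from
any model draft or library, and I am assuming the annotated values
are offsets in days:

```python
from dataclasses import dataclass

# JD of the MJD zero point; a well-known constant, not model content.
MJD_ORIGIN_JD = 2400000.5


@dataclass(frozen=True)
class TimeFrame:
    """Hypothetical frame metadata; attribute names are illustrative."""
    timescale: str
    refposition: str
    timeorigin: float = 0.0  # always a JD; 0.0 means values are JDs


def to_jd(value_days: float, frame: TimeFrame) -> float:
    """Turn an offset in days into a JD using the frame's origin."""
    return frame.timeorigin + value_days


# A column of MJDs is just a frame whose origin is the MJD zero point.
mjd_frame = TimeFrame("TT", "BARYCENTER", timeorigin=MJD_ORIGIN_JD)
jd = to_jd(58491.0, mjd_frame)
```

There is one code path for JD, MJD and arbitrary offsets, and nothing
in it has to parse or interpret the origin itself.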

> This is not OK for general usage, and without the restriction you get a
> cyclical problem in the model.

The cyclicity (that you actually don't get rid of as is, you just
hide it a bit better) is another indication of the unnecessary
complexity inflation I'm hinting at above.  

As soon as you attach time metadata to the time offset (which is
itself metadata), you have meta-metadata.  There are situations when
you *have* to have extra levels, but these kinds of cycles
("reification") always are potential code bombs (i.e., apparently
innocuous features that explode into many code lines in
implementations).  Let's not do it unless we must, and as I argue
above, I'm sure we can do without meta-metadata here just fine.


> > The rationale here is that the concrete form of the timestamp is a
> > *serialisation* issue, i.e., one of VOTable, FITS, or whatever else.
> > If the serialisation provides for having ISO-like dates or a binary
> > representation of civil dates or nothing of the sort shouldn't
> > determine whether you can serialise STC instances into them.
> >
> 
> Not sure I would call it a serialization issue, but yes.. its related.  The
> thing is, when we have time data, we need to convey how to interpret that
> value (which on its face is just a 'real' ).

Well, the trouble is that at least VOTable already has a distinction
between ISO string and floating point representations, and I'd expect
any format powerful enough to carry VODML annotation will have
similar mechanisms (e.g., relational database tables).  So, if you
put that distinction into the model, too, you have an immediate
conflict of responsibilities.  

Which is bad for many reasons, the most urgent of which is that
client writers have to decide what should happen if a <FIELD
xtype="timestamp"/> is annotated as MJDDate (say).

Just saying "it's invalid, refuse to work with it" is really
unsatisfactory -- probably the document would work just fine if the
annotation is *removed* (because it's really a timestamp).  Telling a
user a document that looks fine can't be processed just because of
annotation that's superfluous in the first place (because the client
already knows it's a timestamp) is highly annoying, and looking at
the hacks client writers currently make to let users work with
half-broken data I'm sure we'd be seeing lots of ugly heuristics and
conditional ignoring of annotation.

Why increase the number of errors document writers can make and
client writers have to handle when there's nothing to be gained by
(in this case) the extra classes?


> This modeling makes it easy to convey that info.. you have a Column or
> Param which you identify as type "coords:domain.time.JD".  The alternative
> is to have a single Type (TimeInstant), and a format enumeration/flag.

Ha!  If you look at your phrase "format enumeration/flag" -- doesn't
that already shout "serialisation"?  The point I'm making above is
that the model simply shouldn't talk about formats, because that's
another level of representation.  The model just has TimeInstant --
done.
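
As a rough sketch of that division of labour (Python again; the
function name and the xtype handling are my assumptions for
illustration, not any client's actual code): the serialisation says
how a value is written, the annotation only contributes the frame.
For brevity this treats ISO strings as plain civil time without leap
seconds and ignores time scale conversions entirely:

```python
from datetime import datetime, timezone

# JD of 1970-01-01T00:00:00, used to convert POSIX seconds to JD.
UNIX_EPOCH_JD = 2440587.5


def decode_to_jd(raw: str, xtype: str, timeorigin: float = 0.0) -> float:
    """Decode a serialised time value into a JD.

    xtype comes from the serialisation (e.g., a VOTable FIELD),
    timeorigin from the frame annotation.  The model never says
    whether the value is a string or a float.
    """
    if xtype == "timestamp":
        moment = datetime.fromisoformat(raw)
        if moment.tzinfo is None:
            # Assumption for this sketch: naive strings count as UTC.
            moment = moment.replace(tzinfo=timezone.utc)
        return moment.timestamp() / 86400.0 + UNIX_EPOCH_JD
    # Anything else is taken to be a float offset in days.
    return float(raw) + timeorigin


from_iso = decode_to_jd("2019-01-08T12:00:00", "timestamp")
from_mjd = decode_to_jd("58491.5", "", timeorigin=2400000.5)
```

Both calls end up at the same instant, and there is no combination
of format and annotation left that could contradict itself.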

Shorter model, fewer conflicts, same capabilities -- what's not to
like?

         -- Markus

