Coordinates model - Working draft.

Thu Jan 17 10:51:18 CET 2019

Hi Mark,

On Tue, Jan 15, 2019 at 10:30:54AM -0500, CresitelloDittmar, Mark wrote:
> I haven't taken the time to digest this fully, but this paragraph struck me.
> You seem to be conflating the model and its annotation too directly.
> The model is not repeating serialization information, but representing the
> concepts.
> A physical quantity has these concepts
>   * it is continuous, or integral (datatype)
>   * it has units
>   * it has a value.
> These are serialized in VOTable as a PARAM
> The PARAM is annotated as a Quantity type.
> 
> I agree that it is important to know what the effect on serialization is..
> that is why we have example files which clearly show these elements.
> But the model is not tied to any serialization format.  The fact that

Right -- isolating the data (meta-) model and the serialisation
format as much as possible is valuable for many reasons,
including separate evolution and separate implementability.

In particular, it has been my hope that the "VO-DML" layer can be
written "on top" of a conventional VOTable parser without changing
any of it (well, except that it has to somehow expose the model
elements, whether they are GROUPs or dedicated elements).

Since sufficiently capable container formats already talk about
types, units, and value serialisation, and these are handled by
lower-level libraries, it seems to me it's a good idea to not talk
about them in the model if you want model and format to be as well
isolated from each other as possible.
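
To sketch the kind of layering I have in mind (a toy illustration
only: the ANNOTATION element, its role attribute, and the TABLE
wrapper are made up here, not taken from any specification), the model
layer would merely point at a FIELD, and everything about types and
units would come from what the ordinary parser already knows:

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: ANNOTATION, COLUMN/@role and the TABLE wrapper
# are invented for illustration and are not part of any standard.
FRAGMENT = """
<TABLE>
  <ANNOTATION>
    <COLUMN ref="field_epoch" role="time"/>
  </ANNOTATION>
  <FIELD ID="field_epoch" name="epoch" datatype="double" unit="d"/>
</TABLE>
"""


def parse_fields(table):
    """Stand-in for the conventional parser: it owns types and units."""
    return {f.get("ID"): dict(f.attrib) for f in table.iter("FIELD")}


def resolve_annotation(table, fields):
    """Model layer: maps roles to fields without restating type or unit."""
    return {col.get("role"): fields[col.get("ref")]
            for col in table.iter("COLUMN")}


table = ET.fromstring(FRAGMENT)
annotated = resolve_annotation(table, parse_fields(table))

# Type and unit are looked up through the reference, not duplicated.
print(annotated["time"]["datatype"], annotated["time"]["unit"])
```

The point of the sketch is that resolve_annotation never needs to
know what a datatype or a unit is; it only follows references.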

But, well -- I don't want to bug everyone with something that, given
the feedback, is a fairly singular concern, so I'd shelve my concerns
until others share them and shut up now.  

But to perhaps instill such concerns in a Parthian shot, let me
just illustrate again why I feel we're building something we will
have to apologise for later.

As far as I can see, the annotation for a string-serialised timestamp
according to current STC2 will look like this (ad-hoc VO-DML mapping):

  <VO-DML>
    <COLUMN ref="field_epoch" type="ISOTime"/>
  </VO-DML>
  <FIELD ID="field_epoch" name="epoch" datatype="char" arraysize="*"
    xtype="timestamp"/>

That's a lot of types floating around here, and I'm sure our
adopters, when they see something like this, will ask:

(a) Why do you keep the information that epoch's values are in DALI's
ISO format in two places (COLUMN/@type and FIELD/@xtype)?

(b) And even if you have a good reason to have the same piece of
information in two places, why do you use two different vocabularies
(ISOTime vs. timestamp)?
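
To make the duplication concrete, here is a minimal sketch that lists
every type-like declaration attached to one column. It works on the
ad-hoc fragment from above, wrapped in a hypothetical TABLE root (and
with the quote after ISOTime closed) so that it parses; the helper
name type_declarations is made up as well.

```python
import xml.etree.ElementTree as ET

# The ad-hoc annotation from this mail, wrapped in a hypothetical TABLE
# root so that it is a single well-formed XML document.
FRAGMENT = """
<TABLE>
  <VO-DML>
    <COLUMN ref="field_epoch" type="ISOTime"/>
  </VO-DML>
  <FIELD ID="field_epoch" name="epoch" datatype="char" arraysize="*"
    xtype="timestamp"/>
</TABLE>
"""


def type_declarations(table, field_id):
    """Collect (location, value) pairs for all type-like attributes."""
    decls = []
    # What the container format itself says about the column.
    for field in table.iter("FIELD"):
        if field.get("ID") == field_id:
            for attr in ("datatype", "xtype"):
                if field.get(attr) is not None:
                    decls.append((f"FIELD/@{attr}", field.get(attr)))
    # What the model annotation says about the same column.
    for col in table.iter("COLUMN"):
        if col.get("ref") == field_id and col.get("type") is not None:
            decls.append(("COLUMN/@type", col.get("type")))
    return decls


table = ET.fromstring(FRAGMENT)
decls = type_declarations(table, "field_epoch")
for where, value in decls:
    print(f"{where} = {value}")
```

That is three declarations for a single column, two of which
(FIELD/@xtype and COLUMN/@type) are meant to say the same thing in
different words.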

It'd be great if I didn't have to answer them.

        -- Markus

PS:

> PS: the scope and requirements for the model are listed in the document.

Ah... well, I had in mind something a bit more concrete; you see, in
DM I think we're doing ourselves a favour if each box in our UML diagrams
can be clearly and as unambiguously as possible (applying Occam's
razor) derived from a requirement, which again would be derived from
a very concrete use case ("merge two time series from different
sources; examples of time series we want to look at include ivo://X,
ivo://Y, and ivo://Z").  That way, I think a lot of the DM
discussions would become a lot less tedious.