MCT - model document delivery.

Markus Demleitner msdemlei at ari.uni-heidelberg.de
Tue Sep 15 13:32:56 CEST 2020


Hi Mark,

Let me start at the end of your mail:

On Tue, Sep 08, 2020 at 10:34:14AM -0400, CresitelloDittmar, Mark wrote:
> But.. if you really feel that model is what the community is looking for, I
> encourage you to formalize it, apply it to the current projects, and submit
> it for consideration.

Well, it seems we don't have enough eyeballs for just one version of
Measurements.  So, I'd really like to avoid spreading what attention
we get across two different drafts.  Let's rather make another attempt
at working things out within what we have.  And anyway, I'm a big fan
of consensus on the important questions -- and at least two of the
things we deal with here count as "really important" in my book.

> On Mon, Sep 7, 2020 at 10:54 AM Markus Demleitner <
> msdemlei at ari.uni-heidelberg.de> wrote:
> The difference is basically:
>   * with just GenericMeasure your data product will only be making the
> statement
>       "This TABLE has COLUMNS"
>   * with the property-based Measures, you are describing Entities
>       "This Cube has Position, Time, etc...

As you know, I'm always trying to limit the number of ways we have in
the VO to do roughly the same thing, so: How is this different from
attaching a UCD to the GenericMeasure?

> One of the 'threads' often used in the wish-list for these models is to
> 'easily identify and plot the positions in a file, with
> appropriate/normalized Frame specs'.

But shouldn't such applications simply look for Coordinates instances
rather than going as deep as Measurements?

> > Even more importantly, I'm still rather strictly against using
> > anything from coords in meas.  Having a value and an error is a lot
> > more fundamental than having coordinates (which more or less imply a
> > vector space if the word is to mean anything).  What we do now, on
> > the other hand, links meas to coords without any profit I can discern.
> >
> 
> The entire point of this work is to define a set of models which
> build on each other to form complex models.  The Coords model
> elements allow you to specify your vector space as needed.

Here, it's my turn to strongly disagree.  While I grant you that we
already struggled with this in Shanghai, I think it's important to
reach agreement here, so let me try again.

You see, I am claiming we're doing these models in order to let
clients do smart things based on semantically strong annotation of
data (in case of Measurements, this would at first be "plot error
bars automatically", and once we get better, perhaps even "do
automatic error propagation").

If you disagree here, then we should try to somehow bridge that
dissent.

But if you don't disagree, then wouldn't you agree that doing the
same thing (actually, I keep claiming quite a bit more, but that's
beside the point) using a set of simple, independent data models is
preferable to entangling all these data models in order to form
increasingly complex ones?

> > Measurement:
> >   value: float
> >   error: Uncertainty
> >
> > Uncertainty (abstract class)
> >
> > NaiveError:
> >   value
> >
> > and perhaps
> >
> > Correlation:
> >   coefficient: float
> >   err1, err2: Uncertainty
> >
> >
> 
> I can't disagree more.
> IMO, the usefulness of Coords, is in the context of other objects (Meas).
> What you suggest above is overly simplistic, and does not satisfy the
> requirements of the Cube model (and probably the source model, but that is
> TBD).

Hm -- could you be a bit more specific about the "overly
simplistic"?  What kind of value-error relationships could not be
annotated with this scheme?  Of course, I grant you that this error
model by itself is not sufficient for automatic error propagation, but
that's true for the current WD, too.  It is, however, straightforward
to extend the model by deriving distribution-aware classes once we've
got the clients to do the simple thing (associating errors and values
in the first place).
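
For illustration only, the scheme quoted above could be sketched in
Python roughly as follows.  This is not part of any draft: the class
GaussianError and the function error_bar are hypothetical names made
up here, and reading NaiveError.value as a symmetric error is an
assumption.

```python
from abc import ABC
from dataclasses import dataclass


class Uncertainty(ABC):
    """Abstract base for anything describing the error of a value."""


@dataclass
class NaiveError(Uncertainty):
    # Assumption: a symmetric error with no statement on its distribution.
    value: float


@dataclass
class GaussianError(NaiveError):
    """Hypothetical distribution-aware extension (value read as 1 sigma)."""


@dataclass
class Correlation:
    coefficient: float
    err1: Uncertainty
    err2: Uncertainty


@dataclass
class Measurement:
    value: float
    error: Uncertainty


def error_bar(m):
    """Return (low, high) for plotting, or None for unknown error types.

    Hypothetical client-side helper, not something the model defines.
    """
    if isinstance(m.error, NaiveError):
        return (m.value - m.error.value, m.value + m.error.value)
    return None


flux = Measurement(10.0, NaiveError(0.5))
mag = Measurement(2.0, GaussianError(0.25))
print(error_bar(flux), error_bar(mag))
```

A client that only knows about NaiveError would still draw error bars
for the derived GaussianError, while a smarter client could test for
the more specific class first.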

As to changes in Measurement affecting the Cube DM: This is exactly
what I was talking about above, when I claimed independent data
models are preferable to entangled ones, all other things being equal
-- separation of concerns is probably the most important principle
when building sustainable software systems.  It's why people invented
functions, classes, most of the GoF patterns, and today containers
and microservices.  I'd argue it would be wise to follow this
principle in our DMs, too.

Hence, if Cube really is affected when we streamline Measurements in
the proposed way, I'd much rather fix Cube than complicate
Measurements.

And again, apologies for being such a pest, but if we err here, we
will be causing our successors a lot of avoidable work in the best of
cases -- and prolong the VO's current STC plight in the most likely
outcome.

        -- Markus

