MCT - model document delivery.

Mon Sep 21 10:41:30 CEST 2020

Hi Mark,

On Fri, Sep 18, 2020 at 11:00:03AM -0400, CresitelloDittmar, Mark wrote:
[the following refers to this:]
> >    <INSTANCE type="ds:Target">
> >      <COLLECTION role="targetPosDef">
> >       <INSTANCE ref="RA"/>
> >       <INSTANCE ref="DEC"/>
> >      </COLLECTION>
> >      <ATTRIBUTE role="objectClass" value="Star"/>
> >    </INSTANCE>

>     b) annotating the original field/param (again).
>         what you show in this case, is pointing to two fields which
>         happen to have nice ID's "RA"/"DEC".  If the IDs were
>         "sa3t2tfdaf3" and "pynw44wo", it would have no meaning to
>         the client at all.

Well, the client should of course ignore the ids whatever they are.

>           * which is the latitude? the longitude? the frame?

Target does not know about them, nor should it.  A target is a
target, whether it is identified by a coords spherical position or
some other way to denote a location.  The good thing about keeping
such details out of ds:Target is that it will just continue to work
when we're moving into the solar system (where lat and long just
aren't enough).  Just say which fields and params belong to the target
designation and leave their annotation to a specialised class.

In this particular case, that would be one of:

> >   <INSTANCE type="coords:Position" id="pos1">
> >     <ATTRIBUTE role="ref_frame" value="ICRS"/>
> >     <ATTRIBUTE role="latitude" ref="DEC"/>
> >     <ATTRIBUTE role="longitude" ref="RA"/>
> >   </INSTANCE>

or

> >   <INSTANCE type="coords2:EquatorialPosition" id="pos2">
> >     <ATTRIBUTE role="ref_frame" value="ICRS"/>
> >     <ATTRIBUTE role="dec" ref="DEC"/>
> >     <ATTRIBUTE role="ra" ref="RA"/>
> >   </INSTANCE>

(or both) -- you see that as long as a client understands either
coords or coords2, it can fully interpret the annotation of RA and
DEC, *and* work out that they are the position for the Target of
something.
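
To make the client-side logic concrete, here is a minimal Python
sketch (an illustration under assumptions, not an existing client).
It embeds the snippets quoted above in a hypothetical ANNOTATION
wrapper element, never looks at what the ids mean, and uses only
whichever position type the client happens to understand.  The
function names and the UNDERSTOOD set are made up for this example,
and the element syntax is just the draft syntax from this thread.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation assembled from the snippets quoted in this
# thread; the ANNOTATION wrapper element is an assumption.
ANNOTATION = """
<ANNOTATION>
  <INSTANCE type="ds:Target">
    <COLLECTION role="targetPosDef">
      <INSTANCE ref="RA"/>
      <INSTANCE ref="DEC"/>
    </COLLECTION>
    <ATTRIBUTE role="objectClass" value="Star"/>
  </INSTANCE>
  <INSTANCE type="coords:Position" id="pos1">
    <ATTRIBUTE role="ref_frame" value="ICRS"/>
    <ATTRIBUTE role="latitude" ref="DEC"/>
    <ATTRIBUTE role="longitude" ref="RA"/>
  </INSTANCE>
  <INSTANCE type="coords2:EquatorialPosition" id="pos2">
    <ATTRIBUTE role="ref_frame" value="ICRS"/>
    <ATTRIBUTE role="dec" ref="DEC"/>
    <ATTRIBUTE role="ra" ref="RA"/>
  </INSTANCE>
</ANNOTATION>
"""

# The only position type this imaginary client knows how to interpret.
UNDERSTOOD = {"coords:Position"}


def target_field_refs(root):
    """Return the ids of the fields/params that designate the target."""
    refs = set()
    for inst in root.iter("INSTANCE"):
        if inst.get("type") != "ds:Target":
            continue
        for coll in inst.findall("COLLECTION"):
            if coll.get("role") == "targetPosDef":
                refs.update(m.get("ref") for m in coll.findall("INSTANCE"))
    return refs


def position_annotations(root, refs, understood):
    """Map each understood type annotating exactly `refs` to its roles."""
    found = {}
    for inst in root.iter("INSTANCE"):
        if inst.get("type") not in understood:
            continue  # unknown models (and bare references) are skipped
        attrs = inst.findall("ATTRIBUTE")
        used = {a.get("ref") for a in attrs if a.get("ref") is not None}
        if used == refs:
            found[inst.get("type")] = {
                a.get("role"): a.get("ref") or a.get("value") for a in attrs
            }
    return found


root = ET.fromstring(ANNOTATION)
refs = target_field_refs(root)
print(position_annotations(root, refs, UNDERSTOOD))
```

A client that only knows coords2 would pass
{"coords2:EquatorialPosition"} instead and get the ra/dec roles; the
ds:Target annotation itself stays untouched either way.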

>         The COLLECTION element seems to be indicating that "the
>         contents can be resolved into whatever type the model
>         element Target.targetPosDef resolves to".  Unless you do a)
>         above, this could be any number of things.  Which means
>         you'd need to also specify the type within the COLLECTION
>         node, basically re-serializing a particular position
>         instance.

No -- why should we?  The annotation of coordinates is the job of the
coordinates model.  For target, it suffices to know that a position
given somewhere is the target of the (say) observation.

>     last) "use-any" relationship
>        The model shows that Target.position has a multiplicity of 1, the
> serialization here has 2 instances
>           * any model-aware validator would/should consider this invalid.
>           * clients would need to be model-aware, to know that they
>           should interpret this as 2 representations of the same
>           instance and not as a list of positions.

Whether use-any is a meta-model or a serialisation feature I'm not
sure, but I can't see how you'll get around something like this; and
clearly validators will need to be aware of whatever mechanism we
choose.

Cf. http://ivoa.net/documents/Notes/XMLVers/ for how we're trying to
make our XML schemas evolvable (which works fairly nicely for minor
version changes as far as I can see now, although I grow a bit tired
of having to write stuff like

  Although this schema is in version 1.1 now, the URL still ends in
  v1.0. This is to avoid unnecessarily breaking existing clients
  relying on the namespace as defined by version 1.0 of this
  specification. As laid out in the IVOA schema versioning policies
  (Harrison and Demleitner et al., 2018), although minor versions
  should never have been part of namespace URIs, for namespaces
  defined before this note they cannot be dropped despite their
  potential for confusion.

  http://ivoa.net/documents/VOResource/20180625/REC-VOResource-1.1.html#tth_sEc2.1

and

  Note that starting with VOTable 1.3, the namespace URI for VOTable
  will remain fixed at http://www.ivoa.net/xml/VOTable/v1.3 until the
  next major version as discussed in [7]

  http://ivoa.net/documents/VOTable/20191021/REC-VOTable-1.4-20191021.html#ToC16

again and again; and see below for major version updates).  Let's
think evolution through from the start this time.
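
For what this policy means on the client side, here is a minimal
Python sketch: the namespace URI stays at v1.3, and the actual minor
version is read from the version attribute of the root element.  The
sample document and the function name votable_version are assumptions
made for this illustration.

```python
import xml.etree.ElementTree as ET

# Namespace URI that stays fixed for all VOTable 1.x from 1.3 on.
VOTABLE_NS = "http://www.ivoa.net/xml/VOTable/v1.3"

# Made-up minimal document: version 1.4 content, v1.3 namespace.
DOC = f'<VOTABLE version="1.4" xmlns="{VOTABLE_NS}"><RESOURCE/></VOTABLE>'


def votable_version(xml_text):
    """Return (major, minor) from the version attribute, not the URI."""
    root = ET.fromstring(xml_text)
    # ElementTree spells namespaced tags as "{uri}localname".
    namespace, _, local = root.tag[1:].partition("}")
    if namespace != VOTABLE_NS or local != "VOTABLE":
        raise ValueError(f"not a VOTable 1.x root element: {root.tag}")
    major, minor = (int(part) for part in root.get("version").split("."))
    return major, minor


print(votable_version(DOC))
```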

>     Sounds kind-of like a substitution group.

There may be a slight analogue, but the purpose is different.

>     This seems like a lot of logic changes/work for the clients.

Since, as far as I know, all that exists are sketches and prototypes,
I argue this is when we need to work things out.  Once there's an
actual installed base of VO-DML-consuming programmes, any change is
going to be a lot more painful (see the Schema Versioning Policy).

By the way, it's instructive to look at the trouble Registry has with
inheritance: Because all our schemas inherit from VOResource, the
moment we do a new major version of VOResource, *all* of
StandardsRegExt, SimpleDALRegExt, VODataService, TAPRegExt, and
Registry Interfaces need to be revised, *and* receive a new major
version.  Frankly, I can't see that happening, so in effect we'll
have VOResource 1 until we retire the entire XSD-based system.
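
The cascade can be sketched in a few lines of Python.  The graph
below is a simplification that records only the direct inheritance
from VOResource mentioned above (the real schemas have further
dependencies among each other), and the function name is made up for
this example.

```python
from collections import deque

# Simplified dependency graph: schema -> schemas it inherits from.
INHERITS_FROM = {
    "VOResource": [],
    "StandardsRegExt": ["VOResource"],
    "SimpleDALRegExt": ["VOResource"],
    "VODataService": ["VOResource"],
    "TAPRegExt": ["VOResource"],
    "RegistryInterface": ["VOResource"],
}


def needs_new_major(changed, inherits_from):
    """Return all schemas that directly or indirectly inherit from `changed`."""
    affected = set()
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for schema, bases in inherits_from.items():
            if current in bases and schema not in affected:
                affected.add(schema)
                queue.append(schema)
    return affected


# A new major VOResource drags every extension along ...
print(sorted(needs_new_major("VOResource", INHERITS_FROM)))
# ... whereas a schema nobody inherits from can evolve on its own.
print(sorted(needs_new_major("TAPRegExt", INHERITS_FROM)))
```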

If I built the Registry again, I'd certainly try hard to avoid this.
We're now in a position where we can avoid it for (many of) our data
models.  That's where my plea comes from: let's keep the models
separate as far as we possibly can, and in particular let's not have
cross-model inheritance.

Of course, I might be wrong and there *is* a way to build
cross-referring models that can independently evolve -- since I admit
there's something to be said for typed references, I'd be happy to be
proven wrong.  Do you have a different plan for how to do this?

> Finally.. you ask "I'm not sure if VO-DML allows these types of
> "weakly-typed" references already -- Gerard?"
> This is really important.  The objections you are raising to the MC models
> have little to do with the model,
> and everything to do with VO-DML rules and Annotation syntax.
>    * A VO-DML model identifies the models it uses.. these are explicit
> declarations of the model/version.
>    * There is no allowance for a model to reference any "version of
> coords:Position"

Well, that the models are entangled certainly is a property of the
models themselves.  And I believe even with the current VO-DML we can
build at least annotations for distribution ("value/error"),
coordinates metadata, and photometry that are nicely isolated from
each other.

Now that I'm here, let me get back to your mail of Sep 15 that I'd
originally saved for when it was a bit quieter here again:

On Tue, Sep 15, 2020 at 12:18:45PM -0400, CresitelloDittmar, Mark wrote:
> > On Mon, Sep 7, 2020 at 10:54 AM Markus Demleitner <
> > msdemlei at ari.uni-heidelberg.de> wrote:
> > But shouldn't such applications simply look for Coordinates instances
> > rather than going as deep as Measurements?
> >
> Coordinates is deeper than Measurements.

You see, this is exactly what unnerves me.  To me, Measurements in
the end just says "There's a value that's actually drawn from a
distribution, and here are that distribution's parameters" [ok, we're
not really there yet, but value/error in the end isn't really much
different].

Yes: That may plausibly want multivariate distributions one day.

But that doesn't mean we should be abusing vectors for their
arguments.  Not every tuple of numbers is a vector [one of the greatest
elaborations of that fact I know is found in chapter 11 of vol. 1 of
the Feynman Lectures].

And in particular the simple fact that any measurement is, in the
end, an estimation of a distribution should not be built on the
complexities of spherical geometry, with proper motions, reference
frames, time scales, etc -- these are, to me obviously, a lot less
generic concepts.

Hence, I'd say the two models should stand independently next to each
other.  One annotates (conceptually exact) positions in space-time,
the other reflects that, where these are measurements or other
estimations, the values given are really parameters of distributions.

Nice isolation, while IMHO reflecting the mathematical background
just fine.

And while I've quoted from this mail, let me briefly comment on the
other point made in this mail:

> > > The difference is basically:
> > >   * with just GenericMeasure your data product will only be making the
> > > statement
> > >       "This TABLE has COLUMNS"
> > >   * with the property-based Measures, you are describing Entities
> > >       "This Cube has Position, Time, etc...
> >
> > As you know I'm always trying to limit the number of ways we have in
> > the VO to do roughly the same thing, so: How is this different from
> > attaching a UCD to the GenericMeasure?
> >
> 
> IMO: you are mixing "model" and "serialization" here.
> I would rephrase your sentence as:
>   "A standard serialization of the Position, Time, etc model objects in
> VOTable COULD be to attach an appropriate UCD to the <GROUP> element."
> But, that does NOT mean that UCD should be a model element... it is a means
> of representing the concept which may work well for VOTables, but not
> necessarily for other formats.

Why would an attribute "ucd" be a problem for any sort of
serialisation any more than different classes?  UCDs are by no means
bound to VOTables -- they're not even specified in the VOTable
document.

          -- Markus