MIVOT: fully qualified attribute names

Mon Feb 6 11:08:41 CET 2023

Dear DM,

On Fri, Feb 03, 2023 at 03:50:40PM +0100, Laurent Michel wrote:
> Let's quickly summarize the answer (editor hat):
>
> MIVOT acts as a bridge between data and model.
>     - Data elements are identified by ATTRIBUTE@ref
>     - Model elements are identified by their roles as defined by VODML.
>     - Roles are defined in the scope of the father objects.
>     - A role must be unique among the enclosing object children

Well: None of these imply fully qualified attribute names, right?

> Let's take an Example:
> =====================
>
> The following statement:
>
>     <ATTRIBUTE dmrole="ivoa:Quantity.val" dmtype="ivoa:real" ref="_column_xyz" />

I think for figuring out the implications of plain attribute names,
one has to see the context, which is something like:

<INSTANCE dmtype="ivoa:RealQuantity">
  <ATTRIBUTE dmrole="ivoa:Quantity.val" dmtype="ivoa:real" ref="_column_xyz" />

</INSTANCE>

That is: the type the attribute actually sits on (albeit not the root
class that defined the attribute in the hierarchy) of course remains
visible.

> It is simpler for sure, but with 2 major drawbacks:
>
>     1) There is no way for a split role to retrieve the model element it refers to.
>        This is a major issue for a validator which could no longer validate the MIVOT
>        statement against the mapped models.

I'd argue to the contrary: A validator, when seeing the ATTRIBUTE
element, is in the context of parsing a RealQuantity, and it will
know the content model of that.  The content model, however, has an
attribute named "val", *not* an attribute named "ivoa:Quantity.val".
With the fully qualified attribute names, the validator will have to
parse dmrole.  Incidentally, for all I can see, there aren't even
rules in MIVOT on how to do that.

So, I'd strongly suggest plain attribute names make the job of a
validator (and more importantly, of general parsers) simpler rather
than harder.
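
A minimal Python sketch of the two lookups, to make the comparison
concrete.  Everything in it is an assumption for illustration: the
CONTENT_MODEL table, the function names, and in particular the rule
"the attribute name is whatever follows the last dot", which is a
guess because MIVOT does not seem to state a parsing rule for dmrole.

```python
# Toy content model (hypothetical): class name -> plain attribute name -> type.
CONTENT_MODEL = {
    "ivoa:RealQuantity": {"val": "ivoa:real"},
}


def check_plain(instance_type, role):
    """Validate a plain role: a single lookup in the content model."""
    return role in CONTENT_MODEL[instance_type]


def check_qualified(instance_type, role):
    """Validate a qualified role such as 'ivoa:Quantity.val'.

    Assumed rule (not taken from MIVOT): split at the last dot and
    treat the remainder as the plain attribute name.
    """
    _declaring_class, sep, name = role.rpartition(".")
    if not sep:
        # No dot at all, so this is not a qualified role.
        return False
    return name in CONTENT_MODEL[instance_type]
```

With plain names the validator needs only the lookup; with qualified
names it first needs a splitting rule that somebody has to specify.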

>     2) if the enclosing object has 2 attributes with a role ending with e.g. ".val"
>        there is no way to distinguish them.

Uh... how could that happen?  You're saying the current DM landscape
would allow a class like:

  class Composed(Parent1, Parent2):
    val: string   # from Parent1
    val: real     # from Parent2

If that is true, I'd strongly suggest we need to fix it.  Given the
way all computer languages supporting record types work, that would
be a huge bummer.

But frankly (and I didn't even bother to look it up in VODML), I don't
think that's possible, and so I'd suggest that will not be a problem
in practice.
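
As an illustration of how one common record type treats such a
clash, here is a small Python sketch using dataclasses.  It says
nothing about what VODML itself permits; the class names are simply
the ones from the pseudocode above.

```python
from dataclasses import dataclass, fields


@dataclass
class Parent1:
    val: str


@dataclass
class Parent2:
    val: float


@dataclass
class Composed(Parent1, Parent2):
    pass


# Fields are keyed by name, so only one "val" survives.  Parent1 comes
# first in the method resolution order, hence its declaration wins.
field_names = [f.name for f in fields(Composed)]
surviving_type = fields(Composed)[0].type
```

Here field_names holds a single entry: the record cannot carry two
attributes with the same name, which is why such a model would be
hard to map to ordinary client code.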

> Working with shorten role could make sense for a client knowing the
> classes it has to parse but doing this at the mapping level would
> break the consistency of the annotations.

Well, I submit that almost always, the client will be parsing MIVOT
with a good idea of what it would like to see ("a thing with a value
and some sort of error specification", say).  I'd hence say it's a
safe assumption it knows "I'm looking for instances of class X, which
will have attributes value and error".

No, where I think the idea of the fully qualified attribute names
came from was to lessen the pains of inheritance.  Say, a client
doesn't know about RealQuantity but just about Quantity.  When we
fully qualify the attribute names, it still can find attributes that
are already part of Quantity.

I will not speculate on whether or not that's a useful thing to do --
as I said, we have far too little adoption of the whole thing to make
a call.  But *if* we actually want to enable this kind of thing, we
could go back to the proposal of annotating the class hierarchy used
in the instance, perhaps by making dmtype an element rather than an
attribute; that would still be a lot easier on annotators than the
fully qualified attribute names.

Whether or not it's useful (and if you think it's important, at least
provide a non-contrived prototype implementation of that), solving it
through the fully qualified attribute names seems extremely unwise,
as it explodes the annotation process.

Let me try to explain again.

When people do the annotation, they'll look for an appropriate class
and they'll find, say, RealQuantity and SymmetricGaussianError in
some diagram and see there are the attributes value and error on
RealQuantity, and fwhm on SymmetricGaussianError.  What I'd like them
to be able to write in their resource descriptors is:

  (RealQuantity) {
    value: @my_field
    error: (SymmetricGaussianError) {
      fwhm: @error_my_field
    }
  }

in order to link up the fields my_field and error_my_field.

That's, I think, relatively doable (at least if we improve our
documentation a bit).

However, with fully qualified attribute names, either

(a) the annotator will have to follow the class hierarchies to figure
out that, say, the value attribute is defined in the Quantity class
and fwhm in, perhaps, the GaussianErrorLike class, and then write the
"ivoa:Quantity.value" in the annotation block.  I've tried this
"follow the inheritance hierarchy" thing and I despaired.  I simply
don't want to hurl that at my users.

or (b), I (or rather, DaCHS) knows the VODML of all the classes
involved and follows the inheritance hierarchies of all attributes it
sees to the first class declaring them.  That's a possibility, but
for this reference implementation, as I said, that's an order of
magnitude more effort.
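
For concreteness, option (b) could be sketched in Python roughly as
follows.  This is only one reading of the examples, not an algorithm
taken from MIVOT.  The MODEL table is a hypothetical stand-in for
parsed VODML, it assumes single inheritance, it assumes value is
declared on Quantity and fwhm on GaussianErrorLike, and it assumes
one model prefix for all classes.

```python
# Toy stand-in for parsed VODML (hypothetical):
# class name -> (parent class or None, attributes declared on that class).
MODEL = {
    "Quantity": (None, ["value"]),
    "RealQuantity": ("Quantity", []),
    "GaussianErrorLike": (None, ["fwhm"]),
    "SymmetricGaussianError": ("GaussianErrorLike", []),
}


def declaring_class(cls, attr):
    """Walk up from cls and return the class that declares attr."""
    current = cls
    while current is not None:
        parent, declared = MODEL[current]
        if attr in declared:
            return current
        current = parent
    raise KeyError(f"{cls} has no attribute {attr}")


def qualified_role(cls, attr, prefix="ivoa"):
    """Build a qualified role name; the single prefix is an assumption."""
    return f"{prefix}:{declaring_class(cls, attr)}.{attr}"
```

Even this toy version needs the full class hierarchy of every model
involved before a single role name can be written down.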

I think reference implementations are there to uncover such little
snags where a requirement of questionable value causes a
disproportionate amount of work.  There's nothing wrong with having
such snags in a working draft (or PR) -- writing specs is hard.

But it is, I claim, wrong to just carry them over to a REC once we
*have* noticed them -- unless, of course, there is an overriding
reason that justifies all that extra work.  But that I still don't
see.

On the other hand, if that reason exists, we at least need to specify
an implementable and clear algorithm that lets me compute all
qualified attribute names for all attributes in a piece of VODML.
For all I can tell, that is missing from MIVOT, so far -- the thing
with "go up to the first class defining the attribute" is just my
interpretation of the examples given.

          -- Markus