MIVOT: fully qualified attribute names

Mon Feb 13 14:26:07 CET 2023

Dear DM,

On 06/02/2023 at 11:08, Markus Demleitner wrote:
> Dear DM,
> 
> On Fri, Feb 03, 2023 at 03:50:40PM +0100, Laurent Michel wrote:
>> Let's quickly summarize the answer (editor hat):
>>
>> MIVOT acts as a bridge between data and model.
>>      - Data elements are identified by ATTRIBUTE@ref
>>      - Model elements are identified by their roles as defined by VODML.
>>      - Roles are defined in the scope of the parent objects.
>>      - A role must be unique among the children of the enclosing object
> 
> Well: None of these imply fully qualified attribute names, right?
>

Roles identify object components: thus we have to make sure they are unique.

>> Let's take an Example:
>> =====================
>>
>> The following statement:
>>
>>      <ATTRIBUTE dmrole="ivoa:Quantity.val" dmtype="ivoa:real" ref="_column_xyz" />
> 
> I think for figuring out the implications of plain attribute names,
> one has to see the context, which is something like:
> 
> <INSTANCE dmtype="ivoa:RealQuantity">
>    <ATTRIBUTE dmrole="ivoa:Quantity.val" dmtype="ivoa:real" ref="_column_xyz" />
> 
> </INSTANCE>

The way to map ivoa:Quantity has been clarified by MR #180 (element_instance.tex);
see section 4.9, listing 13.

> 
> That is: the type the attribute actually sits on (albeit not the root
> class that defined the attribute in the hierarchy) of course remains
> visible.
> 
>> It is simpler for sure, but with 2 major drawbacks:
>>
>>      1) There is no way for a split role to retrieve the model element it refers to.
>>         This is a major issue for a validator which could no longer validate the MIVOT
>>         statement against the mapped models.
> 
> I'd argue to the contrary: A validator, when seeing the ATTRIBUTE
> element, is in the context of parsing a RealQuantity, and it will
> know the content model of that.  The content model, however, has an
> attribute named "val", *not* an attribute named "ivoa:Quantity.val".
> With the fully qualified attribute names, the validator will have to
> parse dmrole.  Incidentally, for all I can see, there aren't even
> rules in MIVOT on how to do that.
> 
> So, I'd strongly suggest plain attribute names make the job a
> validator (and more importantly, general parsers) simpler rather than
> harder.

I'm just saying that roles are given by the models. If you want to validate a mapping block against a model, the mapping
has to use the roles set by that model. This must work even in a context where elements of different models coexist.
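
As a rough illustration of that point, a validator could compare each dmrole found in a mapping block against the roles declared by the mapped models. The sketch below is only that: `MODEL_ROLES` and `check_role` are hypothetical names, and apart from `ivoa:Quantity.val` (taken from the example above) the role strings are placeholders, not roles checked against the actual VODML files.

```python
# Hypothetical role catalogue. In practice it would be extracted from the
# VODML files of the mapped models; the entries here are illustrative only.
MODEL_ROLES = {
    "ivoa": {"ivoa:Quantity.val", "ivoa:Quantity.unit"},
    "meas": {"meas:Measure.coord", "meas:Measure.error"},
}


def check_role(dmrole: str) -> bool:
    """Return True if dmrole is declared by one of the known models."""
    prefix, sep, _ = dmrole.partition(":")
    if not sep:
        # No model prefix: nothing tells us which model defines the role.
        return False
    return dmrole in MODEL_ROLES.get(prefix, set())


if __name__ == "__main__":
    for role in ("ivoa:Quantity.val", "val", "foo:Bar.baz"):
        print(role, check_role(role))
```

With qualified roles the check is a plain set lookup per model prefix; a bare `val` gives the lookup nothing to tell which model it belongs to.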

> 
>>      2) if the enclosing object has 2 attributes with a role ending with e.g. ".val"
>>         there is no way to distinguish them.
> 
> Uh... how could that happen?  You're saying the current DM landscape
> would allow a class like:
> 
>    class Composed(Parent1, Parent2):
>      val: string   # from Parent1
>      val: real     # from Parent2
> 
> If that is true, I'd strongly suggest we need to fix it.  Given the
> way all computer languages supporting record types work, that would
> be a huge bummer.
> 
> But frankly (and I didn't even bother to look it up in VODML), I don't
> think that's possible, and so I'd suggest that will not be a problem
> in practice.
> 

VODML has the capability of putting together elements from different models (e.g. ivoa/Measure/Coordinates).
On top of this, MIVOT has the capability of referencing different VODML models.
Each of those items has a specific dmtype defined by the model it comes from and a dmrole defined by the model using it.
Allowing MIVOT to redefine these roles would just bring confusion.

>> Working with shortened roles could make sense for a client knowing the
>> classes it has to parse but doing this at the mapping level would
>> break the consistency of the annotations.
> 
> Well, I submit that almost always, the client will be parsing MIVOT
> with a good idea of what it would like to see ("a thing with a value
> and some sort of error specification", say).  I'd hence say it's a
> safe assumption it knows "I'm looking for instances of class X, which
> will have attributes value and error".

If your client is very comfortable with the mapped model, it could just replace role==x.y.z with role.endswith(z).
But please do not let naive clients get lost because of shortened roles.
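
To make the difference concrete, here is a minimal sketch of the two lookup strategies. The helper names are hypothetical, and `other:Thing.val` is an invented role used only to show what happens when elements of two models sit in the same block.

```python
def find_exact(roles, wanted):
    """Return the roles that match the fully qualified role exactly."""
    return [r for r in roles if r == wanted]


def find_by_suffix(roles, short_name):
    """Return the roles whose last component equals short_name."""
    return [r for r in roles if r.endswith("." + short_name)]


# "ivoa:Quantity.val" comes from the example in this thread;
# "other:Thing.val" is made up for the purpose of the illustration.
roles = ["ivoa:Quantity.val", "other:Thing.val"]

if __name__ == "__main__":
    print(find_exact(roles, "ivoa:Quantity.val"))
    print(find_by_suffix(roles, "val"))
```

A client that knows its model can use the suffix match at its own risk; the exact match stays unambiguous for everybody else.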

> 
> No, where I think the idea of the fully qualified attribute names
> came from was to lessen the pains of inheritance.  Say, a client
> doesn't know about RealQuantity but just about Quantity.  When we
> fully qualify the attribute names, it still can find attributes that
> are already part of Quantity.

The Quantity case, as a VODML component, has been solved (see above)

> 
> I will not speculate on whether or not that's a useful thing to do --
> as I said, we have far too little adoption of the whole thing to make
> a call.  But *if* we actually want to enable this kind of thing, we
> could go back to the proposal of annotating the class hierarchy used
> in the instance, perhaps by making dmtype an element rather than an
> attribute; that would still be a lot easier on annotators than the
> fully qualified attribute names.

The MIVOT design guidelines are explained in section 4, in particular the respective roles of the XML elements and
their attributes.

> 
> Whether or not it's useful (and if you think it's important, at least
> provide a non-contrived prototype implementation of that), solving it
> through the fully qualified attribute names seems extremely unwise,
> as it explodes the annotation process.

I have worked a lot on the annotation process, and this has never bothered me.
I always use components (XML snippets).
If I see that my query result has a position, I insert a copy of the position snippet into the MIVOT block.
I have never seen a situation where using longer (qualified) roles caused any trouble.

> 
> Let me try to explain again.
> 
> When people do the annotation, they'll look for an appropriate class
> and they'll find, say, RealQuantity and SymmetricGaussianError in
> some diagram and see there's the attributes value and error on
> RealQuantity, and fwhm on SymmetricGaussianError.  What I'd like them
> to be able to write in their resource descriptors is:
> 
>    (RealQuantity) {
>      value: @my_field
>      error: (SymmetricGaussianError) {
>        fwhm: @error_my_field
>      }
>    }
> 
> in order to link up the fields my_field and error_my_field.
> 
> That's, I think, relatively doable (at least if we improve our
> documentation a bit).
> 
> However, with fully qualified attribute names, either
> 
> (a) the annotator will have to follow the class hierarchies to figure
> out that, say, the value attribute is defined in the Quantity class
> and fwhm in, perhaps, the GaussianErrorLike class, and then write the
> "ivoa:Quantity.value" in the annotation block.  I've tried this
> "follow the inheritance hierarchy" thing and I despaired.  I simply
> don't want to hurl that at my users.
> 
> or (b), I (or rather, DaCHS) knows the VODML of all the classes
> involved and follows the inheritance hierarchies of all attributes it
> sees to the first class declaring them.  That's a possibility, but
> for this reference implementation, as I said, that's an order of
> magnitude more effort.
> 
> I think reference implementations are there to uncover such little
> snags where a requirement of questionable value causes an
> disproportionate amount of work.  There's nothing wrong with having
> such snags in a working draft (or PR) -- writing specs is hard.

This is why we spent so much time writing that specification and exercising it.

> 
> But it is, I claim, wrong to just carry them over to a REC once we
> *have* noticed them -- unless, of course, there is an overriding
> reason that justifies all that extra work.  But that I still don't
> see.
> 
> On the other hand, if that reason exists, we at least need to specify
> an implementable and clear algorithm that lets me compute all
> qualified attribute names for all attributes in a piece of VODML.
> For all I can tell, that is missing from MIVOT, so far -- the thing
> with "go up to the first class defining the attribute" is just my
> interpretation of the examples given.
> 
>            -- Markus

I'm not sure I follow you here.

Imagine that your annotator detects that the query result has a sky position.
We can assume that:
- it is able to build a MIVOT representation of a Position object (e.g. with an instance of meas:LonLatPosition)
- it knows that Position objects require Errors

Unfortunately, it cannot guess which concrete error must be used for that specific case.
This information can only be inferred from the dataset metadata. Neither VODML nor MIVOT can help with this.

If, by some magic, the annotator discovers that the positional error for that particular dataset is a meas:SymmetricGaussianError,
it just has to build an instance of meas:SymmetricGaussianError (I definitely prefer using preset components) and put it at
the right place in the Position mapping block.
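
A minimal sketch of that "preset component" step, using Python's standard xml.etree.ElementTree. The dmtype values come from this thread, but the dmrole strings (`meas:Measure.error`, `meas:SymmetricGaussianError.fwhm`) and the `ivoa:real` type of fwhm are placeholders I am assuming for the illustration; the real ones have to be read from the model. `make_error_snippet` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET


def make_error_snippet(fwhm_ref: str) -> ET.Element:
    """Build a preset INSTANCE for a symmetric Gaussian error.

    The dmrole values below are assumed placeholders, not roles taken
    from the Meas VODML file.
    """
    inst = ET.Element(
        "INSTANCE",
        {"dmrole": "meas:Measure.error",
         "dmtype": "meas:SymmetricGaussianError"},
    )
    ET.SubElement(
        inst,
        "ATTRIBUTE",
        {"dmrole": "meas:SymmetricGaussianError.fwhm",
         "dmtype": "ivoa:real",
         "ref": fwhm_ref},
    )
    return inst


# The Position mapping block receives the error component as a child.
position = ET.Element("INSTANCE", {"dmtype": "meas:LonLatPosition"})
position.append(make_error_snippet("_error_my_field"))

if __name__ == "__main__":
    print(ET.tostring(position, encoding="unicode"))
```

The annotator only has to supply the column reference; the roles travel with the snippet.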

I can understand the need to provide implementation tips, but I do not see the connection with using qualified names or not.

VODML gives you the model hierarchy with all related types and roles, and MIVOT connects that stuff with data. I do not see any
flaw in this process.
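
For what it is worth, the lookup discussed above ("go up to the first class defining the attribute") can be sketched in a few lines once the hierarchy is available. This is an assumption about how qualified roles could be derived, not an algorithm taken from the MIVOT text. The `HIERARCHY` table is a toy stand-in for what would be read from a VODML file, and `qualified_role` is a hypothetical name.

```python
# Toy hierarchy: class name -> (parent class or None, attributes declared
# on that class). Real content would be parsed from the VODML files.
HIERARCHY = {
    "ivoa:Quantity": (None, ["val"]),
    "ivoa:RealQuantity": ("ivoa:Quantity", []),
}


def qualified_role(cls: str, attr: str) -> str:
    """Walk up from cls until a class declaring attr is found."""
    current = cls
    while current is not None:
        parent, declared = HIERARCHY[current]
        if attr in declared:
            return f"{current}.{attr}"
        current = parent
    raise KeyError(f"{attr!r} is not declared on {cls} or its ancestors")


if __name__ == "__main__":
    print(qualified_role("ivoa:RealQuantity", "val"))
```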

Laurent on behalf of the authors

--
English version: https://www.deepl.com/translator
-- 
jesuischarlie/Tunis/Paris/Bruxelles/Berlin

Laurent Michel
SSC XMM-Newton
Tél : +33 (0)3 68 85 24 37
Fax : +33 (0)3 68 85 24 32
Université de Strasbourg <http://www.unistra.fr>
Observatoire Astronomique
11 Rue de l'Université
F - 67200 Strasbourg