MIVOT: fully qualified attribute names

Wed Mar 8 12:20:41 CET 2023

> On 7 Mar 2023, at 22:52, Laurino, Omar <olaurino at cfa.harvard.edu> wrote:
> 
> I really hope I didn't miss some connections in this long thread, but...
>  
>> The real point is that the VODML-ID "coords:TimeFrame.timescale" could be "coords:a.b" according to the VO-DML standard: there is no connection between the vodml-id and the name of the model element as defined in the standard. I want to make the connection, and once the connection is made, the VODML-ID is redundant as it can be generated from the model structure.
> 
> A change could be made to the VODML spec to make the vodml-id generation a requirement rather than a preference, by promoting Appendix C to normative status. And while I remember believing that both approaches (full vodml-id or just name) would work, as long as the mapping provides enough markup to make the references unambiguous, I did have a preference for the full vodml-id for two reasons: (1) because explicit is better than implicit and (2) because it is more future-proof.

I had not noticed Appendix C. So yes, I would support making it normative and moving it out of the appendix; that does at least part of what I am arguing for. In fact I care more about this happening than about my slightly more controversial desire to remove VODML-IDs... they would then just be repeated information.

> If I understand Paul's point correctly, I'd like to point out that the reason for having the entire vodml-id was to make sure that a model's element could always be identified unambiguously in any context, in particular when extending models. VODML allows data providers to extend a type (section 4.6.1). When they do, parsers need a way to identify fields in an unambiguous way, which includes mapping them to the model document where they are defined.
> 
> In that sense, the vodml-id becomes redundant not only if one makes the connection with the name, but also if a mapping scheme defines a way to represent extensions that provides that unambiguous mechanism. If an instance is of type <custom:MyType> (which extends standard:Type), one would have attributes identified by <custom:MyType.myAttribute> and <standard:Type.attribute> within that instance, which the parser could map to the respective definitions without having to rely on any heuristics or complex logic.
> 
> If one has a <custom:MyType> instance with attributes <myAttribute> and <attribute> the parser wouldn't really know where to look them up unless the connection between <custom:MyType> and <standard:Type> is made explicit in the serialization markup. And even in that case, since the parser doesn't know whether myAttribute is defined in custom: or in standard: it'll have to try both.

Agreed that you would have to traverse the hierarchy, but I am not convinced that much value is obtained from having a data model representation unless you are prepared to do that. Even if you have some code that does a simple switch-case-statement-like string match on VODML-IDs to do “stuff” for a standard:Type, how are you going to arrange things for the custom:MyType? It quickly gets messy if you just keep adding to the monster “global” switch statement for all the possibilities. It is actually not impossible that you might want to do something different for <attribute> when it appears in custom:MyType compared with when it appears in standard:Type.
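
As a rough illustration of the alternative to one global switch statement, handlers can be registered per (type, attribute) pair and looked up by walking the parent chain. This is only a sketch: the PARENT table, the handler names and the "otherAttribute" entry are hypothetical and not taken from any model or library.

```python
# Hypothetical single-inheritance table: type -> parent (None at the root).
PARENT = {"custom:MyType": "standard:Type", "standard:Type": None}

# Registry of handlers keyed by (type, attribute name).
HANDLERS = {}

def handler(type_ref, attr):
    """Decorator that registers a function for one (type, attribute) pair."""
    def register(fn):
        HANDLERS[(type_ref, attr)] = fn
        return fn
    return register

@handler("standard:Type", "attribute")
def standard_attribute(value):
    return f"standard handling of {value}"

@handler("custom:MyType", "attribute")
def custom_attribute(value):
    # The subtype treats the inherited attribute differently.
    return f"custom handling of {value}"

@handler("standard:Type", "otherAttribute")
def standard_other(value):
    return f"standard handling of other {value}"

def dispatch(type_ref, attr, value):
    """Use the most specific handler, falling back to the parent types."""
    current = type_ref
    while current is not None:
        fn = HANDLERS.get((current, attr))
        if fn is not None:
            return fn(value)
        current = PARENT[current]
    raise LookupError(f"no handler for {attr} on {type_ref}")
```

With this arrangement, custom:MyType only needs entries for the attributes it treats differently; everything else falls through to the standard:Type handlers.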

> 
> People have argued in the past that inheritance requires parsers to have complex type algebra, which may be true depending on the use case and on the mapping strategy. However, extensibility was one of the main requirements for VODML. A mapping strategy can minimize that effort by identifying an instance as both custom:MyType and standard:Type. And since we recommend vodml-ids to be generated algorithmically, a parser could decide to ignore model definitions completely, and parse the vodml-ids to display the attribute names, which would be human-readable. Other parsers would be interested in the unambiguous identification of attributes to provide richer context-dependent features to client software.
> 

I think that identifying attributes from an inheritance hierarchy would only become “difficult” (i.e. requiring more than the name) if VO-DML allowed multiple inheritance. However, it does not, so I do not think that the extra naming is necessary if there is a rule that attribute names have to be unique along the chain of parents in the hierarchy; the only case where overrides can occur is explicitly dealt with by subsetting.
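
A minimal sketch of that lookup, assuming single inheritance and attribute names that are unique along the parent chain. The TYPES table below is a made-up in-memory stand-in for parsed model documents, reusing the custom:MyType / standard:Type example from earlier in the thread.

```python
# Hypothetical in-memory model: type -> (parent type or None, own attribute names).
TYPES = {
    "standard:Type": (None, {"attribute"}),
    "custom:MyType": ("standard:Type", {"myAttribute"}),
}

def resolve(type_ref, attr_name):
    """Return the qualified id of attr_name by walking up the single parent chain."""
    current = type_ref
    while current is not None:
        parent, own_attrs = TYPES[current]
        if attr_name in own_attrs:
            # The first match is the only match if names are unique in the chain.
            return f"{current}.{attr_name}"
        current = parent
    raise KeyError(f"{attr_name} is not defined on {type_ref} or its parents")

print(resolve("custom:MyType", "myAttribute"))
print(resolve("custom:MyType", "attribute"))
```

Because each type has at most one parent, the walk is a simple loop with no ambiguity about which branch to search first.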

> 
> A reference to a full vodml-id is always going to unambiguously identify a single element, like a URI. I can go from custom:MyType.myAttribute to myAttribute and from standard:Type.attribute to attribute, but I can't go from myAttribute to custom:MyType.myAttribute without some effort parsing definition documents.
> 
So here I concede that more parsing of the definition documents would be needed without the VODML-ID. However, given that the definition documents are XML, it is necessary to do more than simple string matching to have any sort of robustness, so the documents would need to be read as a DOM at a minimum, and then traversing the hierarchy to find a particular element from a VODML-REF is not much more effort. If Appendix C were mandatory, then my objection to VODML-IDs rests on the DRY principle. To create an in-memory index for referencing the various elements, the key could be formed either by just reading the VODML-ID or by constructing it from the hierarchy path.
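
The two ways of forming the index key can be sketched as follows. The XML fragment is a hypothetical, heavily simplified stand-in for a model document (real VO-DML files are namespaced and much richer), and the dot-joined path rule is an assumption about what an Appendix C style generation rule would produce, not a quotation of it.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified model fragment used only for illustration.
MODEL_XML = """
<model>
  <name>coords</name>
  <objectType>
    <vodml-id>TimeFrame</vodml-id>
    <name>TimeFrame</name>
    <attribute>
      <vodml-id>TimeFrame.timescale</vodml-id>
      <name>timescale</name>
    </attribute>
  </objectType>
</model>
"""

# Element kinds that contribute a name to the path (assumed set).
NAMED = {"package", "objectType", "dataType", "attribute"}

def index_by_declared_id(root):
    """Key each element by the vodml-id it declares as a direct child."""
    return {el.findtext("vodml-id"): el
            for el in root.iter() if el.find("vodml-id") is not None}

def index_by_path(root):
    """Key each element by the dot-joined names on its path from the root."""
    index = {}
    def walk(parent, prefix):
        for child in parent:
            if child.tag not in NAMED:
                continue
            path = prefix + [child.findtext("name")]
            index[".".join(path)] = child
            walk(child, path)
    walk(root, [])
    return index

root = ET.fromstring(MODEL_XML)
declared = index_by_declared_id(root)
derived = index_by_path(root)
```

When the ids follow the generation rule, both dictionaries end up with the same keys, which is the sense in which the stored vodml-id repeats information already present in the structure.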

Paul.
