Vocabulary terms in MANGO

Wed Nov 6 13:57:44 CET 2024

Hi Semantics

Thank you Markus for this very complete answer.

I have added a few comments below, but in short I propose
the following actions:

- create a vocabulary named "property_qualifier"
     - "property" because this is the generic word used
       by MANGO for measurements/quantities/...
     - "qualifier" because I've no better option for something
       which specifies the nature of a property
- serialize it as "property_qualifier.desise"
     - is desise a suitable format?
- put this file in a 'vocabulary' folder in the MANGO project
   to make it part of the RFC
- change the model accordingly
- open a PR with these changes
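
To make the serialization step more concrete, here is a minimal Python sketch of
what generating such a file could look like. Everything in it is an assumption
to be checked: the key names (uri, flavour, terms, label, description, wider,
narrower) reflect my reading of the desise layout and must be verified against
the Vocabularies in the VO specification, the URI is hypothetical, and the
definitions are placeholders. Only the three term names come from the
calibration-level example further down.

```python
import json


def make_term(label, description, wider=(), narrower=()):
    """Build one term entry (key names are an assumed desise-like layout)."""
    return {
        "label": label,
        "description": description,
        "wider": list(wider),
        "narrower": list(narrower),
    }


# Hypothetical content: the URI and the definitions are placeholders only.
vocabulary = {
    "uri": "http://www.ivoa.net/rdf/property_qualifier",  # hypothetical URI
    "flavour": "SKOS",  # assumption, to be confirmed
    "terms": {
        "calibration-level": make_term(
            "Calibration level",
            "Placeholder: definition to be agreed on the list.",
            narrower=["raw-data", "calibrated-data"],
        ),
        "raw-data": make_term(
            "Raw data",
            "Placeholder: definition to be agreed on the list.",
            wider=["calibration-level"],
        ),
        "calibrated-data": make_term(
            "Calibrated data",
            "Placeholder: definition to be agreed on the list.",
            wider=["calibration-level"],
        ),
    },
}


def serialize(vocab):
    """Return the vocabulary as a JSON string with stable key order."""
    return json.dumps(vocab, indent=2, sort_keys=True)
```

The string returned by serialize(vocabulary) is what would be written to
"property_qualifier.desise", if that format turns out to be suitable.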

On 05/11/2024 at 18:32, Markus Demleitner via semantics wrote:
> Hi Laurent,
> 
> On Tue, Nov 05, 2024 at 05:39:32PM +0100, Laurent Michel via semantics wrote:
>> In some places, this model uses literal enumerations to specify some property roles:
>>
>>      - Calibration level
>>      - Colour definition (magnitude ratio vs hardness ratio)
>>      - Shape definition (MOC, STCS...)
>>      - Possibly error distributions (poisson, gaussian...) (see issue #59)
>>
>>
>> This works fine, but has a major drawback: if some enumeration needs to be
>> updated, we would have to create a new model version via a new RFC process.
>> :-(
>>
>> This issue has already been discussed in the IVOA and there was a consensus
>> that vocabularies should preferably be used in data models AFAIR.
>>
>> My questions are:
>> ================
>>
>> 1) Do you still think it is better to replace literal enumerations
>> with a vocabulary?
> 
> Generally, a vocabulary is good if you can write some sort of generic
> code to handle things; in-schema definitions are preferable when
> "deep" code changes are likely necessary to deal with some change.
> For instance, I doubt that any code will sensibly be able to deal
> with an addition to an enumeration starting with moc, stc-s,
> dali-shape.  Each of these items needs a profoundly different
> treatment with very little common code.
> 
> Against that, a new calibration level probably won't break any code;
> statistical distributions... are somewhere in between.

I was more focused on the need to make it easier to add new concepts
without breaking the model, and thus the code using it,
but your criterion for using a vocabulary is also relevant.
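
To illustrate both points, here is a minimal Python sketch, under the assumption
that client code only needs to check that a value is a known term. The class
CalibrationLevelEnum, the two helper functions and the term
"hypothetical-new-level" are invented for the illustration and are not part of
MANGO.

```python
from enum import Enum


class CalibrationLevelEnum(Enum):
    """Literal enumeration frozen in the model (hypothetical names)."""
    RAW = "raw-data"
    CALIBRATED = "calibrated-data"


def is_valid_enum(value):
    """Accept only values hard-coded in the model enumeration."""
    return value in {member.value for member in CalibrationLevelEnum}


def is_valid_term(value, terms):
    """Generic check: accept any term present in the loaded vocabulary."""
    return value in terms


# Terms as they could be loaded from a vocabulary file at run time.
terms = {"raw-data", "calibrated-data"}

# Adding a term is a data change only; is_valid_term itself is untouched.
terms.add("hypothetical-new-level")
```

With the enumeration, accepting "hypothetical-new-level" requires editing the
model and releasing a new version; with the vocabulary, the same generic
function keeps working. This only holds for cases like calibration levels,
where no term-specific treatment is needed.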

> 
> An extra consideration: Whenever there is some implied hierarchy,
> vocabularies are particularly attractive, even when specialised code
> is necessary, for instance, if you can fall back to more general code
> for a more general term, perhaps losing some precision.  As an
> example, consider reference positions, where for light travel time
> corrections it is still useful to know something is in low earth
> orbit even if you don't have an HST ephemeris (say).
> 
>> 2) Should this vocabulary be specific to MANGO or should it have a
>> wider scope?
> 
> Vocabularies should be designed to model a particular part of the
> reality.  Anything dealing with that part of the reality should be
> able to use that model.  Hence: No vocabulary should be marked as
> Mango-private, and when designing them, please keep reusability in
> mind.

Understood. I propose "property_qualifier".
> 
> Since we have so many notions of calibration levels (see the EPN-TAP
> spec for how bad the situation is in solar system science alone), I
> would particularly like a shared vocabulary there.

OK, let's start with something simple,
e.g.
                   |- raw-data
calibration-level -|- ...
                   |- calibrated-data

The wider term tells what we are talking about and the narrower terms apply to specific cases.
I guess that a bit more hierarchy may be useful to merge concepts currently
defined in different contexts, but let's see that later.
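
As a rough illustration of how a client could exploit such a hierarchy, here is
a minimal Python sketch of the fallback to a more general term that Markus
describes above. The WIDER table only contains the three terms of my example,
and the function name resolve and its signature are hypothetical.

```python
from collections import deque

# Each term maps to its wider terms; content limited to the example above.
WIDER = {
    "calibration-level": [],
    "raw-data": ["calibration-level"],
    "calibrated-data": ["calibration-level"],
}


def resolve(term, supported, wider=WIDER):
    """Return the closest term (the term itself or one of its ancestors)
    that the client supports, or None if there is no such term."""
    queue = deque([term])
    seen = set()
    while queue:
        current = queue.popleft()
        if current in seen:
            continue
        seen.add(current)
        if current in supported:
            return current
        # Walk up the hierarchy, nearest wider terms first.
        queue.extend(wider.get(current, []))
    return None
```

A client that only knows "calibration-level" would get that term back for
"raw-data" and could apply its generic treatment, perhaps losing some
precision, while a client that knows "raw-data" gets the exact term.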
> 
>> 3) Should it be included in some existent IVOA vocabulary?
> 
> Included?  Probably not.  If no existing vocabulary does what you
> need, create a new one.  Semantic resources in general are easier to
> use when they are small and cover only small parts of the reality,
> and become harder as their extension (the subset of the world they
> cover) grows.
> 
Looking at https://www.ivoa.net/rdf/, I see no obvious place for the terms I'm proposing.
The scope of MANGO is to enhance the description of complex quantities spread over several table columns.
AFAIK, there is no vocabulary covering this field yet.

> On the other hand, you should avoid creating vocabularies that in
> some sense share concepts with other vocabularies without a very good
> reason (such a very good reason might exist, for instance, for having
> object-type in addition to the full UAT, because the restriction to
> object types may allow you to use a more rigorous formalism).
> 
>> 4) What is the process to create a new vocabulary (a simple VEP?)?
> 
> You write down the terms and definitions, ask here for opinions
> and explain in your WD how to use the vocabulary.  After that, we
> upload the vocabulary to the vocabulary repo with all terms marked as
> preliminary, and it is being developed alongside your WD; it can still
> be arbitrarily edited and even be withdrawn at that point.  It is
> being community-reviewed together with the PR and as part of your
> standard's RFC.  When your standard becomes REC, the vocabulary
> becomes stable, and only then do VEPs come into play.

I think the actions listed above apply this roadmap to the context of a model RFC.
If not, please tell me.

> 
>              -- Markus
> 

Laurent