Comments on Vocabulary 2.0

Fri Aug 28 18:19:18 CEST 2020

Dear Franck,

Thanks for reviewing the Vocabularies 2 (VocInVO 2) WD.

On Fri, Aug 28, 2020 at 05:19:01PM +0200, Franck Le Petit wrote:
> 1 - Vocabulary 2.0 seems to imply that only official IVOA
> vocabularies can be used. 

Well, it is certainly not intended to say that, and I'm happy to
include any language you find necessary to make that clear.

Of course, you can adopt any sort of RDF tech that is useful for a
particular software or even standard, and use any sort of semantic
resources -- even full ontologies, say.  

However, in that case there is (probably) no need to have any sort of
IVOA standard, as W3C RDF would suffice.

Vocabularies 2 tries to cover specific use cases requiring consensus
vocabularies with certain usage properties.  Unsurprisingly, that
results in relatively strict requirements, intended to make life easy
for clients.  

If you can live without these requirements, that's fine.

> for the description of simulations and astrophysical codes,
> according to SimDM, many quantities need to be defined but are
> unique to some specific codes. The most obvious example are Input
> Parameters of codes, that often are specific to codes. It would be
> complicate to have these concepts in general IVOA vocabularies and
> to follow the procedure of recommendation of vocabularies described
> in Vocabulary 2.0. Other quantities are more general and can fit in

...and you probably don't have use cases that entail the strict
requirements for these.  Out of curiosity (also with a view to what
we may still have missed for VocInVO 2): Is there any software that
already uses these Vocabularies?  If so, what does it do with them?
Did you run into specific problems that might generalise to what we
are trying to do?

> "We find ourselves in the situation where there are multiple
> vocabularies in use, describing a broad range of resources of
> interest to professional and amateur astronomers, and members of
> the public. These different vocabularies use different terms and
> different relationships to support the different constituencies
> they cater for. … 

You are right that we dropped this outlook (see the introduction) --
in short, the reason is that it would effectively kill most of our
use cases, and at least in astronomy we're not aware anyone has
tried, let alone successfully used, such mappings so far.  Perhaps
that's happened in Theory? If so, I'd again be grateful to hear your
experiences.

That's not to say we'll never do mappings like these; for instance, I
think we want, if only informationally, links between an object type
vocabulary (if we ever have such a thing) and the UAT, and we're
already linking the upstream UAT and our (preliminary)
VocInVO2-compatible rendering.  

But by now I'm fairly convinced that expecting machines to do
any sort of inference based on this mapping is something that'll have
to wait until we have strong AI.  And there's not more than half of
my tongue in my cheek here.

> 2 - Vocabulary 2.0 seems to imply that each concept must have a
> proper definition. 

That is indeed what it requires.  I do believe that as soon as
multiple parties use the same vocabulary, single terms just aren't
good enough to convey meaning.  And semantics, in the end, is about
conveying meanings.

If there's one thing I'll defend with teeth and claws in the
proposal, it is the "must" on descriptions (and Mark Taylor has had a
hard time wearing me down on formulating all kinds of constraints
trying to force people not to cut corners on them).

Having said that, if you use your vocabularies in ways where
bare terms work for you, that's fine -- in that case you simply
don't need the complications VocInVO2 introduces to ensure
interoperability.

> - those gathering scientific concepts. Ofter they are large
> vocabularies, used in specific science domains and belong to
> disciplinary knowledge. Their definitions are known and defined in
> theses communities and so do not have to be managed at the level of
> the IVOA.

There is actually a provision for that in VocInVO2, following use
case 2.1.9.  I'm not overly happy with the whole thing, but then
we've bought into the UAT in VOResource, and there are clear use cases
for having a UAT with VocInVO2 guarantees.

Be that as it may, again I'd say "if you don't need an IVOA
vocabulary and want something else, by all means use that something
else".

> 3 - Vocabulary 2.0 seems to not support SKOS ALT labels. ALT label
> is a basis of web-semantics since it allows a system to manage
> synonyms. ALT label is used in implementations of SimDM. ALT labels

Ummmmmm... No.  Labels are for humans.  Alt labels may be useful in
supporting people in annotation, but if you really want to express
synonymity in a machine-readable way, use different resources
("terms" in VocInVO 2 lingo) and link them with the appropriate RDF
predicates (skos:exactMatch in this case).
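
To illustrate what "different resources linked with skos:exactMatch"
could look like from the machine side, here is a minimal sketch in
plain Python.  The term URIs are made up for the example, and the
triples are simple tuples rather than the output of any particular
RDF library; the only real identifier is the SKOS exactMatch URI.

```python
# Illustrative sketch: synonyms as separate terms linked by skos:exactMatch.
# All term URIs below are hypothetical examples.
from collections import defaultdict

EXACT_MATCH = "http://www.w3.org/2004/02/skos/core#exactMatch"

triples = [
    ("http://example.org/voc#star-formation-rate", EXACT_MATCH,
     "http://example.org/voc#sfr"),
    ("http://example.org/voc#sfr", EXACT_MATCH,
     "http://example.org/othervoc#StarFormationRate"),
]


def synonym_groups(triples):
    """Group terms connected by skos:exactMatch.

    The predicate is treated as symmetric and transitive, so a chain
    of matches ends up in a single group.
    """
    neighbours = defaultdict(set)
    for subj, pred, obj in triples:
        if pred == EXACT_MATCH:
            neighbours[subj].add(obj)
            neighbours[obj].add(subj)

    seen, groups = set(), []
    for start in neighbours:
        if start in seen:
            continue
        # Walk everything reachable from this term.
        group, todo = set(), [start]
        while todo:
            term = todo.pop()
            if term not in group:
                group.add(term)
                todo.extend(neighbours[term] - group)
        seen |= group
        groups.append(group)
    return groups
```

With the three hypothetical terms above, synonym_groups(triples)
yields one group containing all of them; no label string is involved
at any point.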

The reason I'm reserving skos:altLabel for *IVOA* vocabularies is
that I'd like clients to display labels rather than the
machine-readable terms (that's use case 2.1.6).  I therefore want to
make figuring out the labels as simple as possible, which means
clients shouldn't have to wonder which of several strings to choose.
Or perhaps to display all of them?

Now, it wouldn't be a killer to allow altLabels (clients could easily
ignore them in desise), but I'd have to have an actual use case
requiring them in our consensus vocabularies -- fewer features are
usually a win for a standard.
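
As a rough sketch of why a single label per term keeps clients
simple, consider the following Python fragment.  It assumes a
desise-like JSON layout in which "terms" maps each term identifier to
a record with a "label" and a "description"; the excerpt itself, the
term names, and the display_label helper are invented for
illustration and not taken from any actual IVOA vocabulary.

```python
import json

# Hypothetical excerpt in a desise-like layout (assumed structure:
# "terms" maps a term identifier to a record with "label" and
# "description").  Terms and texts are made up for this example.
DESISE_EXCERPT = json.loads("""
{
  "terms": {
    "spiral-galaxy": {
      "label": "Spiral Galaxy",
      "description": "Invented description for illustration."
    },
    "dwarf-galaxy": {
      "label": "Dwarf Galaxy",
      "description": "Invented description for illustration."
    }
  }
}
""")


def display_label(vocabulary, term):
    """Return the single human-readable label for term.

    Falls back to the bare term when the vocabulary does not know it
    or the record carries no label.
    """
    record = vocabulary["terms"].get(term)
    if record is None:
        return term
    return record.get("label", term)


print(display_label(DESISE_EXCERPT, "spiral-galaxy"))
```

With exactly one label per term, the lookup is a plain dictionary
access; there is no choice between several candidate strings for the
client to make.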

Meanwhile, don't worry that your external vocabularies might somehow
become ineligible for IVOA endorsement if they have altLabels --
even if no unforeseen reason to have them (which is why they're
reserved at this point) comes up by the time the question is raised,
the altLabels could just get stripped, the way we're doing it for the
UAT.

> In conclusion, Vocabulary 1.19 has been created with the goal to
> start to introduce web semantics in VO services. The next steps,
> would have been to work on the links / mappings between
> vocabularies and on the versioning of vocabularies. 
>
> Ref: - Vocabulary 1.19, section 5 : "Part of the motivation for
> formalising vocabularies within the VO is to support mapping
> between vocabularies, so that an application which understands, or

Yeah, that was a nice idea, but it basically kills use case 2.1.8
(offline operation), and I've not yet found a way to make this kind
of thing happen without letting 2.2.8 go down the drain (and even with
RDF tooling: if you've figured out a way to do something sensible
with these kinds of mappings, I'd truly be interested).

Dropping this whole outlook is, I'd say, the main reason why we're
not doing VocInVO 1.4 but VocInVO 2.0.

> Here, Vocabulary 2.0 seems to no more put the focus on web
> semantics but instead to have strong and rigorous rules to manage
> and use “simple” vocabularies. The goals seem different. I think
> that if a new standard is required, it should not break what has
> been done before and on which recommendations, as Theory I.G. ones,
> rely on. 

Well, breaking things is what major versions are for.  But of course
we don't want to get Theory into trouble, so: What exactly is it that
you'd need kept from VocInVO 1 in terms of regulations?  If all you
want is a licence to use SKOS the way you want to use it -- there's
no need for that (although as said above I'd happily include language
making that clear if you prefer).

> A solution could be to develop a bit Vocabulary 2.0 to distinguish
> between two kinds of vocabularies: those strongly tied to IVOA
> standards and infrastructures and those that describe scientific
> concepts. The first one, as the Datalink vocabulary, can follow the
> specifications described in Vocabulary 2.0, in particular a
> definition is expected for each concept and they must be approved

But why would we even constrain that second sort?  What would you
want VocInVO 2 to say about them?

> by IVOA people as the Semantics W.G. group. I am note sure the TCG
> is the proper level because all chairs of W.G. might not feel
> concerned. The latter one can have very long lists of concepts
> specific to scientific domains. The rules must be more flexible for
> those ones and they should be managed by scientists experts of the
> field. 

Oh, the reason the TCG should have a final say on all terms added is
that vocabularies are intended to be shared between standards --
there are few things more annoying than different IVOA standards doing
the same things in slightly different ways.  Hence, all WG chairs
should have a chance to stop something that will rain on their
parade.

Thanks,

              Markus