Vocabularies in the VO 2.0, first working draft

Fri Sep 6 10:24:50 CEST 2019

Dear Semantics WG,

The first working draft of the new Vocabularies in the VO 2.0 has
just hit the document repository at
http://ivoa.net/documents/Vocabularies/ (I suspect that URI might yet
change, but you'll get to the document from the index page in any
case).

While the basic features from the first internal WD I announced in
http://mail.ivoa.net/pipermail/semantics/2019-July/002626.html
haven't changed, a few details have; the most salient points are:

* Now requiring RDF class and property vocabularies to have tree-like
  hierarchies.
* SKOS now uses broader rather than narrower so its direction matches
  that of the RDF class/property flavours.
* Added the common property ivoasem:vocflavour, mainly to let clients
  figure out what vocabulary they're dealing with.
* Added sample code to parse the vocabularies ("revovo") both as a
  proof of concept and as the seed of a validator.
* Added XML advice (and an example) for the common properties.
* Requiring absolute URIs in RDF/XML.
* New use case/requirement on offline operation.
* Now requiring RDF/XML to use typed node elements for our term classes.
* No longer linking ivoasem:deprecated and ivoasem:useInstead to
  roughly matching OWL terms (that's technically a bit challenging and
  probably not worth it).
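
To make the tree-like requirement in the first bullet concrete, here
is a minimal sketch of how a validator might check it.  It is not
taken from revovo; the function name and the input format (a dict
mapping each term to the list of its direct wider terms) are
assumptions made for illustration, and "tree-like" is read here as
"at most one parent per term, and no cycles".

```python
def check_tree_like(parents):
    """Return a list of problems found in a term hierarchy.

    parents is a hypothetical structure: a dict mapping each term to
    the list of its direct wider terms (empty for root terms).
    """
    problems = []

    # A tree-like hierarchy gives every term at most one parent.
    for term, wider in parents.items():
        if len(wider) > 1:
            problems.append(f"{term}: more than one parent")

    # Walking up from any term must end at a root, not loop back.
    # For terms with several parents, only the first one is followed.
    for term in parents:
        seen = set()
        current = term
        while current is not None:
            if current in seen:
                problems.append(f"{term}: cycle")
                break
            seen.add(current)
            wider = parents.get(current, [])
            current = wider[0] if wider else None

    return problems
```

An empty result means the hierarchy passed both checks; anything else
lists the offending terms.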

The WD already comes with a bit of software (cf. also the volute
repo, where there's also some testing code) in case you want to play
around with the existing vocabularies.  Note that of the "theory" SKOS
vocabularies, only Algorithms has been moved to the new scheme;
AstronomicalObjects, DataObjectTypes, PhysicalProcesses, and
PhysicalQuantities I've left alone pending further consultation with
the Theory IG, so don't point revovo at those.

I would particularly appreciate any feedback on the interoperability
of our vocabularies with general RDF tools.

All this might seem fairly finished to you, but in fact, I went to
all the trouble of implementing as much as I did because in this
particular case, I am really uncertain whether I'm sitting in about
the right spot between the poles of simplicity, future-proofness,
preservation of existing practice, and adoption of external
standards.

For instance, if we just dropped the "requirement" (I'm using quotes
because the document doesn't have a use case for that) that we use
RDF and used some ad-hoc XML or JSON, the whole thing would become a
lot less complex (we're not solving *really* difficult problems
here).  Of course, nobody else would understand our stuff, and
because of UAT we won't escape SKOS, but perhaps that's a price worth
paying for simplicity?

Even if we stick with RDF, we could still simplify the whole thing
by just having one, VO-specific, "wider" property and forgetting
about SKOS, classes and properties.  This won't hurt us (or so I
guess) until we try to do linked data or something like that, for
which we have no use case yet.

Or perhaps we should go even further in the RDF direction -- the
distinction between Class and Property vocabularies was really only
made to ensure that if we ever go full ontology and add, say, an
"isInFrame" property to the class vocabulary with the reference
frames, that thing doesn't suddenly pop up in lists of frames.  But
if we more or less required RDF tooling, there would of course be
much better ways to do that, and we'd get rid of most of the
"flavour" ugliness.

Other constraints that sound complex are just due to the one
requirement that we'd like to let people use the vocabularies without
RDF tooling, i.e., with just an XML parser (it's what that software I
wrote does, too).  Now, RDF/XML comes from deep in the era of XML
euphoria and is a fairly convoluted way to represent what is, at its
core, a reasonably straightforward data structure.
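
As an illustration of what "with just an XML parser" can look like,
here is a minimal sketch using only Python's standard library.  It is
not the revovo code.  The sample document, the example.org term URIs,
and the function name are made up for illustration; the sketch
assumes a SKOS-flavoured vocabulary written with typed node elements
and absolute URIs, and it ignores everything else a real vocabulary
would carry.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOS = "http://www.w3.org/2004/02/skos/core#"

# A made-up two-term vocabulary; the URIs are placeholders.
SAMPLE = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:skos="http://www.w3.org/2004/02/skos/core#">
  <skos:Concept rdf:about="http://example.org/demo#galaxy">
    <skos:prefLabel>Galaxy</skos:prefLabel>
  </skos:Concept>
  <skos:Concept rdf:about="http://example.org/demo#spiral">
    <skos:prefLabel>Spiral galaxy</skos:prefLabel>
    <skos:broader rdf:resource="http://example.org/demo#galaxy"/>
  </skos:Concept>
</rdf:RDF>"""


def parse_terms(rdfxml):
    """Map each term URI to its label and list of broader-term URIs."""
    root = ET.fromstring(rdfxml)
    terms = {}
    # Typed node elements let us find the terms by element name alone.
    for node in root.findall(f"{{{SKOS}}}Concept"):
        uri = node.get(f"{{{RDF}}}about")
        label = node.findtext(f"{{{SKOS}}}prefLabel")
        broader = [b.get(f"{{{RDF}}}resource")
                   for b in node.findall(f"{{{SKOS}}}broader")]
        terms[uri] = {"label": label, "broader": broader}
    return terms


terms = parse_terms(SAMPLE)
```

This only works because the input is restricted: with untyped
rdf:Description nodes or relative URIs, a plain XML parser would need
noticeably more logic to recover the same structure.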

One could make a valid point that perhaps at least the detour through
RDF/XML can be dropped and we should tell people to go parse Turtle
(which isn't *that* hard and has the clear advantage that you see
right away what's going on).

So -- have a look and let me know your thoughts.  Of course, opinions
from consumers of these vocabularies (applications, validators,...)
are particularly welcome.

Next steps?  Well, I'll tell you about the thing in the semantics
session in Groningen, and unless you stone me on that occasion, I'd
then right away try a first VEP on the basis of a few datalink terms
that have been lying around for too long.  If you have other terms
you'd like to see in vocabularies, don't hesitate to write your own
VEPs.  The more parties try the stuff, the less likely it is that we
miss some major blunder.

Thanks,

         Markus