Versioned standardIDs for standard vocabularies? Also, including IVOA attributes in Avro schemas

Markus Demleitner msdemlei at ari.uni-heidelberg.de
Fri Mar 20 08:57:18 CET 2026


Dear Gregory,

On Thu, Mar 19, 2026 at 04:55:43PM +0000, Dubois-Felsmann, Gregory P. wrote:
> You acknowledged that UCDs don't seem to be RDF-izable, and
> presumably that applies to VOUnits as well, given the complex rules
> for composing unit strings from the defined atoms?  So for VOUnits
> as well, the only formal identifier to use would be the
> ivo://ivoa.net/std one?

Yes, for units it's even clearer.

> You wrote:
> > That way, it turns out *one* way to reference UCDs is
> > ivo://ivoa.net/std/UCD (which is the UCD standard rather than the UCD
> > list; I don't think the latter has a standards record yet, but I also
> > think the base spec is the better target).
>
> And later:
>
> > UCDs, at this point, aren't a proper IVOA vocabulary, and will never
> > exactly be, because they have a grammar that's (I think) impossible
> > to map into RDF.  So, for them I'd suggest "This is a ucd" would
> > always mean referencing ivo://ivoa.net/standards/ucd.
>
> Are both of those URIs equally valid ("std/UCD" vs.
> "standards/ucd")?  Is there a distinction between them we should be
> aware of?

Sorry.  Ahem.  It's of course always ivo://ivoa.net/std/ucd (use the
trick of prepending https://dc.g-vo.org/I; an ivoid isn't one if it
doesn't resolve).  The URI with "standards" was just me being half
asleep.

However, regrettably, ivoids are indeed case-insensitive, so
ivo://ivoa.net/std/ucd and ivo://ivoa.net/std/UCD must compare equal
in clients; for interoperability, my recommendation is to always
write them all-lowercase, and to only use all-lowercase fragment
identifiers (which aren't case-insensitive; let's never do anything
new that's case-insensitive again, shall we?).
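
To make the comparison rule concrete, here is a minimal Python sketch.
It assumes that lowercasing everything before the "#" is an adequate
normalisation and that the fragment has to be compared verbatim; the
function names and the fragment "#Some-Key" are made up for
illustration.

```python
def normalize_ivoid(ivoid: str) -> str:
    """Lowercase the part before '#'; keep the fragment untouched.

    Assumption: lowercasing is a good enough normalisation for the
    case-insensitive part of an ivoid.
    """
    base, sep, fragment = ivoid.partition("#")
    return base.lower() + sep + fragment


def same_ivoid(a: str, b: str) -> bool:
    """Compare two ivoids: case-insensitive up to the fragment."""
    return normalize_ivoid(a) == normalize_ivoid(b)


# The two spellings from this thread compare equal.
print(same_ivoid("ivo://ivoa.net/std/ucd", "ivo://ivoa.net/std/UCD"))

# Hypothetical fragment: differing case in the fragment still matters.
print(same_ivoid("ivo://ivoa.net/std/ucd#Some-Key",
                 "ivo://ivoa.net/std/ucd#some-key"))
```

Writing everything all-lowercase in the first place makes a helper like
this unnecessary, which is the point of the recommendation.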

> Apart from some Registry-specific restrictions that you helped
> explain about how the specific "#endpoint-version" fragment
> identifiers can be used, can you summarize why you think that a
> vocabulary identifier should not contain a version number?  Is this
> because there's an implicit (or perhaps explicit, that I haven't
> absorbed) policy that vocabulary terms will never be withdrawn?

Very briefly: vocabularies are collections of concepts, which are
sets of "things" (which includes relations and such, which are called
"properties" here).  Their purpose is that machines can (to some
extent) reason about these concepts: If server X says: "A is a
progenitor of B", then client Y should reliably be able to figure out
that its code for progenitors should be executed rather than, say,
the one for documentation.
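
As a rough sketch of what "executing the code for progenitors" could
look like on the client side: the concept URI is simply a key into a
dispatch table.  The handler functions and the example link are
invented for illustration; only the two concept URIs are real.

```python
PROGENITOR = "http://www.ivoa.net/rdf/datalink/core#progenitor"
DOCUMENTATION = "http://www.ivoa.net/rdf/datalink/core#documentation"


def handle_progenitor(link: str) -> str:
    # Hypothetical client action for progenitor links.
    return "progenitor: " + link


def handle_documentation(link: str) -> str:
    # Hypothetical client action for documentation links.
    return "documentation: " + link


HANDLERS = {
    PROGENITOR: handle_progenitor,
    DOCUMENTATION: handle_documentation,
}


def dispatch(concept_uri: str, link: str):
    """Run the handler registered for concept_uri, or return None."""
    handler = HANDLERS.get(concept_uri)
    if handler is None:
        return None
    return handler(link)


print(dispatch(PROGENITOR, "http://example.org/raw.fits"))
```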

Now, our vocabularies are designed to change.  For instance, we might
one day need to distinguish progenitors with science data from those
with calibration data.  That way, the vocabulary changes.  Note,
however, that the concept #progenitor does not change, there are just
two narrower terms.  This constancy of concepts is the central
property we try to guarantee.
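
One way to picture this, as a sketch only: the two narrower terms
below are hypothetical (they are not in the vocabulary), and the
BROADER table is a stand-in for whatever a client would learn from
the vocabulary itself.  A client that only knows #progenitor can
still cope with the new terms by walking up to the broader concept.

```python
# Hypothetical narrower terms; NOT part of the actual vocabulary.
BROADER = {
    "#progenitor-science": "#progenitor",
    "#progenitor-calibration": "#progenitor",
}

# Terms this (imaginary) client has code for.
KNOWN = {"#progenitor", "#documentation"}


def nearest_known(term: str):
    """Return term or its closest broader term the client knows, else None."""
    seen = set()
    while term is not None and term not in seen:
        if term in KNOWN:
            return term
        seen.add(term)  # guard against accidental cycles
        term = BROADER.get(term)
    return None


print(nearest_known("#progenitor-calibration"))
```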

If we *were* to "version" vocabularies in the sense that their
identifiers (i.e., URIs) change (perhaps:
"http://ivoa.net/rdf/datalink/core-1, -2, -3"), then the identifiers of
the concepts would change, too
("http://ivoa.net/rdf/datalink/core-[1,2,3]#progenitor").  Sure, we
could declare that all of these are the same thing in RDF, but for a
client to figure this out, we'd need a lot of network requests and
reasoning.
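
A small sketch of the problem, using the made-up versioned URIs from
above: a client written against "core-1" does plain string comparison
on concept URIs and therefore does not recognise the "core-2" term,
although nothing about the concept has changed.

```python
# What a client written against the hypothetical "core-1" would know.
KNOWN_VERSIONED = {"http://ivoa.net/rdf/datalink/core-1#progenitor"}

# What a client knows when the vocabulary URI carries no version.
KNOWN_UNVERSIONED = {"http://www.ivoa.net/rdf/datalink/core#progenitor"}


def understands(concept_uri: str, known: set) -> bool:
    """Clients typically just compare concept URIs as strings."""
    return concept_uri in known


# A server that has moved on to the hypothetical "core-2":
print(understands("http://ivoa.net/rdf/datalink/core-2#progenitor",
                  KNOWN_VERSIONED))

# With an unversioned vocabulary URI the term stays recognisable
# however often the vocabulary is updated:
print(understands("http://www.ivoa.net/rdf/datalink/core#progenitor",
                  KNOWN_UNVERSIONED))
```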

Against that, there really is not much of a use case for having
versions (basically: some sorts of debugging) in vocabularies; after
all, the concepts can't change anyway.  Given this, multiple URIs per
vocabulary would be much too high a price to pay.

If you *really* foresee the need to figure out the current version,
you can note down the URI a given concept URI redirects to *at a
specific time*.  For instance, at this point

http://www.ivoa.net/rdf/datalink/core#progenitor

redirects to

http://www.ivoa.net/rdf/datalink/core/2022-01-27/datalink.html#progenitor

But it is very central to RDF use that while the former URI carries a
semantic meaning that clients can understand, the latter does not.
You can note it down somewhere in case you suspect that we'll get it
wrong later and butcher the concept, but that's basically all it's
for.
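
If you want to automate that note-taking, something like the following
Python sketch would do.  The function name and the record layout are
made up; the fragment is re-attached by hand because it is never sent
to the server, and which document you end up at may depend on content
negotiation.

```python
import datetime
import urllib.parse
import urllib.request


def record_resolution(concept_uri, opener=urllib.request.urlopen):
    """Note where a concept URI redirects to right now.

    opener can be replaced for testing; by default, this makes a real
    HTTP request and follows redirects.
    """
    base, fragment = urllib.parse.urldefrag(concept_uri)
    with opener(base) as response:
        final_url = response.url  # URL after redirects
    if fragment:
        final_url = final_url + "#" + fragment
    return {
        "concept": concept_uri,   # this is what carries the meaning
        "resolved": final_url,    # this is only a debugging aid
        "retrieved": datetime.datetime.now(
            datetime.timezone.utc).isoformat(),
    }


# Usage (needs network access):
# print(record_resolution(
#     "http://www.ivoa.net/rdf/datalink/core#progenitor"))
```

Keep using the "concept" value wherever semantics matter; the
"resolved" value is just the snapshot.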

Thanks,

       Markus


