UAT in VOResource

Markus Demleitner msdemlei at ari.uni-heidelberg.de
Thu May 28 08:47:06 CEST 2020


Hi,

On Wed, May 27, 2020 at 02:01:53PM +0200, Baptiste Cecconi wrote:
> good to see this going forward. I've recently explored a bit the
> usage of the UAT while making up Datacite metadata for data
> collection DOIs at PADC. 
> The Datacite model implements a valueURI, a schemeURI and a
> value. Examples:
> 
> <subject valueURI="http://astrothesaurus.org/uat/1338" schemeURI="http://astrothesaurus.org">Radio astronomy</subject>
> <subject valueURI="http://astrothesaurus.org/uat/1426" schemeURI="http://astrothesaurus.org">Saturn</subject>
> 
> The good thing with this is that I can refer to another thesaurus
> or another URI for terms I can't find in the UAT (for instance, the
> name of the space mission). 

Yes -- full RDF clearly is a good idea, and that is why we offer our
vocabularies in standard RDF (XML and turtle).  However, if you want
to do anything interesting with this (starting with comparing terms;
URI equality is a hairy topic), using full URIs becomes really
complex rather fast, and then you really need specialised libraries,
and that I'd like to avoid for the basic use cases.  Most of the
clients that will hopefully take this up are entirely uninterested in
semantics as such, and convincing them to pull in a hefty dependency
just to deal with a few keywords hasn't worked, at least in the past.
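
To illustrate why comparing full URIs gets hairy, here is a minimal
Python sketch.  The three spellings and the term_of helper are
assumptions made up for this example; whether such spellings really
denote the same concept is exactly the question a specialised library
would have to answer.

```python
from urllib.parse import urlsplit

# Three spellings a human might read as the same concept
# (illustrative only; not a statement about what the UAT serves).
uris = [
    "http://astrothesaurus.org/uat/1338",
    "https://astrothesaurus.org/uat/1338",
    "http://astrothesaurus.org/uat/1338/",
]


def term_of(uri):
    """Reduce a URI to its last non-empty path segment.

    This assumes a particular URI layout; it is not a general rule.
    """
    path = urlsplit(uri).path
    return path.rstrip("/").rsplit("/", 1)[-1]


# Plain string comparison treats all three spellings as different.
all_naive_equal = all(uris[0] == other for other in uris[1:])

# Reduced to simple terms, they collapse to a single value.
terms = {term_of(uri) for uri in uris}
```

With plain terms, comparison is ordinary string equality and needs no
extra dependency.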

There are also a few other requirements in VocInVO2 (2.2.11, for
instance) that made me go for "don't bother with full URIs in normal
VO use cases".

[I'm happy to discuss this in more depth, but let's keep that
sub-thread to semantics]

> Since I'd like to keep it open to use external terms, then the use
> of terms outside UAT (or its VO flavour) should require the same
> kind of metadata as in the Datacite model (i.e., a valueURI, a
> schemeURI and a human-readable value). Example (excerpt from

The problem is that opening this up immediately kills several use
cases I care about: validation, for instance, where you'd get a
warning if you put in non-UAT terms.  Or query expansion:

> another datacite DOI record of mine):
> 
> <subject valueURI="https://nssdc.gsfc.nasa.gov/nmc/spacecraft/display.action?id=1997-061A" schemeURI="https://nssdc.gsfc.nasa.gov/nmc/">Cassini Orbiter</subject>

[leaving aside the question of whether I think it's a good idea to
put our instrument or facility metadata into subject]

One thing I'd like to do with this is an ADQL function (say)
expand_special(vocabulary_name, term), so people can say

...where subject in expand_special('uat', 'variable-star')

and then get everything that's more special than variable-star, too
(full disclosure: for SKOS vocabularies, there's dragons there, which
is the major reason why we're encouraging something more formal for
our own vocabularies, but that's beside the point here).
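
What such an expansion would compute can be sketched in Python.  The
NARROWER table and its terms are hypothetical stand-ins for a locally
available vocabulary, not real UAT content, and this is a sketch of
the idea rather than of any actual ADQL implementation.

```python
# Hypothetical toy vocabulary: term -> directly narrower terms.
# The terms are illustrative and not taken from the real UAT.
NARROWER = {
    "uat": {
        "variable-star": [
            "pulsating-variable-star",
            "eruptive-variable-star",
        ],
        "pulsating-variable-star": ["cepheid-variable-star"],
    }
}


def expand_special(vocabulary_name, term):
    """Return term together with all transitively narrower terms."""
    narrower = NARROWER[vocabulary_name]
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # guards against cycles in the hierarchy
        seen.add(current)
        stack.extend(narrower.get(current, []))
    return seen


expanded = expand_special("uat", "variable-star")
```

The whole computation runs on a local table, which only works if
every admissible term comes from a vocabulary available in that form.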

I couldn't possibly go from, say, uat:space-probe to
https://nssdc.gsfc.nasa.gov/nmc/1997-061A, even if they had some RDF
there, and even the other way (from
https://nssdc.gsfc.nasa.gov/nmc/1997-061A to uat:space-probe) would
at least require quite a bit of network activity, and avoiding that
is what requirement 2.2.11 in VocInVO2 is about.
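
The validation use case mentioned above shows the same contrast.  In
the following Python sketch, KNOWN_TERMS is a hypothetical local copy
of a vocabulary's term list (not the real UAT), and validate_subjects
is a name invented for this example.

```python
import warnings

# Hypothetical local copy of the vocabulary's terms; not the real UAT.
KNOWN_TERMS = {"variable-star", "space-probe", "radio-astronomy"}


def validate_subjects(subjects, known_terms=KNOWN_TERMS):
    """Return the subjects missing from the local term list.

    A warning is issued for each of them; no network access is needed.
    """
    unknown = [s for s in subjects if s not in known_terms]
    for s in unknown:
        warnings.warn(f"subject {s!r} is not a known vocabulary term")
    return unknown


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    unknown = validate_subjects(
        [
            "space-probe",
            "https://nssdc.gsfc.nasa.gov/nmc/1997-061A",
        ]
    )
```

An external URI simply shows up as unknown here; telling whether it
is a sensible subject would need information that is not available
locally.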

So... I understand why you'd like to open things up.  But I'd say the
price we'd have to pay in terms of what we can reasonably do with our
metadata is far too high.

         -- Markus


More information about the registry mailing list