UAT adoption

Markus Demleitner msdemlei at ari.uni-heidelberg.de
Wed Aug 24 13:37:25 CEST 2022


Dear Registry community,

Now that we have a proper vocabulary reflecting the UAT in the VO, we
can finally fix VOResource to actually explain what "use the UAT in
subject" means.

This is not super-urgent; we could thus go for VOResource 1.2 with
this.  However, I don't see a sizable number of other changes that
would warrant a new document version.  I would hence suggest that an
erratum should do it, too.

I've just written a proposal for one -- see below.  Feel free to
comment and suggest improvements (or to protest if you see major
flaws or insist on VOResource 1.2).  If I hear nothing, I'll move the
thing on to the TCG in two weeks.

✂✂✂✂✂✂✂✂✂✂✂✂✂✂✂✂✂✂✂✂✂✂✂✂✂✂✂✂✂✂✂

---++ Rationale

VOResource 1.1 says in the documentation of the subject element:

	Terms for Subject should be drawn from the Unified Astronomy Thesaurus
	(http://astrothesaurus.org).

This prescription is not sufficient in practice; for many reasons, we
cannot really use the UAT concept URIs (for instance,
http://astrothesaurus.org/uat/11 for "The relative amount of a given
chemical element with respect to other elements") in VOResource.  The
label (in the example, "Abundance ratios") is not necessarily stable and
suffers from case and potentially punctuation issues.

To have a solid foundation for UAT use in VOResource, a specific scheme
has recently been endorsed in the VO, "Adopting the UAT as an IVOA
vocabulary", https://ivoa.net/documents/uat-as-upstream/.  This is what
should now be used in VOResource, and hence the document should contain
a pointer to the UAT adoption note.  This erratum introduces these
pointers and updates an example to the modern practice.

---++ Erratum content

In the example at the beginning of section 2 of VOResource 1.1, replace:

    <subject>radio astronomy</subject> 
    <subject>data repositories</subject> 
    <subject>digital libraries </subject> 
    <subject>grid-based processing</subject> 

with:

    <subject>radio-astronomy</subject> 
    <subject>astronomy-software</subject> 
    <subject>astronomy-web-services</subject> 
    <subject>search-for-extraterrestrial-intelligence</subject> 

In section 2.2.3 "Language and Transliteration", replace "description,
title, subject", mentioned as examples of elements containing natural
language, with "description or title".

In section 3.1.3 "General Content Metadata", replace the Comment on
"Element subject" with:

	The content of subject SHOULD be a fragment identifier of the URI of a
	concept in the IVOA UAT (https://www.ivoa.net/rdf/uat/), that is, a
	string like "virtual-observatories".  For further details, see the
	IVOA endorsed note on Adopting the UAT for the VO,
	https://ivoa.net/documents/uat-as-upstream/.

In the XML schema delivered with VOResource 1.1, replace the content of
the second xs:documentation element within the xs:element definition of
subject (line 694) with the comment text given above for section 3.1.3.
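
As an illustration only (this is not part of the erratum text), a
validator could check whether the content of subject at least has the
shape of such a fragment identifier.  The following is a minimal sketch;
it assumes that identifiers consist of lowercase ASCII letters and
digits in groups joined by single hyphens, and the function name is made
up for this example.  It does not check that the term actually exists in
the IVOA UAT.

```python
import re

# Assumption: an IVOA UAT fragment identifier is made of lowercase ASCII
# letters and digits, grouped into words joined by single hyphens.
_UAT_FRAGMENT = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def looks_like_uat_fragment(subject: str) -> bool:
    """Return True if subject has the shape of a UAT fragment identifier.

    This is a purely syntactic check; whether the concept exists in the
    vocabulary has to be verified against the vocabulary itself.
    """
    return _UAT_FRAGMENT.fullmatch(subject) is not None


if __name__ == "__main__":
    for candidate in (
        "virtual-observatories",
        "radio astronomy",
        "http://astrothesaurus.org/uat/11",
    ):
        print(candidate, looks_like_uat_fragment(candidate))
```

Subjects written in the old free-text style, or given as upstream
concept URIs, fail this check, so it could serve as a cheap first
warning in a publishing registry.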

---++ Impact Assessment

At the moment subject simply is not machine-readable and hence its
content is treated as plain text.  TOPCAT, for instance, translates
subject constraints into 

	LOWER(res_subject) like '%keyword%'.

These will obviously keep working as before (except if a data provider
actually had introduced upstream UAT URIs; none has, so far).

The syntax chosen in the UAT note – words separated by hyphens – also
makes queries using ivo_hasword work as before.  During the review
phase of this erratum, the TAP service at http://dc.g-vo.org/tap carries
a table rr.subject_uat that reflects what rr.res_subject will look like
once the UAT migration is finished.  To illustrate the effects on
queries using ivo_hasword, try:

	select distinct uat_concept from rr.subject_uat
	where 1=ivo_hasword(uat_concept, 'radio')

there.
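
For readers without a TAP client at hand, the two matching styles
discussed above can be mimicked offline.  The sketch below is only an
approximation: like_contains imitates the LOWER(res_subject) LIKE
'%keyword%' pattern, and has_word is a rough stand-in for ivo_hasword
that assumes words are delimited by any non-alphanumeric character (the
actual tokenization is up to the service).  Both function names are made
up for this example.

```python
import re


def like_contains(subject: str, keyword: str) -> bool:
    """Imitate LOWER(res_subject) LIKE '%keyword%': substring match."""
    return keyword.lower() in subject.lower()


def has_word(subject: str, word: str) -> bool:
    """Rough stand-in for ivo_hasword: case-insensitive whole-word match.

    Assumption: anything that is not an ASCII letter or digit separates
    words, so a hyphen splits words just like a blank does.
    """
    words = re.split(r"[^0-9a-z]+", subject.lower())
    return word.lower() in words


if __name__ == "__main__":
    # A legacy free-text subject and its hyphenated UAT-style counterpart.
    for subject in ("Radio astronomy", "radio-astronomy"):
        print(subject,
              like_contains(subject, "radio"),
              has_word(subject, "radio"))
```

Under these assumptions, both the free-text form and the hyphenated
form match the word "radio" in either style of query, which is the
behaviour the impact assessment relies on.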

Hence, we do not expect noticeable negative impact.  On the other hand, a
migration to the scheme foreseen here will enable many useful
applications, starting from reliable keyword matching to semantics-based
query expansion to subject mapping for interdisciplinary metadata
repositories (cf.
https://blog.g-vo.org/semantics-cross-discipline-discovery-and-down-to-earth-code.html).

✂✂✂✂✂✂✂✂✂✂✂✂✂✂✂✂✂✂✂✂✂✂✂✂✂✂✂✂✂✂✂

          -- Markus

