VOResource 1.1: Remaining vocabularies

Markus Demleitner msdemlei at ari.uni-heidelberg.de
Thu Nov 3 11:30:06 CET 2016


Dear Registry community,

In the ongoing effort to work out the details of what's going into
VOResource 1.1, here's a call for opinions on the two remaining
vocabularies.  One is the roles in the lifetime of a resource (values
of date/@role):

http://docs.g-vo.org/vocab-test/date_role

This list essentially adopts DataCite's dateType enumeration but
keeps the 1.0 terms, mainly to declare the intended upgrade path --
they're deprecated with a recommendation to migrate them in this way:

VOResource 1.0           DataCite

representative           Collected
creation                 Created
update                   Updated


You could argue that mapping representative to Collected is a bit
daring.  I'm open to other proposals (including letting it stand as
is).  I'm not so wild about deviating from DataCite here, though, and
I'm pretty sure DataCite would be unhappy with a proposal for such a
catch-all term.
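
For illustration, the upgrade path in the table above could be written
down as a small lookup.  This is only a sketch in Python: the names
DEPRECATED_DATE_ROLES and migrate_date_role are hypothetical, and
passing unknown terms through unchanged is an assumption, not
something the vocabulary prescribes.

```python
# Mapping of deprecated VOResource 1.0 date roles to the proposed
# DataCite-style terms, copied from the table in this mail.
# The names used here are hypothetical, for illustration only.
DEPRECATED_DATE_ROLES = {
    "representative": "Collected",
    "creation": "Created",
    "update": "Updated",
}


def migrate_date_role(role):
    """Return the proposed replacement for a deprecated 1.0 role.

    Terms without an entry in the mapping are returned unchanged
    (an assumption made for this sketch).
    """
    return DEPRECATED_DATE_ROLES.get(role, role)
```

With this, migrate_date_role("creation") yields "Created", while a
term like "project start" comes back as is.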

Also note that in VOResource 1.0, this was intentionally open, so we
already have a fairly wide range of terms out.  Here's what

  select value_role, count(*) as ct 
  from rr.res_date 
  group by value_role;

gives in today's registry:

      value_role       |  ct   
-----------------------+-------
                       |  1063
 created               |    60
 project end           |     1
 representative        |     8
 creation              | 15540
 updated               |   254
 release               |    14
 update                |   269
 last-checked          |    10
 availability          |     1
 project start         |     1
 authority established |     1

(where empty corresponds to the dreaded representative).  Hm.
last-checked, at least, looks like something useful ("The date the
resource content was last reviewed for validity, topicality,
completeness, or similar").  Say a word and I'll put it in; to keep
with DataCite style, I'd propose the term "Validated" then.

"project start", "project end", and "authority established" have no
corresponding concepts in the proposed vocabulary.  I'd say that's
fine.


The last vocabulary that would need review is the one for
content/contentLevel, http://docs.g-vo.org/vocab-test/content_level

This used to be a strictly controlled vocabulary with a fairly
fine-grained model: 

  "General" | "Elementary Education" | "Middle School Education" |
  "Secondary Education" | "Community College" | "University" |
  "Research" | "Amateur" | "Informal Education"

(zero or more of these could be given). When the Edu IG wanted to
make use of this, it turned out that because of this wide choice,
existing annotation was so inhomogeneous as to be useless.

That is why we (in this case, it's Marco's and my thing) now
propose just three terms, Research, Amateur, and General.

One *could* argue we should include the "old" terms and explicitly
deprecate them.  However, since the terms contain blanks, that's
technically at least inconvenient, and it clutters the list without,
I'd say, helping anyone proportionally, machines included.

Also, as with all these vocabularies, using terms from the vocabulary
is a SHOULD (i.e., validators will emit warnings instead of errors if
people use non-vocabulary terms) rather than a MUST.  Hence, even
without the old, fine-grained terms, 1.0 records don't become invalid
(as is required for a point update to the spec).
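
To show what SHOULD rather than MUST could mean for a validator, here
is a minimal Python sketch.  The function name check_content_level
and the wording of the warning are assumptions; only the three terms
come from the proposal above.

```python
import warnings

# The three terms proposed in this mail for content/contentLevel.
CONTENT_LEVELS = {"Research", "Amateur", "General"}


def check_content_level(term):
    """Return True if term is in the vocabulary.

    A non-vocabulary term only triggers a warning and a False
    return value; no exception is raised, so old records using
    the fine-grained 1.0 terms still pass.
    """
    if term in CONTENT_LEVELS:
        return True
    warnings.warn(
        f"contentLevel {term!r} is not in the vocabulary",
        UserWarning,
    )
    return False
```

A 1.0 record with "University" would thus draw a warning but would
not be rejected.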


As usual, I'm grateful for any kind of opinions or comments (which
includes a moderate amount of "go ahead already" if appropriate).

        -- Markus

