VOResource 1.1: Remaining vocabularies
Markus Demleitner
msdemlei at ari.uni-heidelberg.de
Thu Nov 3 11:30:06 CET 2016
Dear Registry community,
In the ongoing effort to work out the details of what's going into
VOResource 1.1, here's a call for opinions on the two remaining
vocabularies. One is the roles in the lifetime of a resource (values
of date/@role):
http://docs.g-vo.org/vocab-test/date_role
This list essentially adopts DataCite's dateType enumeration but
keeps the 1.0 terms, mainly to declare the intended upgrade path --
they're deprecated with a recommendation to migrate them in this way:
  VOResource 1.0    DataCite
  representative    Collected
  creation          Created
  update            Updated
You could argue that mapping representative to Collected is a bit
daring. I'm open to other proposals (including letting it stand as
is). I'm not so wild about deviating from DataCite here, though, and
I'm pretty sure DataCite would be unhappy with a proposal for such a
catch-all term.
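
For illustration, the upgrade path listed above could be applied to existing records roughly as follows. This is only a sketch: the names DEPRECATED_DATE_ROLES and migrate_date_role are made up for this example, and it assumes that roles without a listed replacement (including the empty role) should simply be left alone.

```python
# Upgrade path from the table above: keys are the deprecated
# VOResource 1.0 terms, values the proposed DataCite-style terms.
# (Hypothetical names; not part of any existing tool.)
DEPRECATED_DATE_ROLES = {
    "representative": "Collected",
    "creation": "Created",
    "update": "Updated",
}


def migrate_date_role(role):
    """Return the proposed replacement for a deprecated 1.0 role.

    Roles not in the table (including the empty string) are returned
    unchanged; this sketch does not try to guess a mapping for them.
    """
    return DEPRECATED_DATE_ROLES.get(role, role)


if __name__ == "__main__":
    for role in ["creation", "update", "representative", "project start"]:
        print(f"{role!r} -> {migrate_date_role(role)!r}")
```

Whether the representative to Collected line should stay in such a table is exactly the open question here.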
Also note that in VOResource 1.0, this was intentionally open, so we
already have a fairly wide range of terms out. Here's what
  select value_role, count(*) as ct
  from rr.res_date
  group by value_role;
gives in today's registry:
 value_role            |     ct
-----------------------+-------
                       |   1063
 created               |     60
 project end           |      1
 representative        |      8
 creation              |  15540
 updated               |    254
 release               |     14
 update                |    269
 last-checked          |     10
 availability          |      1
 project start         |      1
 authority established |      1
(where empty corresponds to the dreaded representative). Hm.
last-checked, at least, looks like something useful ("The date the
resource content was last reviewed for validity, topicality,
completeness, or similar"). Say a word and I'll put it in; to keep
with DataCite style, I'd propose the term "Validated" then.
"project start", "project end", and "authority established" have no
corresponding concepts in the proposed vocabulary. I'd say that's
fine.
The last vocabulary that would need review is the one for
content/contentLevel, http://docs.g-vo.org/vocab-test/content_level
This used to be a strictly controlled vocabulary with a fairly
fine-grained model:
"General" | "Elementary Education" | "Middle School Education" |
"Secondary Education" | "Community College" | "University" |
"Research" | "Amateur" | "Informal Education"
(zero or more of these could be given). When the Edu IG wanted to
make use of this, it turned out that because of this wide choice,
existing annotation was so inhomogeneous as to be useless.
That is why we (in this case, that's Marco and me) now propose just
three terms: Research, Amateur, and General.
One *could* argue we should include the "old" terms and explicitly
deprecate them. However, since the terms contain blanks, that's
technically at least inconvenient, and it clutters the list without,
I'd say, helping anyone proportionally, machines included.
Also, as with all these vocabularies, using terms from the vocabulary
is a SHOULD (i.e., validators will emit warnings instead of errors if
people use non-vocabulary terms) rather than a MUST. Hence, even
without the old, fine-grained terms, 1.0 records don't become invalid
(as is required for a point update to the spec).
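
To make the SHOULD semantics concrete, here is a minimal sketch of how a validator might treat contentLevel values. It is not taken from any actual validator; check_content_levels is a hypothetical name, and the exact spelling and case of the three terms is assumed from the proposal above.

```python
import warnings

# The three proposed terms; exact spelling and case are an assumption
# based on the proposal in this mail.
CONTENT_LEVELS = {"Research", "Amateur", "General"}


def check_content_levels(levels, vocabulary=CONTENT_LEVELS):
    """Return the terms in levels that are not in the vocabulary.

    Each unknown term triggers a warning rather than an exception,
    so a record using old 1.0 terms is flagged but still accepted.
    """
    unknown = [level for level in levels if level not in vocabulary]
    for level in unknown:
        warnings.warn(
            f"contentLevel {level!r} is not in the vocabulary",
            UserWarning,
        )
    return unknown


if __name__ == "__main__":
    # "University" is one of the old 1.0 terms: warned about, not rejected.
    check_content_levels(["Research", "University"])
```

A validator built like this keeps existing 1.0 records valid while still pointing their maintainers at the new terms.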
As usual, I'm grateful for any kind of opinions or comments (which
includes a moderate amount of "go ahead already" if appropriate).
-- Markus