VOResource 1.1: altIdentifier form

Markus Demleitner msdemlei at ari.uni-heidelberg.de
Mon Nov 28 15:52:37 CET 2016


Dear colleagues,

In the ongoing series of open VOResource 1.1 points, this is about
the model of altIdentifier, for which we didn't reach a confident
consensus in Trieste.

In the current draft, altIdentifier is allowed as a child of
vr:Resource itself (where it could house DOIs or, e.g., for VizieR
services that already have them, bibcodes) and of vr:Creator (where
it could house an ORCID)[1].  The content model simply is xs:anyURI.

So, altIdentifier can contain several sorts of identifiers.  The
proposed understanding is that as people start using some identifier
scheme, they will have to agree on a URI form of their identifiers.

For the identifier types envisioned so far, these are already
defined; for instance, you'd have:

  <altIdentifier>doi:10.21938/puTViqDkMGcQZu8LSDZ5Sg</altIdentifier>
  <altIdentifier>bibcode:2010yCat.1317....0R</altIdentifier>
  <altIdentifier>orcid:0000-0000-0000-000X</altIdentifier>


I would argue that there's not terribly much that can go wrong there.
However, DataCite models their alternate identifiers differently;
they split off the "identifier literal" from the identifier scheme,
at least in general.

The three examples above would, for them, look something like this:

  <altIdentifier
    scheme="doi">10.21938/puTViqDkMGcQZu8LSDZ5Sg</altIdentifier>
  <altIdentifier scheme="bibcode">2010yCat.1317....0R</altIdentifier>
  <altIdentifier scheme="orcid">0000-0000-0000-000X</altIdentifier>

I have to say I'm not quite sure why they did it, and I think it's
really a bit dangerous.  For instance, I'm not convinced that people
will always leave the scheme prefix out of the literal.  And if you
then have

  <altIdentifier
    scheme="doi">doi:10.21938/puTViqDkMGcQZu8LSDZ5Sg</altIdentifier>

comparisons, perhaps in databases, become painful.  Also, when
matching against these, you'll have to remember to match both scheme
and literal (which makes your queries clumsier).  Finally, if you have
identifiers that already are URIs -- such as IVOIDs like
ivo://example.edu/id --, should this be

  <altIdentifier scheme="ivoid">example.edu/id</altIdentifier>

then?  And if it were

  <altIdentifier scheme="ivoid">ivo://example.edu/id</altIdentifier>

why bother with the extra attribute?
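
To make the comparison problem a bit more concrete, here is a minimal
sketch in Python.  The function names and the wrapping <resource>
element are made up for illustration; nothing here is taken from the
VOResource draft or from DataCite.  It only shows that the split form
needs a normalisation step before two records can be compared, while
the URI form can be compared as plain strings.

```python
import xml.etree.ElementTree as ET


def uri_form_ids(xml_text):
    """Collect identifiers when each altIdentifier holds a full URI."""
    root = ET.fromstring(xml_text)
    return {(el.text or "").strip() for el in root.iter("altIdentifier")}


def split_form_ids(xml_text):
    """Collect (scheme, literal) pairs for the split form.

    Hypothetical clean-up rule: if the literal repeats the scheme as a
    prefix ("doi:..." inside scheme="doi"), strip that prefix.
    """
    root = ET.fromstring(xml_text)
    pairs = set()
    for el in root.iter("altIdentifier"):
        scheme = el.get("scheme", "").lower()
        literal = (el.text or "").strip()
        prefix = scheme + ":"
        if scheme and literal.lower().startswith(prefix):
            literal = literal[len(prefix):]
        pairs.add((scheme, literal))
    return pairs


DOI = "10.21938/puTViqDkMGcQZu8LSDZ5Sg"
uri_record = f"<resource><altIdentifier>doi:{DOI}</altIdentifier></resource>"
clean_record = (
    f'<resource><altIdentifier scheme="doi">{DOI}</altIdentifier></resource>')
redundant_record = (
    f'<resource><altIdentifier scheme="doi">doi:{DOI}</altIdentifier></resource>')

# URI form: a plain string comparison is all that is needed.
print(f"doi:{DOI}" in uri_form_ids(uri_record))
# Split form: the two records only agree after the clean-up rule ran.
print(split_form_ids(clean_record) == split_form_ids(redundant_record))
```

Note that the clean-up rule above would not help with the IVOID case,
since scheme="ivoid" and the URI prefix "ivo:" do not even match; each
scheme would need its own convention.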

Of course, there's the old wisdom to avoid parsing flat strings
whenever possible.  For identifiers in particular, though, I find
I've lost some of my faith in it.  This is partly due to the
experience of IVOA identifiers version 1
(http://ivoa.net/Documents/IVOAIdentifiers/20070302/index.html),
which defined a "pre-parsed" XML form of IVOIDs.  To my knowledge,
this has never been used (except to confuse readers of that
document).
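
For what it's worth, the amount of flat-string parsing needed to get
the scheme back out of a URI-form identifier is small.  The following
is a rough Python sketch under the assumption that every identifier
carries a scheme before the first colon; split_identifier is a
made-up helper name, not part of any existing library.

```python
def split_identifier(uri):
    """Split a URI-form identifier into (scheme, rest) at the first colon."""
    scheme, sep, rest = uri.strip().partition(":")
    if not sep or not scheme or not rest:
        raise ValueError(f"not a scheme-qualified identifier: {uri!r}")
    # URI schemes are case-insensitive, so only the scheme is lowercased;
    # the rest is left untouched.
    return scheme.lower(), rest


for uri in (
    "doi:10.21938/puTViqDkMGcQZu8LSDZ5Sg",
    "bibcode:2010yCat.1317....0R",
    "orcid:0000-0000-0000-000X",
    "ivo://example.edu/id",
):
    print(split_identifier(uri))
```

Anything beyond that (say, case rules for the part after the colon)
would be up to the individual identifier scheme.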


Well, after this bit of non-subtle, manipulative suggestion: Does
anyone want to speak up for the DataCite model with an explicit
scheme (or whatever) attribute?  Have I perhaps overlooked an
important reason why URI forms will give us headaches?

Cheers,

         Markus



[1] Incidentally: Does anyone want altIdentifier on other elements?

