VOResource 1.1: altIdentifier form

Accomazzi, Alberto aaccomazzi at cfa.harvard.edu
Mon Dec 5 23:07:28 CET 2016


Hi Markus,

Sorry for not replying earlier, but I wasn't sure what to recommend myself.
Our experience with parsing such metadata shows that you end up
implementing a number of heuristics because people are sloppy.  So while I
might prefer to have the format

<altIdentifier scheme="myscheme">foobar</altIdentifier>

I would probably still end up parsing @scheme to handle things such as
case-insensitive matching against a controlled list or matching against a
schemeURI.  As for the identifier itself ("foobar"), I'd look for the
presence of a scheme prefix there, possibly including a URI prefix
(particularly for things such as ORCIDs, where users are now encouraged to
use a URI rather than just the bare string).
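
As a rough sketch of the heuristics described above (not an actual ADS or
VOResource implementation), the following assumes a small controlled list of
schemes and a table of URL prefixes.  The names KNOWN_SCHEMES, URL_PREFIXES,
and normalize are hypothetical and exist only for illustration.

```python
# Hypothetical controlled list and URL prefixes; neither is taken from
# VOResource or DataCite, they only illustrate the heuristics.
KNOWN_SCHEMES = {"doi", "bibcode", "orcid"}
URL_PREFIXES = {
    "doi": ("https://doi.org/", "http://doi.org/",
            "https://dx.doi.org/", "http://dx.doi.org/"),
    "orcid": ("https://orcid.org/", "http://orcid.org/"),
}


def normalize(scheme, value):
    """Return (scheme, literal) with the scheme lower-cased and any
    redundant URL or scheme prefix stripped from the literal."""
    # Case-insensitive match of @scheme against the controlled list.
    scheme = scheme.strip().lower()
    if scheme not in KNOWN_SCHEMES:
        raise ValueError(f"unknown scheme: {scheme!r}")

    literal = value.strip()
    # URL form, e.g. https://orcid.org/0000-0000-0000-000X
    for prefix in URL_PREFIXES.get(scheme, ()):
        if literal.lower().startswith(prefix):
            literal = literal[len(prefix):]
            break
    # Redundant scheme prefix, e.g. scheme="doi" with content "doi:10...."
    if literal.lower().startswith(scheme + ":"):
        literal = literal[len(scheme) + 1:]
    return scheme, literal
```

With this, normalize("DOI", "doi:10.21938/puTViqDkMGcQZu8LSDZ5Sg") and
normalize("doi", "10.21938/puTViqDkMGcQZu8LSDZ5Sg") yield the same pair,
which is the kind of cleanup a consumer ends up doing whichever form the
schema prescribes.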

So my advice is that we should just pick what is simplest at the moment and
not worry too much about strict compliance with the way DataCite has gone
on this particular issue.  Crosswalking these records to DataCite means
that a transformation is necessary anyway, and so long as this is easily
doable I don't see a big problem.
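
To illustrate how small such a transformation could be, here is a minimal
sketch that turns the URI form from Markus's examples into the split form.
It assumes the scheme is everything before the first colon.  The function
names are hypothetical, and the output element mirrors the split form used
in this thread, not the actual DataCite schema.

```python
import xml.etree.ElementTree as ET


def split_uri_form(alt_identifier):
    """Split 'scheme:literal' at the first colon (an assumption about how
    the URI forms are built) and lower-case the scheme."""
    scheme, sep, literal = alt_identifier.strip().partition(":")
    if not sep or not scheme or not literal:
        raise ValueError(f"not in scheme:literal form: {alt_identifier!r}")
    return scheme.lower(), literal


def to_split_xml(alt_identifier):
    """Serialize the URI form as an element with a scheme attribute."""
    scheme, literal = split_uri_form(alt_identifier)
    element = ET.Element("altIdentifier", scheme=scheme)
    element.text = literal
    return ET.tostring(element, encoding="unicode")


print(to_split_xml("doi:10.21938/puTViqDkMGcQZu8LSDZ5Sg"))
# <altIdentifier scheme="doi">10.21938/puTViqDkMGcQZu8LSDZ5Sg</altIdentifier>
```

For an IVOID such as ivo://example.edu/id, this naive split yields the
literal "//example.edu/id", so hierarchical URIs would need their own rule
in a real crosswalk.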

-- Alberto



On Mon, Nov 28, 2016 at 9:52 AM, Markus Demleitner
<msdemlei at ari.uni-heidelberg.de> wrote:

> Dear colleagues,
>
> In the ongoing series of open VOResource 1.1 points, this is about
> the model of altIdentifier, for which we didn't reach a confident
> consensus in Trieste.
>
> In the current draft, altIdentifier is allowed as a child of
> vr:Resource itself (where it could house DOIs or, e.g., for VizieR
> services that already have them, bibcodes) and of vr:Creator (where
> it could house an ORCID)[1].  The content model simply is xs:anyURI.
>
> So, altIdentifier can contain several sorts of identifiers.  The
> proposed understanding is that as people start using some identifier
> scheme, they will have to agree on a URI form of their identifiers.
>
> For the identifier types envisioned so far, these are already
> defined; for instance, you'd have:
>
>   <altIdentifier>doi:10.21938/puTViqDkMGcQZu8LSDZ5Sg</altIdentifier>
>   <altIdentifier>bibcode:2010yCat.1317....0R</altIdentifier>
>   <altIdentifier>orcid:0000-0000-0000-000X</altIdentifier>
>
>
> I would argue that there's not terribly much that can go wrong there.
> However, DataCite models their alternate identifiers differently;
> they split off the "identifier literal" from the identifier scheme,
> at least in general.
>
> The three examples above would, for them, look something like this:
>
>   <altIdentifier
>     scheme="doi">10.21938/puTViqDkMGcQZu8LSDZ5Sg</altIdentifier>
>   <altIdentifier scheme="bibcode">2010yCat.1317....0R</altIdentifier>
>   <altIdentifier scheme="orcid">0000-0000-0000-000X</altIdentifier>
>
> I have to say I'm not quite sure why they did it, and I think it's
> really a bit dangerous.  For instance, I'm not convinced that people
> will always leave the scheme prefix out of the literal.  And if you
> then have
>
>   <altIdentifier
>     scheme="doi">doi:10.21938/puTViqDkMGcQZu8LSDZ5Sg</altIdentifier>
>
> comparisons, perhaps in databases, become painful.  Then when
> matching against these, you'll have to remember to match both scheme
> and literal (also making your queries clumsier).  Finally, if you have
> actual URIs (as in IVOIDs, e.g., ivo://example.edu/id), should this be
>
>   <altIdentifier scheme="ivoid">example.edu/id</altIdentifier>
>
> then?  And if it were
>
>   <altIdentifier scheme="ivoid">ivo://example.edu/id</altIdentifier>
>
> why bother with the extra attribute?
>
> Of course, there's the old wisdom to avoid parsing flat strings
> whenever possible.  In particular for identifiers, I'm finding I've
> lost my belief here to some extent.  This is partly due to the
> experience of IVOA identifiers version 1
> (http://ivoa.net/Documents/IVOAIdentifiers/20070302/index.html),
> which defined a "pre-parsed" XML form of IVOIDs.  To my knowledge,
> this has never been used (except to confuse readers of that
> document).
>
>
> Well, after this bit of non-subtle, manipulative suggestion: Does
> anyone want to speak up for the DataCite model with an explicit
> scheme (or whatever) attribute?  Have I perhaps overlooked an
> important reason why URI forms will give us headaches?
>
> Cheers,
>
>          Markus
>
>
>
> [1] Incidentally: Does anyone want altIdentifier on other elements?
>



-- 
Dr. Alberto Accomazzi
Principal Investigator
NASA Astrophysics Data System - http://ads.harvard.edu
Harvard-Smithsonian Center for Astrophysics - http://www.cfa.harvard.edu
60 Garden St, MS 83, Cambridge, MA 02138, USA