Identifiers 2.0 WD

Markus Demleitner msdemlei at ari.uni-heidelberg.de
Tue Jan 13 15:45:24 CET 2015


Hi all,

On Mon, Jan 12, 2015 at 12:24:24PM +0100, Marco Molinaro wrote:
> here follow a few comments/questions

Thanks for these.  Warning: this is going to be somewhat long.
Relevant volute commits: 2806, 2807

> I'd really like the EBNF grammar in one place,
> probably leaving the current excerpts as are,
> but putting the whole grammar in an appendix.

Whoa.  This has been a rabbit hole.  I've added the appendix with the
merged grammar, but while doing that I noticed that the existing BNF
didn't mention hashes at all, and it didn't allow ? in IVORNs
either, so technically quite a few of the identifiers that have been
around haven't been IVORNs at all, or at least they didn't match the
BNF.

Of course, you could say "who cares, so they're something else", but
given the ivo:// scheme they do have, such splitting of hairs would
IMHO suck.

So I started to change the legacy content, which I originally tried
to avoid.  But then I noticed that we really have two cases here:

(a) IVORNs without # or ? reference resource records
(b) IVORNs with # or ? reference something else.

I started to like things.  So, I added the following language:

  As in URLs, the implied semantics is that an IVORN with an
  octothorpe references a fragment within the resource record itself,
  whereas an IVORN with a question mark references an entity in some
  sense accessible through or related to the resource.

  Note that IVORNs with any part starting with a stop character
  stripped off MUST reference a resource record, i.e., resolve in
  IVOA registries to a VOResource document. IVORNs with stop
  characters can reference anything. One example for each of the two
  mechanisms is given in section 6.

I can understand if people start protesting here.  Think about it for
a while; I'll defer submitting Identifiers to the doc repo for a bit
longer after such fairly intrusive changes.
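
To make the two cases concrete, here is a minimal Python sketch of
how a client might split an identifier at its first stop character.
The function names and the example.org identifiers are made up for
illustration; nothing in the draft prescribes them.

```python
def split_ivorn(ivorn):
    """Split at the first stop character ("?" or "#").

    Returns (base, stop_char, local_part); stop_char and local_part
    are empty strings when the identifier has no stop character.
    """
    positions = [p for p in (ivorn.find("?"), ivorn.find("#")) if p >= 0]
    if not positions:
        return ivorn, "", ""
    cut = min(positions)
    return ivorn[:cut], ivorn[cut], ivorn[cut + 1:]


def references_resource_record(ivorn):
    """Case (a): no stop character, so it names a resource record."""
    return split_ivorn(ivorn)[1] == ""


# Hypothetical identifiers, for illustration only.
print(split_ivorn("ivo://example.org/res#frag"))
print(references_resource_record("ivo://example.org/res"))
```

Whatever is left in the first element of the tuple is what would have
to resolve to a resource record in a registry.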

The EBNF now looks like this:


<alphanum> ::= <ALPHA> | <DIGIT>
<reserved> ::= "?" | "#" | ";" | ":" | "@" | "!" | "&" | "$" | ","
<mark> ::= "-" | "_" | "."
<discouraged> ::= "~" | "*" | "'" | "(" | ")"
<unreserved> ::= <alphanum> | <mark> | <discouraged>
<authority-id> ::= <alphanum> <unreserved> <unreserved> { <unreserved> }
<resourcekey> ::= <segment> { "/" <segment> }
<segment> ::= { <unreserved> }
<ivo-scheme> ::= ( "i" | "I" ) ( "v" | "V" ) ( "o" | "O" )
<base-ivorn> ::= <ivo-scheme> "://" <authority-id> [ "/" <resourcekey> ]
<stop-char> ::= "?" | "#"
<uri-char> ::= <unreserved> | <reserved>
<local-part> ::= { <uri-char> }
<gen-ivorn> ::= <base-ivorn> <stop-char> <local-part>

-- so there's the new nonterminal gen-ivorn ("generic IVORN"), and
the "local part" (a.k.a. query part or fragment identifier) lets you
use both reserved and unreserved characters.
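
For anyone who wants to try identifiers against this, here is a rough
transcription of the grammar into Python regular expressions.  It is
only a sketch: it assumes <ALPHA> and <DIGIT> mean ASCII letters and
digits, the function names are invented for this mail, and the test
identifiers are illustrative.

```python
import re

# Character classes, transcribed from the EBNF.
# Assumption: <ALPHA> and <DIGIT> are the ASCII letters and digits.
ALPHANUM = r"A-Za-z0-9"
UNRESERVED = ALPHANUM + r"\-_.~*'()"   # alphanum, mark, discouraged
RESERVED = r"?#;:@!&$,"

SEGMENT = rf"[{UNRESERVED}]*"          # a segment may be empty
BASE_IVORN = (
    r"[iI][vV][oO]://"
    rf"[{ALPHANUM}][{UNRESERVED}]{{2,}}"    # authority-id: 3+ characters
    rf"(?:/{SEGMENT}(?:/{SEGMENT})*)?"      # optional resource key
)
# gen-ivorn: base, one stop character, then any uri-chars.
GEN_IVORN = BASE_IVORN + rf"[?#][{UNRESERVED}{RESERVED}]*"


def is_base_ivorn(s):
    """True if s matches <base-ivorn> in its entirety."""
    return re.fullmatch(BASE_IVORN, s) is not None


def is_gen_ivorn(s):
    """True if s matches <gen-ivorn> in its entirety."""
    return re.fullmatch(GEN_IVORN, s) is not None


# Illustrative identifiers only.
for candidate in ("ivo://example.org/res",
                  "ivo://example.org/res#frag",
                  "ivo://example.org/res?a/b"):
    print(candidate, is_base_ivorn(candidate), is_gen_ivorn(candidate))
```

Read literally, <uri-char> contains neither "/" nor "=", so the last
candidate matches neither production in this transcription.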

I've not really thought this through deeply, nor have I checked what
other text in Identifiers should be changed in consequence.  I
thought I'd first ask around here if there are people severely
disliking these terms and conditions.


> In Section 4, you nicely describe musts and
> shoulds on unique identifiers and the re-use vs. update
> of an identifier.
> Publisher discretion is however a bit risky, in my opinion.
> What if I choose to re-use an IVORN that was an SCS
> to change it into a TAP? Wouldn't this confuse applications?
> Or wouldn't this be a bit tricky in an incremental
> harvest for a RegTAP?

Well, if our registry system worked perfectly, it wouldn't be an
issue, as the SCS record and its capabilities would disappear
quickly when either the deleted record or the TAP record comes along.
SCS clients wouldn't even know there has been something (but of
course, they might have their caches).

Given that our infrastructure isn't perfect, there's going to be some
confusing time while some registries will have been updated and
others not.  But I don't see how this could be worse than if the
original SCS record just slowly disappears.

On the other hand, for schemes with persistent identifiers re-use
certainly is a disaster, and so...

> I think that some strong suggestion can be inserted
> in this specification to avoid too loose directions to
> data providers.

...I'll gladly take suggestions (and of course you're welcome to
directly change the document).  I've given this a bit of thought, and
I can't come up with useful, operationalisable rules, simply because
"resource identity" is such a difficult term.  This kind of identity
isn't even necessarily transitive (think: I start out with a simple
service and then extend it again and again until it's the perfect
all-protocol service.  No step was large enough to justify changing
the identifier, but the original and final services have practically
nothing in common).  It's a mess, but I'm grateful for any sorting
out anyone contributes.

> The draft generally moves to IVORN to abbreviate
> IVOA Identifiers.
> IVOA Identifiers are U-RI and we call them IVO-RN, but
> are not U-RN (nor U-RL, as specified correctly).
> Ok...I'm an hairsplitter...drop this, I just payed attention
> to this details while reading on...

I think I've said that before today: I'd have preferred IVORI, too
(it also has a pleasant Italian sound to my ears:-), but I guess it's
too late to change things now.

> I think that (Section 6)
> "It is non-normative in the sense that what these
> identifiers reference is governed by other standards,"
> is a bit confusing considering the section _is_ normative,
> maybe the sentence can be turned around, stating it
> is normative for syntax but not for identifier reference.

You're right.  I've changed the introduction to sect 6.

> Also, is Section 7 or should it be Section 6.2, the one
> on Standard Identifiers?

Right again, thanks.

> Should it be stated explicitly that the Standard Identifiers
> Authority is the ivo://ivoa.net? And the resource key starts
> with and "std"?

No, I don't think so.  We could mention that this is how you can tell
a standard is officially endorsed, but (so far) there's nothing that
keeps anyone from putting up their own standards records (I've done
that, in fact, for my TAP examples).

> PS - ...plus some typos

Thanks for these, too; should all be fixed (except for the NVO
requirements document link).

Thanks,

         Markus


