Identifiers 2.0 new internal WD

Norman Gray norman at astro.gla.ac.uk
Tue May 12 18:49:52 CEST 2015


Markus, hello.

> On 2015 Apr 29, at 14:20, Markus Demleitner <msdemlei at ari.uni-heidelberg.de> wrote:
> 
> http://docs.g-vo.org/Identifiers.pdf
> 
> Despite the massive changes, the specification content has not
> changed much.  To get a quick idea of what this is about, please have
> a look at section 1.3 ("Rationale for Version 2") and the entire
> section 4 (and I'd really appreciate if you could find that time for
> that).
> 
> Also, since that section 2 is almost all-new, I would be especially
> grateful if standards lawyers could have a closer look at that (I'm
> trying hard to not stare towards Glasgow too obtrusively now).

Hint taken.

I'm reading revision 2926.

----

p4, just before Sect. 1.1: 'this standard sets the parameters left open defined for application use'.  It looks as if this sentence has become a little garbled.  Also, it would be useful (for more legalistic readers such as me) to highlight which 'implementation-defined' options are being defined.  Do you mean simply 'we set the <scheme> to be "ivo:" '?  In that case, I think this sentence looks more heavyweight than intended, and might be simply omitted, or replaced by a reference to Sect. 2.

Sect. 2.2: 'Where IVOA standards require their decoding, they MUST prescribe UTF-8 encoding.' This might be better phrased the other way round: 'When specifications or applications require text to be percent-encoded within an IVOID, the text MUST be encoded in UTF-8.' (ie, phrase it in terms of encoding rather than decoding).
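To show why pinning the encoding matters, here is a minimal Python sketch using the standard library's urllib.parse. The sample strings and variable names are purely illustrative and are not taken from the draft.

```python
from urllib.parse import quote, unquote

# Hypothetical text that someone might want inside a local part.
text = "\u03b1 Cen"  # Greek alpha, space, "Cen"

# Percent-encode via UTF-8 (safe="" so that nothing is passed through unescaped).
utf8_form = quote(text, safe="", encoding="utf-8")

# The same kind of operation through a different character encoding
# produces different octets for non-ASCII text.
latin1_form = quote("\u00e9", safe="", encoding="latin-1")

# Decoding as UTF-8 recovers text that was encoded as UTF-8 ...
round_trip = unquote(utf8_form, encoding="utf-8")

# ... but not text encoded some other way: the stray octet becomes U+FFFD.
mismatch = unquote(latin1_form, encoding="utf-8")
```

If the encoder and decoder do not agree on UTF-8, the round trip silently corrupts the text, which is the reason for phrasing the requirement on the encoding side.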

RFC 3986 permits parts of /path to be percent-encoded.  This spec permits such encoding in the local part, and by implication forbids encoding in the /path.  It might be worth making this proscription explicit.  Ah: I see this is stated within Sects. 2.3.2 and 2.3.3, but it might be worth duplicating here, so that the 

If it's not _necessary_ to permit percent-encoding, it might be worth forbidding it -- this avoids worrying about edge cases such as decoding '?10%2521', which decodes to '?10%21', which _doesn't_ decode to '?10!' (one must percent-decode at most once).
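The '?10%2521' case can be reproduced with a few lines of Python. This is only a sketch using urllib.parse.unquote from the standard library; the variable names are illustrative.

```python
from urllib.parse import unquote

encoded = "?10%2521"

# One round of decoding turns %25 into a literal percent sign.
once = unquote(encoded)

# A second (erroneous) round then treats the resulting %21 as an escape.
twice = unquote(once)

print(once)   # the correct result of decoding exactly once
print(twice)  # what a careless double decode produces
```

Here once is '?10%21' and twice is '?10!', so any component that decodes more than once will silently change the identifier.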

Sect. 2.3.3: you quite rightly say that 'Naming authorities are discouraged from creating segments matching either “.” or “..”. Empty segments, resulting in two or more consecutive slashes or a trailing slash, are also discouraged.' Should this perhaps be 'SHOULD NOT'?  In fact, is there a real need for this to be other than 'MUST NOT'?

Sect. 2.3.3: 'VO applications MUST be case-insensitive when handling resource keys.' The word 'handling' is a little vague, it seems to me.  How about 'All processing of the IVORN <authority> and <path> MUST be case insensitive but case-preserving.'  Are applications required to recognise 'IVO:' as an IVOID scheme? (I think the answer is yes, by RFC 3986.)  Are they obliged to serialise it as 'ivo:' (ie, not be case-preserving)? (I think the answer is yes, by a principle of minimising surprises.)

In fact, since Sect. 2.3.4 says that applications mustn't change Query case, you could decide to apply this to the Fragment, too, and say at the top of Sect. 2 that 'the whole IVOID (apart from the scheme) is case-preserving, and the IVORN is case-insensitive'.  Then the complete rule is in one place.
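A rough Python sketch of what 'case-insensitive but case-preserving' could mean in practice is given below. It assumes the rule proposed above, namely that only the authority and path are compared without regard to case, the scheme is written out as 'ivo', and the rest is passed through untouched. The function names and the example identifiers are hypothetical; the splitting is done with the standard library's urllib.parse.

```python
from urllib.parse import urlsplit, urlunsplit


def same_ivorn(a, b):
    """Compare only <authority> and <path>, ignoring case.

    Query and fragment are deliberately left out of the comparison.
    """
    pa, pb = urlsplit(a), urlsplit(b)
    return (pa.netloc.lower(), pa.path.lower()) == (
        pb.netloc.lower(),
        pb.path.lower(),
    )


def serialise(ivoid):
    """Write the scheme as lowercase 'ivo'; preserve the case of the rest."""
    p = urlsplit(ivoid)
    return urlunsplit(("ivo", p.netloc, p.path, p.query, p.fragment))


# Hypothetical identifiers, for illustration only.
print(same_ivorn("ivo://org.example/tap", "IVO://Org.Example/TAP?x=Y"))
print(serialise("IVO://Org.Example/Tap?Foo#Bar"))
```

The comparison never rewrites its inputs, so whatever case the publisher chose survives a round trip through the application.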

The detail in these rules indicates that a validator, with a big set of test cases, would be a useful thing to have.  I imagine a small Java or Python program would suffice.
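Something along the following lines is what I have in mind, as a sketch only. The checks are limited to points raised in this message (the 'ivo' scheme, no '.' or '..' segments, no empty segments, no percent-encoding in the authority or path) and are not a transcription of the draft's grammar. The function name, the messages and the test identifiers are all made up for illustration.

```python
from urllib.parse import urlsplit


def check_ivoid(ivoid):
    """Return a list of complaints; an empty list means none of these
    (assumed) rules was triggered."""
    problems = []
    parts = urlsplit(ivoid)  # urlsplit lowercases the scheme for us
    if parts.scheme != "ivo":
        problems.append("scheme is not 'ivo'")
    if not parts.netloc:
        problems.append("missing authority")
    if "%" in parts.netloc or "%" in parts.path:
        problems.append("percent-encoding in authority or path")
    # Drop the empty string before the leading slash, keep the rest.
    segments = parts.path.split("/")[1:] if parts.path else []
    for seg in segments:
        if seg in (".", ".."):
            problems.append("dot segment in path")
        elif seg == "":
            problems.append("empty segment (doubled or trailing slash)")
    return problems


# (identifier, expected to pass?) pairs; all identifiers are invented.
CASES = [
    ("ivo://org.example/res", True),
    ("IVO://org.example/res", True),
    ("ivo://org.example/a//b", False),
    ("ivo://org.example/a/", False),
    ("ivo://org.example/a/../b", False),
    ("ivo://org.example/a%20b", False),
    ("http://org.example/res", False),
]

results = [(check_ivoid(ivoid) == []) == ok for ivoid, ok in CASES]
print("all cases behave as expected:", all(results))
```

A real validator would of course need the full grammar from Sect. 2 and a much larger table of cases, but the table-driven shape makes it cheap to add each new edge case as it is discovered.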

Sect. 3: 'that is, IVORNs should not be reused.' This rules out an IVORN which refers to 'today's weather', unless you decide that 'today's weather' is a single logical resource even though the referent -- the data it is referring to, as opposed to its description -- changes from day to day.  Is that intentional?  I think such an IVORN _should_ be permitted, by the way.

----

Typos:

p.6: 'Standard- sRegExt' is unfortunately hyphenated, and 'Standards\-RegExt' might be better (rewriting the sentence so that it's not hyphenated would probably be better still).  Same in 'Ob-sCore' in Sect. 4.1.

Trivial: Sect. 2.4: you need a sentence-ending full stop after 'etc.', or (better) a semicolon.  'thei' -> 'the'

Sect. 2.5: 'compontent' -> 'component'

Para starting 'holds, whereas a client' on p.16.  I think there's an extra blank line in the LaTeX, which causes the erroneous para indent.

Sect. 4.2: this section, but no other, uses small-caps for various words.  Are these consistently indicating XML elements?  If so, it might be better to write them as \texttt{<CAPABILITY>} or some such, or at least to explain the formatting.

That's all the glitches I can find -- it looks like a really good document!

Enjoy yourselves in the Dolomites, you lucky lot.

All the best,

Norman


-- 
Norman Gray  :  http://nxg.me.uk
SUPA School of Physics and Astronomy, University of Glasgow, UK


