resource identifiers

Ray Plante rplante at poplar.ncsa.uiuc.edu
Fri May 23 10:47:31 PDT 2003


Hi Guys,

Just so we are on the same page, the URI standard we are referencing is 
Berners-Lee et al. 1998, "Uniform Resource Identifiers (URI): Generic 
Syntax", IETF RFC 2396, http://www.ietf.org/rfc/rfc2396.txt

Please also note that the OAI specification simply says that identifiers 
must be URIs, full stop.  In practice, ?, =, and & do not need to be 
encoded over the wire as URLs (spaces are usually the problem).  The 
trouble Roy sees is when the URI is encoded as such within an XML 
document; the & must be encoded as &amp;.

(Roy - I don't see encoding & as &amp; as a problem.  The XSL stylesheet 
we use to convert our VO-XMLized identifier into a URI form can take care 
of this easily.)
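
To illustrate the encoding point, here is a minimal Python sketch.  The 
identifier value is hypothetical (in particular, the recordKey= argument 
name is only an assumption for illustration); it just shows that the 
escaping is mechanical and reversible.

```python
from xml.sax.saxutils import escape, unescape

# Hypothetical identifier in the query-argument style under discussion.
uri = "ivo://www.ncsa.uiuc.edu/nvo/registry?resourceKey=ADIL&recordKey=wibble"

# Inside an XML document the bare & has to be written as &amp;.
xml_text = escape(uri)

# Whatever reads the XML back (a parser, a stylesheet) recovers the URI.
restored = unescape(xml_text)
```

The ? and = characters pass through untouched; only the & changes.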

I like Tony's proposal.  I was beginning to dislike the resourceKey= bit 
because it suggested that the order of the two arguments was not 
important.  If this were true, it would be harder to match URIs.  

Combining the AuthorityID and the ResourceKey with a slash is closer to 
what I had in mind originally; however, we do lose one capability: we 
cannot convert it back to XML because, in general, we don't know where the 
authorityID part ends.  

Note that in rfc2396, section 3 where it describes the pattern, 

   <scheme>://<authority><path>?<query>

the authority is described as being a simple name without slashes (sect. 
3.2).  With this pattern, it is possible to parse the authority separate 
from the path.  
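
As a rough sketch of what that buys us, a generic URI parser can pull the 
pieces apart when the authority contains no slashes.  The identifier below 
is hypothetical, and I am assuming Python's standard urllib.parse here 
purely as an example of a generic parser.

```python
from urllib.parse import urlsplit

# Hypothetical identifier: the authority is a simple host-style name,
# the ResourceKey follows as the path, and a record key as the query.
parts = urlsplit("ivo://www.ncsa.uiuc.edu/ADIL/SIA/targeted?95.DR.01.01.fits")

authority = parts.netloc   # text between "//" and the next "/"
path = parts.path          # everything after the authority, up to "?"
query = parts.query        # everything after "?"
```

If the AuthorityID itself contained slashes, everything after the first 
slash would land in the path and the boundary would be lost.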

Nevertheless, if we want the AuthorityID portion to allow slashes, I think
the ? and & are sufficient for delimiting the components.  We would say
that the first argument after the ? (i.e. before the first &) is the
ResourceKey.  Any following arguments (after the first & and delimited by
&'s) are components of the RecordKey (the Query Tony refers to).
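
The rule above could be sketched as follows.  This is only an illustration 
under the assumption that ? separates the AuthorityID from the rest; the 
function name split_identifier is made up for this example.

```python
def split_identifier(uri):
    """Split <AuthorityID>?<ResourceKey>&<RecordKey>&... into its parts."""
    # Everything before the first "?" is the AuthorityID, slashes included.
    authority_id, sep, rest = uri.partition("?")
    if not sep:
        raise ValueError("no '?' delimiter found in %r" % uri)
    # The first argument is the ResourceKey; any following arguments
    # (delimited by "&") are components of the RecordKey.
    resource_key, *record_keys = rest.split("&")
    return authority_id, resource_key, record_keys


authority_id, resource_key, record_keys = split_identifier(
    "ivo://www.ncsa.uiuc.edu/nvo/registry?ADIL/SIA/targeted&95.DR.01.01.fits&wibble"
)
```

Because the split depends on position rather than argument names, the 
order of the arguments is significant, which keeps URI matching simple.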

  Question: How should we delimit the AuthorityID from the ResourceKey?
    A.  Conform to the above pattern: disallow slashes from the 
          AuthorityID
    B.  Use ? as the delimiter.  
    C.  Use some other delimiter (e.g. :)

I like breaking up the RecordKey into components.  My only concerns (which 
I consider minor) relate to how they are rendered in XML.  They are:
  *  Query is a bit of a loaded term; I would prefer RecordKey or 
       RecordKeys
  *  It is not possible (I believe) to allow an element (Query) to have 
       either simple type content (string) or complex type content 
       (<p1>...), apart from specifying the free-wheeling "any" type.  
  *  It is not possible to define elements like <p#> where # can be 
       unbounded.

Instead, I would recommend some variation on the following:

 <ResourceID>
   <AuthorityID>ivo://www.ncsa.uiuc.edu/nvo/registry</AuthorityID>
   <ResourceKey>ADIL/SIA/targeted</ResourceKey>
   <RecordKey>95.DR.01.01.fits</RecordKey>
   <RecordKey>wibble</RecordKey>
 </ResourceID>
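
For what it is worth, converting this form to a URI is straightforward.  
Here is a minimal Python sketch; it assumes option B above (? as the 
delimiter after the AuthorityID), and the function name 
resource_id_to_uri is hypothetical.  It is not the XSL stylesheet we 
actually use.

```python
import xml.etree.ElementTree as ET

RESOURCE_ID_XML = """
<ResourceID>
  <AuthorityID>ivo://www.ncsa.uiuc.edu/nvo/registry</AuthorityID>
  <ResourceKey>ADIL/SIA/targeted</ResourceKey>
  <RecordKey>95.DR.01.01.fits</RecordKey>
  <RecordKey>wibble</RecordKey>
</ResourceID>
""".strip()


def resource_id_to_uri(xml_text):
    """Build <AuthorityID>?<ResourceKey>&<RecordKey>... from the XML form."""
    root = ET.fromstring(xml_text)
    authority_id = root.findtext("AuthorityID").strip()
    resource_key = root.findtext("ResourceKey").strip()
    # Repeated RecordKey elements keep their document order.
    record_keys = [el.text.strip() for el in root.findall("RecordKey")]
    return authority_id + "?" + "&".join([resource_key] + record_keys)


uri = resource_id_to_uri(RESOURCE_ID_XML)
```

Repeating a single RecordKey element avoids both the mixed content 
problem and the need for numbered element names.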
  
cheers,
Ray



