towards an ID spec

Tony Linde ael at star.le.ac.uk
Wed Feb 19 13:29:23 PST 2003


Hi Ray,

(In the following I refer to everything within the registry as a service
whether a data source, software, etc. since everything must be
accessible via a grid/web service.)

First a question: why do we need to use a single string to identify
registered services? How about:

<serviceID>
  <authorityID>http://www.star.le.ac.uk/ledas/registry</authorityID>
  <serviceKey>xmm/catalogABC/datasetXYZ</serviceKey>
</serviceID>

Or

<serviceID>
  <authorityID>http://casu.cam.ac.uk/registry</authorityID>
  <serviceKey>AE1F7FB0-A5F8-11D5-A30A-002035229C64</serviceKey>
</serviceID>

I think we need to stick with some form of URL format for the registry
authority but the service key within that could be entirely up to the
registry to determine.

I'm quite averse to having any 'meaning' attached to the serviceID such
as the first part being the address of the registry (what if LEDAS
change to http://ledas.star.le.ac.uk - what happens to all its
registered services?) and the stuff after the '/' being some other type
of structure. If we want to store such information, we should create
other fields in the metadata for it. 
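
As a rough illustration of the two-part ID above, here is a minimal Python sketch that treats it as an opaque pair. The names ServiceID and parse_service_id are hypothetical and not part of any proposal; only the element names come from the XML above. The only operation on the ID is equality: nothing is parsed out of either field, and anything else you want to know lives in a separate metadata record.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceID:
    """Hypothetical opaque identifier: two strings, compared only for equality."""
    authority_id: str
    service_key: str


def parse_service_id(xml_text: str) -> ServiceID:
    """Read a <serviceID> element shaped like the examples above."""
    root = ET.fromstring(xml_text.strip())
    return ServiceID(
        authority_id=root.findtext("authorityID").strip(),
        service_key=root.findtext("serviceKey").strip(),
    )


example = """
<serviceID>
  <authorityID>http://casu.cam.ac.uk/registry</authorityID>
  <serviceKey>AE1F7FB0-A5F8-11D5-A30A-002035229C64</serviceKey>
</serviceID>
"""

sid = parse_service_id(example)

# The frozen dataclass is hashable, so the ID can key a table of metadata
# records. The record content here is a placeholder, not real metadata.
metadata = {sid: {"title": "placeholder metadata record"}}
```

If the registry later moves to a different address, the stored pair stays as it is; the new address would go into the metadata record, not into the ID.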

I don't see that 'ivoaid:' adds anything. If we use a serviceID within
the VO then it will have been registered. 

If you want to enclose or include one type of service within another we
should add some form of relationship metadata, so a service could
include:

<relationships>
  <relationship type="include">
    <serviceID>
      <authorityID>http://casu.cam.ac.uk/registry</authorityID>
      <serviceKey>AE1F7FB0-A5F8-11D5-A30A-002035229C64</serviceKey>
    </serviceID>
  </relationship>
  <relationship type="derivedFrom">
    <serviceID>
      <authorityID>http://www.star.le.ac.uk/ledas/registry</authorityID>
      <serviceKey>xmm/catalogABC/datasetXYZ</serviceKey>
    </serviceID>
  </relationship>
</relationships>

> Thus, you can always learn something about an ID.

But NOT from the ID itself - it will point to metadata - that is where
you learn about the service. An ID must be unique and never changing -
as soon as you ascribe meaning to any part of it then you hit problems
as soon as that meaning changes - and it will!

The same applies to the 'mirrors' discussion. If you want to say that
one service mirrors another, then just say so:

<relationships>
  <relationship type="mirrors">
    <serviceID>
      <authorityID>http://casu.cam.ac.uk/registry</authorityID>
      <serviceKey>AE1F7FB0-A5F8-11D5-A30A-002035229C64</serviceKey>
    </serviceID>
  </relationship>
</relationships>

There is no need to try to encode this in the ID.
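
To show how a client might read such relationship metadata instead of decoding the ID, here is a small Python sketch. The function name relationships_by_type is hypothetical; the element and attribute names are the ones used in the fragments above, and the record is made up from those same fragments.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def relationships_by_type(xml_text):
    """Map each relationship type to a list of (authorityID, serviceKey) pairs."""
    root = ET.fromstring(xml_text.strip())
    result = defaultdict(list)
    for rel in root.findall("relationship"):
        sid = rel.find("serviceID")
        pair = (
            sid.findtext("authorityID").strip(),
            sid.findtext("serviceKey").strip(),
        )
        result[rel.get("type")].append(pair)
    return dict(result)


# Illustrative record assembled from the fragments above.
record = """
<relationships>
  <relationship type="mirrors">
    <serviceID>
      <authorityID>http://casu.cam.ac.uk/registry</authorityID>
      <serviceKey>AE1F7FB0-A5F8-11D5-A30A-002035229C64</serviceKey>
    </serviceID>
  </relationship>
  <relationship type="derivedFrom">
    <serviceID>
      <authorityID>http://www.star.le.ac.uk/ledas/registry</authorityID>
      <serviceKey>xmm/catalogABC/datasetXYZ</serviceKey>
    </serviceID>
  </relationship>
</relationships>
"""

related = relationships_by_type(record)
mirrors = related.get("mirrors", [])
```

The mirror relationship is found by reading the type attribute; neither ID is inspected for shared substrings.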

Bottom line is that we should not encode any information in the ID.
This is something the XML community has learned in its standards on
namespaces. A namespace name may look like a URL but it need not
resolve to anything on a server - it carries no meaning of its own - it
only acts as a unique, unchanging identifier for that namespace.

Cheers,
Tony.

> -----Original Message-----
> From: Ray Plante [mailto:rplante at poplar.ncsa.uiuc.edu] 
> Sent: 19 February 2003 19:56
> To: registry at ivoa.net
> Subject: towards an ID spec
> 
> 
> Hi,
> 
> I'm working on a proposal for a specification of unique 
> identifiers for the IVOA.  (You can peek at my gory notes thus far at
> http://rai.ncsa.uiuc.edu/~rplante/VO/metadata/oidspec.txt.)  
> There are a few items in particular that I'm looking for 
> feedback on now.  These items will be a topic of discussion 
> at this week's NVO MWG telecon; however, broader feedback via 
> email would be very helpful.
> 
> The general gist of the spec is that any ID that conforms to 
> the IETF standard for URIs may be used as a global 
> identifier.  However, IDs that start with "ivoaid:" (or 
> whatever we want to call it) imply certain things--principally 
> that they have been registered in some form. Use of the "ivoaid" 
> scheme would impose additional requirements on the ID and 
> what you can do with it.  I also attempt to incorporate 
> Arnold's ideas for addressing mirrors and location-independent names.
> 
> The first issue I'm wondering about is which URI form we want 
> to go with for "ivoaid" IDs.  We should probably go with one 
> of the two common forms of URI referred to in the standard 
> (http://www.ietf.org/rfc/rfc2396.txt)...
> 
>   1) URN syntax:    
>      e.g. urn:ncsa.uiuc.edu:ADIL:95.DR.01
> 
>        * a colon (:) is the primary delimiter 
>        * commonly used in the digital library world
>        * (we're not restricted to using "urn" as the leading scheme)
> 
>   2) a net-based form of the generic syntax:  
>      e.g.  ivoaid://ncsa.uiuc.edu/ADIL/95.DR.01
> 
>        * a slash (/) is the primary delimiter
>        * commonly used in the Web/XML world
> 
> Which do people prefer?  I am partial to the second one 
> myself (based on what I'm proposing to do with it); however, I 
> don't think it matters that much either way.  I'd like to 
> hear other people's opinions, particularly in light of the 
> 2nd issue below.
> 
> The 2nd issue concerns using an ID to retrieve descriptions 
> of things.  In general, I don't think it's a good idea to 
> require that every "ivoaid:" ID have a registered, retrievable 
> description associated with it.  That is, you may want to 
> refer to an image in a collection with an "ivoaid:" ID (say, 
> in an SIA query result) but not bother to actually register 
> it explicitly.  This may be because:
> 
>    *  you've got too many images and it would be too much work
>    *  the image or its ID is not persistent
>    *  the collection contents are changing all the time.
> 
> Instead, we would simply require that at least one of its 
> enclosing collections be registered.  To make it possible to 
> learn about an ID, whether it is explicitly registered or 
> not, I propose that the authority that issues the ID support 
> a "Describe" service that works as follows:
> 
>    1. suppose I have an image ID of the form, 
>         ivoaid://ncsa.uiuc.edu/ADIL/95.DR.01.01
>    2. I give this ID to the service.  If that ID is registered
>         explicitly, its description is returned.
>    3. If that ID is not registered, the service looks for its
>         enclosing collection, ivoaid://ncsa.uiuc.edu/ADIL.
>    4. The hierarchy is ascended until a description is found.  At a 
>         minimum, the top level, ivoaid://ncsa.uiuc.edu, must be
>         registered. 
> 
> Thus, you can always learn something about an ID.
> 
> My questions on this issue are:
>   o  Should we require that all "ivoaid:" IDs be explicitly
>      registered, or can we get away with just requiring registration,
>      at the least, of one of the enclosing collections?
>   o  Is the "fall-back" Describe service a good idea?
>   o  If so, it requires that / (or :, in the URN syntax) imply
>      containment.  Is this a problem?
>   o  Does the "fall-back" Describe functionality affect which URI form
>      we choose?
> 
> Now a 3rd issue (if you're still with me) is regarding 
> mirroring and data relocation.  Arnold proposed a 
> three-component ID of the form "L:P:D", where L=resource 
> location, P=project/service, and D=dataset (see 
> http://www.ivoa.net/forum/registry/0060.htm).  "L:P:D" points 
> to a specific instance of a dataset at a specific location.  
> "P:D" can be used as a location-independent ID for the 
> dataset which is resolvable to a location by querying a 
> registry for "P".  
> 
> I would propose folding this idea in as follows.  
> Suppose SDSS hosts a collection with the ID, 
> "ivoaid://sdss.jhu.edu/SDSS/catalogs".
> And suppose STScI wants to mirror that collection.  It would 
> re-use the "SDSS/catalogs" part of the ID for its mirror 
> (that's the "P" part); it could register this as 
> "ivoaid://stsci.edu/mirrors/SDSS/catalogs".  Now suppose that 
> I want to access one of the items in this collection: 
> "SDSS/catalogs/extended" (that's the "P:D" part).  I would 
> resolve this to a list of locations using a "Match ID" 
> service of a registry. The registry would first look for all 
> IDs that end in
> "SDSS/catalogs/extended".   Since this is not registered, it won't
> find anything, so it ascends the ID and looks for IDs ending 
> in "SDSS/catalogs".  This would return both occurrences: 
> "ivoaid://sdss.jhu.edu/SDSS/catalogs" and 
> "ivoaid://stsci.edu/mirrors/SDSS/catalogs". 
> 
> Note: just because 2 IDs share some portion does not by 
> itself indicate that they are mirrors.  To determine this 
> definitively, one would have to look at the metadata for the 
> two collections.  We can imagine specific metadata for 
> describing this.
> 
> Questions:
>   o  Is this a good framework for handling mirrors/data relocation?
>   o  (Arnold:)  does this satisfy the requirements for
>      location-independent names (as needed by the journals)? 
> 
> I look forward to feedback.  We'll also talk about this at 
> this week's NVO MWG.
> 
> thanks,
> Ray
> 
> 
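
For reference, the "fall-back" Describe lookup in the quoted message (steps 1 to 4) could be sketched in Python roughly as follows. The function name describe, the REGISTERED table and its placeholder descriptions are all assumptions made for illustration; the IDs are the ones used in the quoted example. Note that the sketch only works because '/' is taken to imply containment, which is exactly the kind of meaning in the ID that the reply above argues against.

```python
from typing import Dict, Optional, Tuple

# Hypothetical registry contents: registered ID -> placeholder description.
REGISTERED: Dict[str, str] = {
    "ivoaid://ncsa.uiuc.edu": "placeholder description of the authority",
    "ivoaid://ncsa.uiuc.edu/ADIL": "placeholder description of the ADIL collection",
}


def describe(identifier: str, registered: Dict[str, str]) -> Optional[Tuple[str, str]]:
    """Return (matched_id, description) for the ID or its nearest registered ancestor."""
    scheme, sep, rest = identifier.partition("://")
    if not sep:
        return None  # not in the net-based form assumed by this sketch
    parts = rest.split("/")
    # Ascend the hierarchy one path segment at a time until something is registered.
    while parts:
        candidate = scheme + "://" + "/".join(parts)
        if candidate in registered:
            return candidate, registered[candidate]
        parts.pop()
    return None


found = describe("ivoaid://ncsa.uiuc.edu/ADIL/95.DR.01.01", REGISTERED)
```

Here the image ID itself is not registered, so the lookup falls back to the enclosing collection; an ID under an unregistered authority yields nothing at all.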


