towards an ID spec

Thu Feb 20 00:41:27 PST 2003

I'll start arguing with myself...

> (In the following I refer to everything within the registry 
> as a service whether a data source, software, etc. since 
> everything must be accessible via a grid/web service.)

But this does not mean every current service (e.g. cgi-based services)
must convert or implement wrapper web services. We could set up some
'routing' web services whose job is to convert a web service call to a
cgi-based (or whatever) one. Then the cgi-based service registers
itself, fills in the appropriate metadata, and gives the url of the
routing service as its service url, with its own cgi url as the first
parameter.
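
A minimal sketch of the translation step such a routing service might
perform, in Python. The function name build_cgi_request, the example.org
URL and the parameter names are made up for illustration; it also assumes
the cgi service takes its arguments as a GET query string, which need not
hold for every legacy service.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def build_cgi_request(cgi_url, params):
    """Turn the arguments of a web service call into a cgi GET url.

    cgi_url is the first parameter handed to the routing service (the
    legacy service's own cgi url); params holds the remaining arguments.
    Both names are hypothetical.
    """
    parts = urlsplit(cgi_url)
    # Keep any query parameters already present on the registered cgi url.
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.extend(params.items())
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(query), parts.fragment)
    )


# Hypothetical legacy service and call arguments.
url = build_cgi_request(
    "http://example.org/cgi-bin/search?fmt=votable",
    {"ra": "10.5", "dec": "-3"},
)
print(url)
```

The routing service would then fetch that url and hand the response back
to the web service caller; that part is left out here.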

> <serviceID>
>   <authorityID>http://www.star.le.ac.uk/ledas/registry</authorityID>
...

And this does not mean it has to be held as xml. I just used xml as it
is a good self-describing structure.

> I think we need to stick with some form of URL format for the 
> registry authority 

But only in the same sense as namespaces - ie, it is not the actual url
of the registry but is the serviceID (with null serviceKey) of that
registry. The url will be a separate piece of metadata.
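
To make that concrete, here is a rough Python sketch, not a proposed
format. The class name ServiceID, the metadata table and the access_url
helper are invented for illustration, and the url stored in the metadata
is a hypothetical value. The point is only that the ID is an opaque pair
and the url is looked up from metadata, never parsed out of the ID.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ServiceID:
    authority_id: str                   # opaque, namespace-like string
    service_key: Optional[str] = None   # None stands for the null serviceKey


# The authority string looks like a url but is never dereferenced.
LEDAS = "http://www.star.le.ac.uk/ledas/registry"
registry_id = ServiceID(LEDAS)
dataset_id = ServiceID(LEDAS, "xmm/catalogABC/datasetXYZ")

# Where a service actually lives is ordinary metadata, keyed by the ID.
# The url below is hypothetical; it can change without touching the ID.
metadata = {
    registry_id: {"url": "http://ledas.star.le.ac.uk/registry"},
}


def access_url(service_id):
    """Look up the current url of a registered service from its metadata."""
    return metadata[service_id]["url"]


print(access_url(registry_id))
```

If the registry moves, only the "url" entry is edited; every serviceID
that names the registry as its authority stays the same.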

Cheers,
Tony. 

> -----Original Message-----
> From: Tony Linde [mailto:ael at star.le.ac.uk] 
> Sent: 19 February 2003 21:29
> To: 'Ray Plante'; registry at ivoa.net
> Subject: RE: towards an ID spec
> 
> 
> Hi Ray,
> 
> (In the following I refer to everything within the registry 
> as a service whether a data source, software, etc. since 
> everything must be accessible via a grid/web service.)
> 
> First a question: why do we need to use a single string to 
> identify registered services? How about:
> 
> <serviceID>
>   <authorityID>http://www.star.le.ac.uk/ledas/registry</authorityID>
>   <serviceKey>xmm/catalogABC/datasetXYZ</serviceKey>
> </serviceID>
> 
> Or
> 
> <serviceID>
>   <authorityID>http://casu.cam.ac.uk/registry</authorityID>
>   <serviceKey>AE1F7FB0-A5F8-11D5-A30A-002035229C64</serviceKey>
> </serviceID>
> 
> I think we need to stick with some form of URL format for the 
> registry authority but the service key within that could be 
> entirely up to the registry to determine.
> 
> I'm quite averse to having any 'meaning' attached to the 
> serviceID such as the first part being the address of the 
> registry (what if LEDAS change to http://ledas.star.le.ac.uk 
> - what happens to all its registered services?) and the stuff 
> after the '/' being some other type of structure. If we want 
> to store such information, we should create other fields in 
> the metadata for it. 
> 
> I don't see that 'ivoaid:' adds anything. If we use a 
> serviceID within the VO then it will have been registered. 
> 
> If you want to enclose or include one type of service within 
> another we should add some form of relationship metadata, so 
> a service could
> include:
> 
> <relationships>
>   <relationship type="include">
>     <serviceID>
>       <authorityID>http://casu.cam.ac.uk/registry</authorityID>
>       <serviceKey>AE1F7FB0-A5F8-11D5-A30A-002035229C64</serviceKey>
>     </serviceID>
>   </relationship>
>   <relationship type="derivedFrom">
>     <serviceID>
>       <authorityID>http://www.star.le.ac.uk/ledas/registry</authorityID>
>       <serviceKey>xmm/catalogABC/datasetXYZ</serviceKey>
>     </serviceID>
>   </relationship>
> </relationships>
> 
> > Thus, you can always learn something about an ID.
> 
> But NOT from the ID itself - it will point to metadata - that 
> is where you learn about the service. An ID must be unique 
> and never changing - as soon as you ascribe meaning to any 
> part of it then you hit problems as soon as that meaning 
> changes - and it will!
> 
> The same applies to the 'mirrors' discussion. If you want to 
> say that one service mirrors another, then just say so:
> 
> <relationships>
>   <relationship type="mirrors">
>     <serviceID>
>       <authorityID>http://casu.cam.ac.uk/registry</authorityID>
>       <serviceKey>AE1F7FB0-A5F8-11D5-A30A-002035229C64</serviceKey>
>     </serviceID>
>   </relationship>
> </relationships>
> 
> There is no need to try to encode this in the ID.
> 
> Bottom line is that we should not have any type of encoding 
> information in the ID. This is something the XML community 
> has learned in its standards on namespaces. A namespace may 
> look like a URL but it does not resolve to a path within a 
> server - it is completely meaningless - it only acts as a 
> unique unchanging identifier for that namespace.
> 
> Cheers,
> Tony.
> 
> > -----Original Message-----
> > From: Ray Plante [mailto:rplante at poplar.ncsa.uiuc.edu]
> > Sent: 19 February 2003 19:56
> > To: registry at ivoa.net
> > Subject: towards an ID spec
> > 
> > 
> > Hi,
> > 
> > I'm working on a proposal for a specification of unique
> > identifiers for the IVOA.  (You can peek at my gory notes 
> > thus far at
> > http://rai.ncsa.uiuc.edu/~rplante/VO/metadata/oidspec.txt.)  
> > There are a few items in particular that I'm looking for 
> > feedback on now.  These items will be a topic of discussion 
> > at this week's NVO MWG telecon; however, broader feedback via 
> > email would be very helpful.
> > 
> > The general gist of the spec is that any ID that conforms to
> > the IETF standard for URIs may be used as a global 
> > identifier.  However, IDs that start with "ivoaid:" (or 
> > whatever we want to call it) imply certain things--principally 
> > that they have been registered in some form. Use of the "ivoaid" 
> > scheme would impose additional requirements on the ID and 
> > what you can do with it.  I also attempt to incorporate 
> > Arnold's ideas for addressing mirrors and 
> > location-independent names.
> > 
> > The first issue I'm wondering about is which URI form we want
> > to go with for "ivoaid" IDs.  We should probably go with one 
> > of the two common forms of URI referred to in the standard 
> > (http://www.ietf.org/rfc/rfc2396.txt)...
> > 
> >   1) URN syntax:    
> >      e.g. urn:ncsa.uiuc.edu:ADIL:95.DR.01
> > 
> >        * a colon (:) is the primary delimiter 
> >        * commonly used in the digital library world
> >        * (we're not restricted to using "urn" as the leading scheme)
> > 
> >   2) a net-based form of the generic syntax:  
> >      e.g.  ivoaid://ncsa.uiuc.edu/ADIL/95.DR.01
> > 
> >        * a slash (/) is the primary delimiter
> >        * commonly used in the Web/XML world
> > 
> > Which do people prefer?  I am partial to the second one
> > myself (based on what I'm proposing to do with it); however, I 
> > don't think it matters that much either way.  I'd like to 
> > hear other people's opinions, particularly in light of the 
> > 2nd issue below.
> > 
> > The 2nd issue concerns using an ID to retrieve descriptions
> > of things.  In general, I don't think it's a good idea to 
> > require that all "ivoaid:" IDs have a registered, retrievable 
> > description associated with them.  That is, you may want to 
> > refer to an image in a collection with an "ivoaid:" ID (say, 
> > in an SIA query result) but not bother to actually register 
> > it explicitly.  This may be because:
> > 
> >    *  you've got too many images and it would be too much work
> >    *  the image or its ID is not persistent
> >    *  the collection's contents are changing all the time.
> > 
> > Instead, we would simply require that at least one of its
> > enclosing collections be registered.  To make it possible to 
> > learn about an ID, whether it is explicitly registered or 
> > not, I propose that the authority that issues the ID support 
> > a "Describe" service that works as follows:
> > 
> >    1. suppose I have an image ID of the form, 
> >         ivoaid://ncsa.uiuc.edu/ADIL/95.DR.01.01
> >    2. I give this ID to the service.  If that ID is registered
> >         explicitly, its description is returned.
> >    3. If that ID is not registered, the service looks for its
> >         enclosing collection, ivoaid://ncsa.uiuc.edu/ADIL.
> >    4. The hierarchy is ascended until a description is found.  At a 
> >         minimum, the top level, ivoaid://ncsa.uiuc.edu, must be
> >         registered.
> > 
> > Thus, you can always learn something about an ID.
> > 
> > My questions on this issue are:
> >   o  Should we require that all "ivoaid:" IDs be explicitly
> >      registered, or can we get away with just requiring registration,
> >      at the least, of one of the enclosing collections?
> >   o  Is the "fall-back" Describe service a good idea?
> >   o  If so, it requires that / (or :, in the URN syntax) imply
> >      containment.  Is this a problem?
> >   o  Does the "fall-back" Describe functionality affect which URI form
> >      we choose?
> > 
> > Now a 3rd issue (if you're still with me) is regarding
> > mirroring and data relocation.  Arnold proposed a 
> > three-component ID of the form "L:P:D", where L=resource 
> > location, P=project/service, and D=dataset (see 
> > http://www.ivoa.net/forum/registry/0060.htm).  "L:P:D" points 
> > to a specific instance of a dataset at a specific location.  
> > "P:D" can be used as a location-independent ID for the 
> > dataset which is resolvable to a location by querying a 
> > registry for "P".  
> > 
> > I would propose folding this idea in as follows.
> > Suppose SDSS hosts a collection with the ID, 
> > "ivoaid://sdss.jhu.edu/SDSS/catalogs".
> > And suppose STScI wants to mirror that collection.  It would 
> > re-use the "SDSS/catalogs" part of the ID for its mirror 
> > (that's the "P" part); it could register this as 
> > "ivoaid://stsci.edu/mirrors/SDSS/catalogs".  Now suppose that 
> > I want to access one of the items in this collection: 
> > "SDSS/catalogs/extended" (that's the "P:D" part).  I would 
> > resolve this to a list of locations using a "Match ID" 
> > service of a registry. The registry would first look for all 
> > IDs that end in
> > "SDSS/catalogs/extended".   Since this is not registered, it won't
> > find anything, so it ascends the ID and looks for IDs ending 
> > in "SDSS/catalogs".  This would return both occurrences: 
> > "ivoaid://sdss.jhu.edu/SDSS/catalogs" and 
> > "ivoaid://stsci.edu/mirrors/SDSS/catalogs". 
> > 
> > Note: just because 2 IDs share some portion does not by
> > itself indicate that they are mirrors.  To determine this 
> > definitively, one would have to look at the metadata for the 
> > two collections.  We can imagine specific metadata for 
> > describing this.
> > 
> > Questions:
> >   o  Is this a good framework for handling mirrors/data relocation?
> >   o  (Arnold:)  does this satisfy the requirements for
> >      location-independent names (as needed by the journals)?
> > 
> > I look forward to feedback.  We'll also talk about this at
> > this week's NVO MWG.
> > 
> > thanks,
> > Ray
> > 
> > 
>