towards an ID spec

Thu Feb 20 04:27:07 PST 2003

Hi Tony,

Thanks for the comments!

> First a question: why do we need to use a single string to identify
> registered services? How about:
> 
> <serviceID>
>   <authorityID>http://www.star.le.ac.uk/ledas/registry</authorityID>
>   <serviceKey>xmm/catalogABC/datasetXYZ</serviceKey>
> </serviceID>

Good question.  I guess it's an issue of ease of compatibility with
other contexts.  As a single string, there are places where it could
be inserted where an XML fragment (i.e. a <serviceID> node) could not,
e.g. a VOTable cell, as a reference in an XML attribute, or in some
other standard format like an OAI record (which requires a URI).
These may not make a strong argument as there may be simple ways
around them (e.g. in the VOTable case, one could store it in two
columns pretty easily) and thus perhaps would not outweigh the
advantages of delimiting the two clearly as you've shown.
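
As a rough illustration of the trade-off (not part of either proposal),
the sketch below packs the two fields into one string and splits them
back out.  The "#" delimiter and the names ServiceID, to_string and
from_string are assumptions made only for this example.

```python
from typing import NamedTuple


class ServiceID(NamedTuple):
    """Two-part ID, mirroring the <authorityID>/<serviceKey> elements."""
    authority_id: str
    service_key: str


# Hypothetical delimiter, chosen only for this sketch; neither proposal fixes one.
DELIM = "#"


def to_string(sid: ServiceID) -> str:
    """Pack both parts into one string, e.g. for a VOTable cell or an XML attribute."""
    if DELIM in sid.authority_id:
        raise ValueError("authority ID must not contain the delimiter")
    return sid.authority_id + DELIM + sid.service_key


def from_string(text: str) -> ServiceID:
    """Split a packed string at the first delimiter."""
    authority_id, sep, service_key = text.partition(DELIM)
    if not sep:
        raise ValueError("no delimiter found in " + repr(text))
    return ServiceID(authority_id, service_key)


sid = ServiceID("http://www.star.le.ac.uk/ledas/registry",
                "xmm/catalogABC/datasetXYZ")
packed = to_string(sid)
```

The single string is convenient to embed, but every consumer then has
to agree on the delimiter, which is exactly the kind of implied
structure the two-element form avoids.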

> I'm quite averse to having any 'meaning' attached to the serviceID such
> as the first part being the address of the registry (what if LEDAS
> change to http://ledas.star.le.ac.uk - what happens to all its
> registered services?) and the stuff after the '/' being some other type
> of structure. If we want to store such information, we should create
> other fields in the metadata for it. 

I agree with this aversion, and I know my proposal walks a fine line.  
There were a number of details I left out, but perhaps I might share a few 
principles the proposal is based on: 

  *  In general, applications do not discern anything about a service
     based directly on the character content of its ID.  Applications
     must go to the metadata for this.  Like a URI in the XML world,
     an ID implies only a small number of things you can do with it
     (like use it as a reference).

  *  The only thing that one can definitively learn from an ID is
     where one can go to retrieve the metadata for the service.  (I
     think this is consistent with your <serviceID>.)

  *  The network name refers to the authority that issued the ID.
     Anyone can be their own authority.

  *  I didn't intend the authority part to be part of (or imply) a URL
     base for an actual registry service.  Rather, it was just an
     identifier (like in the RDF sense); to resolve it to an actual
     registry service, you would have to go to an IVOA-wide registry.
     However, if it is a URL base in some form, then you cut out a
     level of indirection.
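
The indirection described in the last principle might look something
like the sketch below.  It is only an illustration: the lookup table
standing in for the IVOA-wide registry, the example.org endpoint, the
"key" query parameter and the function name are all hypothetical.

```python
from urllib.parse import quote

# Stand-in for an IVOA-wide registry: maps an authority identifier (an
# opaque name, not necessarily a live URL) to the registry service that
# currently answers for it.  The endpoint below is hypothetical.
AUTHORITY_TO_REGISTRY = {
    "http://www.star.le.ac.uk/ledas/registry":
        "http://registry.example.org/ledas/describe",
}


def metadata_location(authority_id: str, service_key: str) -> str:
    """Resolve an ID to the place where its metadata can be retrieved."""
    # Step 1: the extra level of indirection via the IVOA-wide registry.
    endpoint = AUTHORITY_TO_REGISTRY[authority_id]
    # Step 2: ask that registry about the key (query syntax is assumed).
    return endpoint + "?key=" + quote(service_key, safe="")


url = metadata_location("http://www.star.le.ac.uk/ledas/registry",
                        "xmm/catalogABC/datasetXYZ")
```

If the authority later moves its registry service, only the table entry
changes; the IDs already handed out stay the same.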

> I don't see that 'ivoaid:' adds anything. If we use a serviceID within
> the VO then it will have been registered. 

So what does it mean to be "registered", in your mind?  Does this mean
that if you give the serviceKey to the authority it will return a
description?  Does this imply that anything that has a serviceKey must
have a retrievable description?  Or, alternatively, does the
authorityID simply serve as a namespace qualification, with nothing
implied about access to a description?

The use case I'm exploring here is where a data provider may want to
refer to an object with an ID but not bother formally registering a
description of it (because it's transient or too much work or
whatever).  This ID may be used as a handle for retrieving the object
itself (as opposed to its metadata).  This ID can still be useful as a
pointer to metadata if it can be used to get information about the
collection it is a part of.  If this functionality is not useful, then
much of the motivation for "ivoaid:" goes away.

One possible way to deal with IDs without descriptions in your
serviceID model might be (ignoring my choice of element name):

 <serviceID>
   <authorityID>http://www.star.le.ac.uk/ledas/registry</authorityID>
   <serviceKey>xmm/catalogABC/datasetXYZ</serviceKey>
   <recordKey>1244+29.0_v100s.fits</recordKey>
 </serviceID>

where the serviceKey points to a metadata description but recordKey does
not.  Nothing else is implied except that there is some logical
relationship between the recordKey and the serviceKey; only by going to
the metadata can you discern that relationship.  This is the extent of
what I was trying to accomplish.
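
A minimal sketch of how an application might read such a fragment,
assuming the element names shown above (which are themselves only
placeholders).  The function name split_service_id is hypothetical.

```python
import xml.etree.ElementTree as ET

FRAGMENT = """\
<serviceID>
  <authorityID>http://www.star.le.ac.uk/ledas/registry</authorityID>
  <serviceKey>xmm/catalogABC/datasetXYZ</serviceKey>
  <recordKey>1244+29.0_v100s.fits</recordKey>
</serviceID>
"""


def split_service_id(xml_text: str):
    """Return ((authorityID, serviceKey), recordKey) from a <serviceID> fragment.

    The first pair is the part that points to a metadata description;
    recordKey is None when the fragment has no <recordKey> element.
    """
    root = ET.fromstring(xml_text)
    described = (root.findtext("authorityID"), root.findtext("serviceKey"))
    record_key = root.findtext("recordKey")
    return described, record_key


described, record_key = split_service_id(FRAGMENT)
```

The code deliberately infers nothing from the recordKey text itself;
its relationship to the serviceKey would still have to come from the
metadata.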

> The same applies to the 'mirrors' discussion. If you want to say that
> one service mirrors another, then just say so:

Yes, my intention was to determine a mirroring relationship via the
metadata.  The use case I'm exploring here is one brought to us by
journals, who would like a location-independent identifier for data
cited in an article.  They would like to be able to resolve the ID to
a location or locations, even if that location changes.  The recipe I
proposed they use is:
   1.  Submit the collection ID to a registry to get matching global
       IDs.
   2.  Verify that you have the desired data via the metadata.

Your <serviceID> could be used in the same way, where <serviceKey> is
the location-independent part.
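
The two-step recipe could be sketched roughly as follows.  Everything
here is made up for illustration: the in-memory registry, its record
fields, the authority names and the example.org locations.

```python
# Hypothetical registry contents: two copies of one collection held by
# different authorities, plus an unrelated collection.
REGISTRY = [
    {"global_id": "authorityA:xmm/catalogABC",
     "collection_id": "xmm/catalogABC",
     "title": "Catalog ABC",
     "location": "http://a.example.org/abc"},
    {"global_id": "authorityB:xmm/catalogABC",
     "collection_id": "xmm/catalogABC",
     "title": "Catalog ABC",
     "location": "http://b.example.org/abc"},
    {"global_id": "authorityA:xmm/catalogDEF",
     "collection_id": "xmm/catalogDEF",
     "title": "Catalog DEF",
     "location": "http://a.example.org/def"},
]


def matching_global_ids(collection_id):
    """Step 1: submit the collection ID and get back matching global IDs."""
    return [rec["global_id"] for rec in REGISTRY
            if rec["collection_id"] == collection_id]


def verified_locations(collection_id, expected_title):
    """Step 2: keep only matches whose metadata describes the desired data."""
    return [rec["location"] for rec in REGISTRY
            if rec["collection_id"] == collection_id
            and rec["title"] == expected_title]


ids = matching_global_ids("xmm/catalogABC")
locations = verified_locations("xmm/catalogABC", "Catalog ABC")
```

A journal would store only the collection ID; the locations are looked
up at the time the article is read, so they can change without
breaking the citation.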

cheers,
Ray