IVO ids and Data Sets Identifiers

Alberto Accomazzi aaccomazzi at cfa.harvard.edu
Wed Sep 24 12:28:53 PDT 2003


Tony,

there is no reason why the dataset identifiers used by the journals 
*need* to be valid identifiers according to the IVO WD, but obviously it 
would be desirable to have as much consistency as possible between the 
two since we can assume that at some point in the near future IVO 
identifiers will become well-known at least to datacenter managers and 
developers, if not end-users.

The original plan concerning the verification and linking of these 
identifiers called for ADS to create and maintain the infrastructure 
enabling this to work.  We are still going to go ahead with that plan, 
but obviously we'd like things to be as VO-ready as possible so that one 
day we can tie these tools into the registry without any major effort.

So while there are a number of ways we can make this work by creating 
translation/lookup services even if there is no 100% syntactic agreement 
between what gets published in the journals and what becomes the IVO 
standard for identifiers, I think it would be a pity not to have an 
agreement since the time is right for this to happen.
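
To make the translation/lookup idea concrete, here is a minimal sketch in 
Python. It is not the actual ADS infrastructure: the class name, the method 
names and the sample identifiers are all made up for illustration, and a 
real service would need persistence and verification on top of this. It 
only shows that a published journal identifier can be mapped to one or 
more IVO identifiers even when the two syntaxes differ.

```python
class IdentifierLookup:
    """Hypothetical in-memory table from journal dataset ids to IVO ids."""

    def __init__(self):
        # journal dataset id -> list of IVO identifiers registered for it
        self._by_journal_id = {}

    def register(self, journal_id, ivo_id):
        """Record that a journal identifier refers to the given IVO identifier."""
        known = self._by_journal_id.setdefault(journal_id, [])
        if ivo_id not in known:
            known.append(ivo_id)

    def resolve(self, journal_id):
        """Return all IVO identifiers known for a journal identifier."""
        return list(self._by_journal_id.get(journal_id, []))


# Sample values are invented; only the ivo:// layout follows this thread.
lookup = IdentifierLookup()
lookup.register("JOURNAL-DS-0001", "ivo://Authority.ID/resource_key#dataset_1")
lookup.register("JOURNAL-DS-0001", "ivo://Mirror.ID/resource_key#dataset_1")
mirrors = lookup.resolve("JOURNAL-DS-0001")
```

Returning a list rather than a single value leaves room for a dataset that 
is held in more than one place.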

I can personally see valid points on both sides of the issue, but for 
the sake of consistency and to allow us to enforce a minimum of 
structure on the records in the registry, I favour the syntax which 
includes a resource key, because otherwise I fear an unnecessary 
proliferation of authority ids, with all the administrative burden that 
goes with it.  In my mind an authority ID is the equivalent of a domain 
name and should be used sparingly.  The combination of authority ID and 
resource name is the equivalent of a hostname, which can be manipulated 
at will by the authority ID in charge.
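
As a rough illustration of the syntax with a resource key, the following 
Python sketch splits an identifier of the form discussed in this thread, 
ivo://Authority.ID/resource_key#dataset_identification, into its three 
parts. The function name is hypothetical and the parsing rules are an 
assumption based only on the examples quoted below, not on the text of 
the IVO WD.

```python
def split_ivo_id(identifier):
    """Split an ivo:// identifier into (authority_id, resource_key, dataset_id).

    Assumed layout: ivo://<authority_id>/<resource_key>#<dataset_id>,
    where the resource key and the dataset part may be empty.
    """
    prefix = "ivo://"
    if not identifier.startswith(prefix):
        raise ValueError("not an ivo:// identifier: %r" % identifier)
    rest = identifier[len(prefix):]
    # Everything before the first '#' names the registered resource,
    # everything after it names the dataset within that resource.
    resource_part, _, dataset_id = rest.partition("#")
    # The authority ID plays the role of a domain name; the resource key
    # is whatever the authority chooses to define underneath it.
    authority_id, _, resource_key = resource_part.partition("/")
    if not authority_id:
        raise ValueError("missing authority ID in %r" % identifier)
    return authority_id, resource_key, dataset_id


parts = split_ivo_id("ivo://Authority.ID/resource_key#dataset_1")
```

Two identifiers with the same authority ID but different resource keys 
then group naturally under one authority, which is the point of the 
domain name analogy.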


-- Alberto


Tony Linde wrote:
> I was just thinking - why does the journal id have to be the resource
> identifier? The journals can come up with their own scheme and manage it to
> ensure uniqueness; we add a JournalID to the resource metadata, then anyone
> can search a registry on JournalID=value to find the resource.
> 
> It also means that if a data source is duplicated somewhere with a different
> resource id then it will still be found in the registry search - user (or
> software) can then decide which one to use.
> 
> Will that solve the issue?
> 
> Cheers,
> Tony. 
> 
> 
>>-----Original Message-----
>>From: Arnold Rots [mailto:arots at head-cfa.harvard.edu] 
>>Sent: 24 September 2003 16:04
>>To: Tony Linde
>>Cc: 'Francois Ochsenbein'; registry at ivoa.net
>>Subject: Re: IVO ids and Data Sets Identifiers
>>
>>
>>I'm all in favor of getting something done, but we're not 
>>experimenting here - at least not where it concerns the 
>>identifiers used for the literature links.
>>
>>In abstracto you are right.  But I agree with Francois that 
>>we ought to make the conversion easy for what I expect to be 
>>a common case - where an authority ID translates into a root URL.
>>
>>I may be dense, but what is the advantage of having multiple 
>>resources registered under the same authority - other than 
>>abstract elegance?
>>
>>If there is a need to define an authority as overarching 
>>several registered resources, there is in principle no reason 
>>to require that such linking be visible in the identifiers - 
>>especially not if it means a loss of convenience in common 
>>situations.  What counts in the identifier is the registered 
>>resource that it points to.
>>
>>  - Arnold
>>
>>Tony Linde wrote:
>>
>>>>My only problem is that the usage of the # will forbid the
>>>>usage of the same identifier as an IVO identifier AND as an 
>>>>UR[IL] directly usable to point to a dataset from an 
>>>
>>>Good. It should not be usable as a URL. Its sole purpose is as an 
>>>identifier of a resource within the registry and it has no meaning 
>>>other than that.
>>>
>>>
>>>>I feel it's a pity to introduce this incompatibility -- and
>>>>moreover I agree fully with Arnold, I cannot understand the 
>>>>necessity of a resource_key. What you need is just a unique 
>>>>identifier, which will be supplied by the Authority.ID . It's 
>>>>not the role of the registry to define this, IMHO.
>>>
>>>The registry won't be defining either. Users of a registry will be 
>>>able to register an authority id and then define the resource keys 
>>>under that authority. The registry will ensure that the resource key
>>>chosen is unique within that authority.
>>>
>>>This has been discussed for months now. The telecon yesterday agreed
>>>that the resource identifier WD should be submitted as a Proposed
>>>Recommendation to allow projects to try it out, especially during the
>>>Jan demo.
>>>
>>>If it turns out that people are only registering one resource under
>>>each authority then we can revisit the situation. Let's get something
>>>*done*.
>>>
>>>Cheers,
>>>Tony.
>>>
>>>
>>>>-----Original Message-----
>>>>From: owner-registry at eso.org [mailto:owner-registry at eso.org]
>>>>On Behalf Of Francois Ochsenbein
>>>>Sent: 24 September 2003 15:22
>>>>To: registry at ivoa.net
>>>>Subject: Re: IVO ids and Data Sets Identifiers 
>>>>
>>>>
>>>>
>>>>Tony,
>>>>
>>>>My only problem is that the usage of the # will forbid the
>>>>usage of the same identifier as an IVO identifier AND as an 
>>>>UR[IL] directly usable to point to a dataset from an 
>>>>electronic article. It therefore 
>>>>means that the 2 things have to be disjoined because it is 
>>>>not possible (in the HTTP sense) to make a distinction between
>>>>   //Authority.ID/(resource_key)#(dataset_1) and
>>>>   //Authority.ID/(resource_key)#(dataset_2)
>>>>
>>>>I feel it's a pity to introduce this incompatibility -- and
>>>>moreover I agree fully with Arnold, I cannot understand the 
>>>>necessity of a resource_key. What you need is just a unique 
>>>>identifier, which will be supplied by the Authority.ID . It's 
>>>>not the role of the registry to define this, IMHO.
>>>>
>>>>Cheers, francois
>>>>
>>>>
>>>>>This is in reply to both Francois and Arnold.
>>>>>
>>>>>An AuthorityID is not a service and so will not have an invocation URL
>>>>>associated with it. The AuthorityID is simply a way of grouping
>>>>>resources and allowing a registry to know that when it assigns a
>>>>>ResourceKey, it is globally unique (since no other registry can assign
>>>>>a ResourceKey to that AuthorityID).
>>>>>
>>>>>So a service *must* have a ResourceKey as part of its unique
>>>>>identifier, as well as the AuthorityID.
>>>>>
>>>>>On the issue of the dataset identifier, there is no problem with:
>>>>> ivo://Authority.ID/(resource_key)#(dataset_identification)
>>>>>since this is not being sent to an http server. It will be interpreted
>>>>>by some piece of software (which can interpret the ivo:// protocol)
>>>>>which will understand that everything before the '#' sign is a resource
>>>>>identifier and everything after it a dataset identifier. The software
>>>>>will use the resource identifier to look up the service metadata, get
>>>>>the service invocation method (web service or cgi or whatever) and then
>>>>>call that service using the dataset identifier (as a parameter to the
>>>>>web service or as the value part of a cgi '?name=value' pair).
>>>>>
>>>>>The ivo:// protocol is for the use of *VO software* only and not http
>>>>>servers. They may use the same URI convention but they are completely
>>>>>separate and different naming conventions.
>>>>>
>>>>>Cheers,
>>>>>Tony.
>>>>>
>>>>>
>>>>>>From: owner-registry at eso.org [mailto:owner-registry at eso.org] On
>>>>>>Behalf Of Francois Ochsenbein
>>>>>>Sent: 23 September 2003 22:34
>>>>>>To: registry at ivoa.net
>>>>>>Subject: IVO ids and Data Sets Identifiers
>>>>>>
>>>>>>
>>>>>>If I may intervene on this subject of the IVO identifiers, I
>>>>>>would strongly suggest not to use the hash symbol as a delimiter,
>>>>>>because it has a special meaning in the HTTP
>>>>>>protocol: it indicates a marker in the document, and what
>>>>>>comes after the # symbol is quite generally ignored by the
>>>>>>HTTP servers.  For instance the output of an HTTP GET to
>>>>>>  http://www.stsci.edu/ftp/science/hdfsouth/warnings.html#NICMOS
>>>>>>is exactly identical to the output of
>>>>>>  http://www.stsci.edu/ftp/science/hdfsouth/warnings.html
>>>>>>(the browser simply scrolls the document to position it at
>>>>>>the "NICMOS" marker)
>>>>>>
>>>>>>Therefore in the general scheme of the IVO identification
>>>>>>   ivo://Authority.ID/(resource_key)(dataset/subset_identification)
>>>>>>the # sign could maybe appear in the 3rd part
>>>>>>(dataset/subset_identification) but certainly not a separator
>>>>>>between the parts 2 and 3 (and moreover not between the parts
>>>>>>1 and 2) if it is wished that a resource can be reached using
>>>>>>the http conventions.
>>>>>>
>>>>>>In fact, for datasets I would more agree with Guenther's
>>>>>>arguments: one Authority.ID, and what follows it is defined
>>>>>>under the responsibility of the Authority.ID who "publishes"
>>>>>>the way to access the data. The dataset can effectively be
>>>>>>a simple document like
>>>>>>   ivo://Authority.ID/Resource-Service/Repository/Instrument/datasetID
>>>>>>but could also be something like
>>>>>>   ivo://Authority.ID/RepositoryQuery?instrument=Instrument&ID=datasetID
>>
>>>>>>As long as there is no ambiguity (the Authority.ID may "delegate"
>>>>>>the "naming space" to different missions or instruments or
>>>>>>whatever) the role of the identifiers is fulfilled. The problem
>>>>>>with the links from the journals is, as pointed out by Bob, the
>>>>>>requirement for long-term persistence which can only be achieved
>>>>>>if any change in the curator is
>>>>>>propagated to the Registry who has to make this mapping between the
>>>>>>'permanent' and the 'actual' identifiers (the GLU was not bad
>>>>>>in this role!)
>>>>
>>>>================================================================================
>>>>Francois Ochsenbein       ------       Observatoire Astronomique de Strasbourg
>>>>   11, rue de l'Universite F-67000 STRASBOURG       Phone: +33-(0)390 24 24 29
>>>>Email: francois at astro.u-strasbg.fr   (France)         Fax: +33-(0)390 24 24 32
>>>>================================================================================
>>>>
>>>
>>--------------------------------------------------------------------------
>>Arnold H. Rots                                Chandra X-ray Science Center
>>Smithsonian Astrophysical Observatory                tel:  +1 617 496 7701
>>60 Garden Street, MS 67                              fax:  +1 617 495 7356
>>Cambridge, MA 02138                             arots at head-cfa.harvard.edu
>>USA                                     http://hea-www.harvard.edu/~arots/
>>--------------------------------------------------------------------------
>>
> 
> 


-- 

****************************************************************************
Alberto Accomazzi
NASA Astrophysics Data System                     http://adswww.harvard.edu
Harvard-Smithsonian Center for Astrophysics      http://cfa-www.harvard.edu
60 Garden Street, MS 31, Cambridge, MA 02138 USA
****************************************************************************


