IVOA Identifiers Working Draft

Tony Linde ael at star.le.ac.uk
Fri Sep 12 10:42:24 PDT 2003


Hi Arnold,

Thanks for the example - it gives a good handle on the problem. I'll try to
address both it and the debate going on in the metadata list just now.

Referring to my previous email, Sa.CXO is a data collection and Sa.CXO/2900
is a dataset.

To resolve Sa.CXO/2900, ADS would look up Sa.CXO in a registry (assuming
Sa.CXO is a ShortName, perhaps?) to get the identifier of the data
collection and would then use that identifier to find sky services which
serve that data.

It would then choose a relevant service (perhaps by picking the one with the
relationship of 'PrimarySource' to the data collection) and query the
registry for that service's metadata.
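
As a rough illustration only, the lookup steps above could be sketched in
Python like this. The Registry class, its method names, the 'Mirror'
relationship and the identifier strings are all made up for the example;
nothing here is an agreed registry interface.

```python
from dataclasses import dataclass, field


@dataclass
class Service:
    identifier: str          # hypothetical service identifier
    serves: str              # identifier of the data collection it serves
    relationship: str        # e.g. "PrimarySource" or "Mirror" (assumed values)
    metadata: dict = field(default_factory=dict)


class Registry:
    """Toy in-memory stand-in for a registry (hypothetical interface)."""

    def __init__(self, short_names, services):
        self.short_names = short_names   # ShortName -> collection identifier
        self.services = services         # list of Service records

    def collection_id(self, short_name):
        return self.short_names[short_name]

    def services_for(self, collection_id):
        return [s for s in self.services if s.serves == collection_id]


def choose_service(registry, reference):
    """Split e.g. 'Sa.CXO/2900' and pick a service for the collection."""
    short_name, _, dataset_key = reference.partition("/")
    collection = registry.collection_id(short_name)
    candidates = registry.services_for(collection)
    if not candidates:
        raise LookupError("no service registered for " + collection)
    # Prefer the service marked as the primary source, if there is one.
    primary = [s for s in candidates if s.relationship == "PrimarySource"]
    return (primary or candidates)[0], dataset_key


registry = Registry(
    {"Sa.CXO": "example-authority/cxo"},   # made-up identifier
    [
        Service("example-authority/cxo-mirror", "example-authority/cxo", "Mirror"),
        Service("example-authority/cxo-archive", "example-authority/cxo", "PrimarySource"),
    ],
)
service, key = choose_service(registry, "Sa.CXO/2900")
print(service.identifier, key)
```

The dataset key comes back untouched, so what to do with it is left to
whatever the chosen service's metadata says.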

Now we have two options:

1. if access to a portion of a data collection is a common need, then we
can add an element to the standard SkyService resource metadata which
provides either a method call with parameter details or a URL to which the
dataset identifier is appended;

2. if this is not common, then ADS should access the WSDL for the service to
find out how to directly address the dataset.
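
For option 1, the 'URL to which the dataset identifier is appended' case
might look something like the Python sketch below. The element name
DatasetAccessURL is invented for the example and is not part of any schema;
the base URL is the one Arnold quoted for the Chandra archive.

```python
from urllib.parse import quote


def dataset_url(service_metadata, dataset_key):
    """Build a dataset URL from a service's metadata, if it offers one.

    'DatasetAccessURL' is a hypothetical metadata element holding a URL
    prefix. Returns None when the element is absent, in which case the
    client would have to fall back to option 2 (reading the WSDL).
    """
    base = service_metadata.get("DatasetAccessURL")
    if base is None:
        return None
    # Escape the key so that it cannot alter the rest of the URL.
    return base + quote(dataset_key, safe="")


cxo_metadata = {
    "DatasetAccessURL": "http://cda.harvard.edu:9011/chaser/ocatList.do?obsid="
}
print(dataset_url(cxo_metadata, "2900"))
```

If the archive moved, only the registered prefix would need to change and
Sa.CXO/2900 itself would stay as it is.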

Personally I favour 1 based on the heat in the metadata list :)

Going through the registry resolves any problems caused by moving the data
collection or any of the sky services which serve it.

One potential problem:

> So, the identifier represents the 
> result of a query (or, if you prefer, you can consider it a 
> complete query in its own right) and it is not necessary that 
> it points directly to the bits, but allows the user to find 
> (and retrieve if public) such a dataset.

If, instead of a simple dataset, the data referred to in the paper really is
a query, how would that be stored?

We could store the results - and this is a facility which we are addressing
in AstroGrid with the MySpace concept, though how that data gets published
such that it is perpetually available is tricky.

Or we could store the ADQL/SQL of the query. But that leaves a potential
problem: if the underlying data changes, the results of the query will be
different from those used by the researcher.

Any ideas anyone?

Cheers,
Tony. 

> -----Original Message-----
> From: Arnold Rots [mailto:arots at head-cfa.harvard.edu] 
> Sent: 10 September 2003 21:48
> To: Tony Linde
> Cc: Arnold Rots; Ray Plante; registry at ivoa.net
> Subject: Re: IVOA Identifiers Working Draft
> 
> 
> Tony,
> 
> Maybe the best thing is indeed an example to explain how it's 
> being used.  Simply, the requirement is that such an 
> identifier can be inserted in a paper and allow the readers 
> in perpetuity to find the observation dataset that was being 
> used for that paper.  So, the identifier represents the 
> result of a query (or, if you prefer, you can consider it a 
> complete query in its own right) and it is not necessary that 
> it points directly to the bits, but allows the user to find 
> (and retrieve if public) such a dataset.
> 
> If you used Chandra observations 2000 and 2900 in your paper, 
> you would include identifiers Sa.CXO/2000 and Sa.CXO/2900. The 
> client that uses these identifiers (the ADS) would then 1) 
> verify that these identifiers are valid and 2) harvest the 
> URLs where the (pointers to the) datasets can be found.
> 
> Currently, those would be:
> 
> http://cda.harvard.edu:9011/chaser/ocatList.do?obsid=2000
> http://cda.harvard.edu:9011/chaser/ocatList.do?obsid=2900
> 
> If the CXO archive would move to some other location, these 
> URLs would change but the identifier should remain valid.  
> I.e., Sa.CXO will be found at another physical resource (and 
> the registry had better be aware where it can be found), but 
> that new physical resource would be required to support all 
> resource keys that were previously defined by the previous 
> owner of the naming authority Sa.CXO.
> 
> All the metadata on observations 2000 and 2900 can be 
> retrieved from the Chandra observation catalog, but I see no 
> reason why all that information should be stored at the 
> top-level registries as well.  Or, alternatively, the 
> registry might know how to query for those metadata.
> 
> Hope this helps,
> 
>   - Arnold
> 
> Tony Linde wrote:
> > Hi Arnold,
> > 
> > > I come back to the compatibility with persistent identifiers for 
> > > literature linking and argue against making resource keys 
> > > mandatory.
> > 
> > I don't see how the two are incompatible. It is the *combination* of 
> > AuthorityID and ResourceKey which identifies a resource and there is 
> > nothing to stop this being persistent.
> > 
> > > The registry should only have the naming authority and be able to 
> > > translate that into a root URL, at which point any valid resource 
> > > key can be appended.
> > 
> > I certainly don't agree that a ResourceKey is constructed at the point 
> > of a query if that is what you are saying. How can you save the 
> > structure of a workflow if none of the resources referred to have 
> > persistent identifiers? It also means that no-one can save the 
> > identifiers for favourite resources in order to reuse them. Come to 
> > think of it, if you don't store metadata for resources, how do you 
> > answer any queries on the registry?
> > 
> > Maybe we just understand the term 'resource' to mean different things. 
> > What do you mean by it? Can you give some examples?
> > 
> > Cheers,
> > Tony.
> > 
> > 
> > On Wed, 10 Sep 2003 15:22:24 -0400 (EDT), "Arnold Rots" 
> > <arots at head-cfa.cfa.harvard.edu> said:
> > > I come back to the compatibility with persistent identifiers for 
> > > literature linking and argue against making resource keys mandatory. 
> > > The registry should only have the naming authority and be able to 
> > > translate that into a root URL, at which point any valid resource 
> > > key can be appended.  It would be foolish to insist that all 
> > > resource keys at this level of granularity be contained in the 
> > > registry.
> > > 
> > >   - Arnold
> > > 
> > > Ray Plante wrote:
> > > > Hi Tony,
> > > > 
> > > > On Wed, 10 Sep 2003, Tony Linde wrote:
> > > > > A few comments on this wrt the sample registry based on the new 
> > > > > schema (adil-v0.8.1.xml).
> > > > > 
> > > > > Neither resource in the sample (one Organisation and one 
> > > > > DataCollection) has a ResourceKey within their identifiers. I 
> > > > > think ResourceKey should be mandatory in all resources except 
> > > > > one which we should create for, say, Authority: this could hold 
> > > > > any info about the authority including a pointer to an 
> > > > > organisation.
> > > > 
> > > > I went back and forth on this one.  (What I really needed was a 
> > > > second opinion.)  I'll change this.
> > > > 
> > > > > The document also suggests that only people from a 'naming 
> > > > > authority' can add resources to a registry. In my mind, a 
> > > > > registry should have a default AuthorityID so that anyone could 
> > > > > add a resource to it whether they are from a recognised naming 
> > > > > authority or not.
> > > > > 
> > > > > A registry could be set up to refuse registrations from 
> > > > > non-authority personnel but this should not be the default, I 
> > > > > think.
> > > > 
> > > > Agreed.  I'll put a clarifying remark in the WD.
> > > > 
> > > > cheers,
> > > > Ray
> > > > 
> > > > 
> > > 
> > > --------------------------------------------------------------------------
> > > Arnold H. Rots                                Chandra X-ray Science Center
> > > Smithsonian Astrophysical Observatory                tel:  +1 617 496 7701
> > > 60 Garden Street, MS 67                              fax:  +1 617 495 7356
> > > Cambridge, MA 02138                             arots at head-cfa.harvard.edu
> > > USA                                     http://hea-www.harvard.edu/~arots/
> > > --------------------------------------------------------------------------
> > > 
> > __
> > Tony Linde                       Phone:  +44 (0)116 223 1292
> > AstroGrid Project Manager        Fax:    +44 (0)116 252 3311
> > Dept of Physics & Astronomy      Mobile: +44 (0)7753 603356
> > University of Leicester          Email:  ael at star.le.ac.uk
> > Leicester, UK   LE1 7RH          Web:    http://www.astrogrid.org
> > 
> --------------------------------------------------------------------------
> Arnold H. Rots                                Chandra X-ray Science Center
> Smithsonian Astrophysical Observatory                tel:  +1 617 496 7701
> 60 Garden Street, MS 67                              fax:  +1 617 495 7356
> Cambridge, MA 02138                             arots at head-cfa.harvard.edu
> USA                                     http://hea-www.harvard.edu/~arots/
> --------------------------------------------------------------------------
> 



