On Authorities...

Tony Linde ael at star.le.ac.uk
Thu May 29 11:48:27 PDT 2003


Hi Bob,

(message copied to registry list - can we continue there?)

> I don't think there is agreement on this point Tony.

True - but the registry = authority was the way Ray and I were putting the
ResourceID together so I was working from that.

The reason I think it should be this way is simply that it is simpler.
Otherwise, who decides who 'owns' an AuthorityID? Can I publish data with
AuthorityID = http://archive.stsci.edu/ if I want? And if two people
register a resource with the same ID at different registries, how is it
resolved when the data is harvested?
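
To make the harvesting question concrete, here is a minimal sketch of how a
harvester might at least detect the clash. The record layout (a dict with
"authority_id" and "resource_key"), the registry names and the function name
are assumptions made up for illustration, not part of any agreed schema, and
the sketch only reports conflicts rather than resolving them.

```python
from collections import defaultdict


def find_conflicts(registries):
    """Return identifiers that appear in more than one registry.

    `registries` maps a (hypothetical) registry name to a list of records;
    each record is a dict with "authority_id" and "resource_key".
    """
    seen = defaultdict(list)
    for registry_name, records in registries.items():
        for record in records:
            identifier = (record["authority_id"], record["resource_key"])
            seen[identifier].append(registry_name)
    # Keep only identifiers claimed by two or more registries.
    return {ident: names for ident, names in seen.items() if len(names) > 1}


# Hypothetical harvest input: two registries both list the same identifier.
registries = {
    "registry-a": [
        {"authority_id": "archive.stsci.edu", "resource_key": "sdss"},
    ],
    "registry-b": [
        {"authority_id": "archive.stsci.edu", "resource_key": "sdss"},
        {"authority_id": "astro.iowa.edu", "resource_key": "uhe/sources/table"},
    ],
}

conflicts = find_conflicts(registries)
for identifier, names in conflicts.items():
    print(identifier, "claimed in", names)
```

Detection is the easy part; deciding which record wins is the policy
question raised above.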

I really think the registry should be the authority for all the resources
registered there. 

Cheers,
Tony. 

> -----Original Message-----
> From: Robert Hanisch [mailto:hanisch at stsci.edu] 
> Sent: 29 May 2003 19:34
> To: ael at star.le.ac.uk; metadata at us-vo.org
> Subject: Re: On Authorities...
> 
> 
> Tony wrote:
> > Just to butt in here, the AuthorityID in the Resource ID is the 
> > identifier of the registry.
> 
> I don't think there is agreement on this point Tony.  Indeed, 
> it was hotly contested in our MWG telecon this morning, and I 
> believe the prevailing view was just the opposite.  The 
> publisher of the resource is responsible for this, and by 
> using URIs, which use IP domains, we implicitly have unique IDs.
> 
> I've spent the past hour or so poring over 
> http://www.ietf.org/rfc/rfc2396.txt, which, as you and others 
> have pointed out, is the definitive reference for URIs.  
> Originally we said that the RSM metadata element Identifier 
> was a URI.  Ditto for PublisherID and the more recently 
> proposed ServiceStandardID.  I don't think there is anything 
> that asserts that the "authority" component of the URI must 
> originate with the VO registry.
> 
> In RSM v0.7 we have an example for SDSS published by STScI:
> 
>     Identifier:        http://archive.stsci.edu/sdss/
>     PublisherID:    http://archive.stsci.edu/
> 
> which are completely legal URIs and where the authority 
> component, archive.stsci.edu, is asserted by the publisher 
> (STScI) and granted to STScI via domain name registration.
> 
> I urge everyone to keep all this as simple as possible, at 
> least in this first iteration.  If we need more features 
> after an initial implementation we can always amend the agreements.
> 
> Cheers,
> Bob
> 
> 
> ----- Original Message -----
> From: "Tony Linde" <ael at star.le.ac.uk>
> To: <metadata at us-vo.org>
> Sent: Thursday, May 29, 2003 1:12 PM
> Subject: RE: On Authorities...
> 
> 
> > Hi Tom,
> >
> > Just to butt in here, the AuthorityID in the Resource ID is the
> > identifier of the registry. If you choose to register a resource, you
> > register with a valid registry and the AuthorityID you get is that of
> > the registry. It is then up to the registry how it allocates the
> > ResourceKey: it can allow the user to specify one which it then checks
> > for uniqueness within its own domain or it can simply assign a random
> > number.
> >
> > Of course, if you really want to own the AuthorityID all you have to
> > do is set up your own registry but then you have the hassle of
> > maintaining it, making the records available for harvesting etc.
> >
> > Cheers,
> > Tony.
> >
> > > -----Original Message-----
> > > From: owner-metadata at us-vo.org [mailto:owner-metadata at us-vo.org]
> > > On Behalf Of Tom McGlynn
> > > Sent: 29 May 2003 17:58
> > > To: metadata at us-vo.org
> > > Subject: On Authorities...
> > >
> > >
> > > Hi Ray (and all),
> > >
> > > Just to follow up my comments on the telecon...
> > >
> > >
> > > The proposed Resource ID has three levels: the authority, the
> > > specific resource and the record ID.  The record ID is a matter of
> > > some controversy, but that's unrelated to the current debate.
> > >
> > > A user may describe resources in their own registry (or registries),
> > > in a combination of their own registry and outside registries, or in
> > > one or more outside registries.  If the VO concepts work, then all
> > > of these are likely to be common situations -- things change with
> > > time.
> > >
> > > One key concept -- and one that I didn't really put properly in the
> > > telecon -- is that the user is not only the author, but the
> > > publisher of the resource. With the Web these two roles are often
> > > conflated.
> > >
> > > E.g., if I'm working at the University of Iowa and I'm putting a
> > > list of ultra-high energy events on the Web, then I'm publishing
> > > that data on my Web site.  The registries that we have been talking
> > > about are the equivalent of library card catalogs -- they are links
> > > to the resources, but the registry does not 'publish' resources --
> > > it disseminates information about resources that users have
> > > published.
> > >
> > > The question is do we want the names/ids of these resources to be
> > > meaningful? If not -- if they are to be opaque identifiers -- then
> > > we have no problems. We can just use running numbers.
> > >
> > > However if we wish the identifiers to have any meaning then the
> > > person who can provide that is the author/publisher.  How can they
> > > provide this meaning if the IDs are not in their control?  Distinct
> > > but related resources may be published using different registries
> > > (perhaps the original is out of business, or the University has
> > > recently started up its own registry).  How do we know how to parse
> > > any information that might be encoded in the resource key?  We can
> > > do this easily if the author/publisher is able to specify the
> > > AuthorityID associated with his/her resources.  The responsibility
> > > belongs with the publisher (again I should have been clearer
> > > about the fact that with the web the author and the publisher
> > > are often the same).  Typically I would expect there to be a
> > > separate Authority for each data publisher and a default
> > > suggestion for the authority would be some stable
> > > fraction of the host on which the resources are published.
> > >
> > > So, in my view, our example author/publisher
> > > creates an AuthorityID of astro.iowa.edu and then creates a series
> > > of resource keys, say  uhe/photonData/table, uhe/sources/table,
> > > uhe/lightCurves/plot ... that are broadcast in various registries
> > > over the course of several years.
> > >
> > > Note, by the by, how ISBNs look.  They have a distinct publisher ID
> > > and a separate running number for each publisher.  I don't happen to
> > > know if the publishers control the running numbers or some central
> > > authority -- but there is a separate 'name space' for each publisher
> > > -- not for each library catalog.
> > >
> > > Only if we have a person/institution that is willing to provide
> > > uniform identifiers over all resources in astronomy can we get away
> > > from allowing the publishers to specify the IDs.  This can be quite
> > > valuable -- I use Dewey decimal, and Library of Congress identifiers
> > > for books a lot more than ISBNs -- but I don't believe we have any
> > > volunteers.
> > >
> > >
> > > Other points...
> > >
> > > Registries are not required to be persistent.  Thus there is no 
> > > guarantee that the registry an entry was originally entered in 
> > > exists any longer.  The URL pointing to that registry may be 
> > > invalid.  A user is not guaranteed a response from the publishing 
> > > registry.
> > >
> > > Registries should be designed to do their job of allowing users to
> > > find resources -- we shouldn't saddle them with extraneous tasks
> > > like being the arbiters of names.
> > >
> > > While we don't seem to be interested in using UDDI's, I'd be curious
> > > as to how they address this situation.  They use a distributed
> > > registry system I think...
> > >
> > >
> > > Regards,
> > > Tom
> > >
> >
> >
> 
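
For readers following the thread, the publisher-as-authority view in Bob's
message can be sketched by splitting the RSM v0.7 example Identifier into its
URI authority component and the remaining path. This is only an illustration
using Python's standard urllib.parse; the function name is made up here, and
treating the path as the "resource key" is an assumption, not something the
thread agreed on.

```python
from urllib.parse import urlsplit


def split_identifier(identifier):
    """Split a URI-style identifier into (authority, resource key).

    The authority is the URI's network location (RFC 2396 "authority"
    component); the resource key is assumed here to be the path with
    leading and trailing slashes removed.
    """
    parts = urlsplit(identifier)
    return parts.netloc, parts.path.strip("/")


# The RSM v0.7 example quoted in the thread.
authority, resource_key = split_identifier("http://archive.stsci.edu/sdss/")
print(authority)     # archive.stsci.edu
print(resource_key)  # sdss
```

Under this reading the PublisherID http://archive.stsci.edu/ has the same
authority and an empty key, so uniqueness of the authority rests on domain
name registration rather than on any VO registry.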


