On Authorities...

Thu May 29 13:23:09 PDT 2003

Hi Tony,

Must be getting late over there...

Maybe the fundamental question that hasn't been answered
is whether humans see Resource IDs or only computers do.  If humans see
these IDs then some organization along meaningful lines is needed.
Certainly I don't know the numeric value of any of the host machines
that I use, but I probably use hundreds of
distinct DNS names that I remember when surfing the Web.

However, if these IDs are used only by computers talking to
computers, then some running serial number is fine.  I think
there are still issues with associating these serial numbers
with registries, but it's not so big a deal.

My own feeling is that we (i.e., protoplasmic entities)
are going to have to use IDs at some
level...  The descriptions are not guaranteed to be unique -- and indeed
with mirrors and the like there are often going to be services
where almost all the meta-data is duplicated.

It may indeed be true that putting information into IDs limits
their expandability, but not putting it in may limit their utility.
Many sets of IDs which encode meaning have done pretty well.
Internet host names are an obvious example.  US Social Security
Numbers encode the state (US) of the application and have done
OK for 60 years.  ISBNs encode the publisher and possibly
book info for each publisher.  Bibcodes are probably the best example
of a successful identifier in astronomy and they encode a vast amount
of information in a very short space.

And now for something completely different...
When building applications for the Palm operating system, each application
needs to have a unique 4 character identifier used to tie the elements
in memory that belong to the program together.  Palm maintains a database
of such IDs.  Most programs seem to have managed to get something reasonable
for this id.  The problem of identifying resources in astronomy probably
isn't any harder.  Maybe we should take a more expansive approach and require
users to specify a unique, say, 16 character ID for each new service.  This
would guarantee that the ID's were nice and compact, but probably big
enough to enable many of us to build very mnemonic ids.  E.g.,

    st.hst.arch
    hsrc.rosat.tmln

or in my example from earlier

    iowa.uhe.evt

Now we have identifiers which are short and sweet, so maybe they satisfy Roy,
but which are much easier to remember and type in than a random number.
Since we have at least a few universal registries, checking for IDs when adding
in a new service is straightforward -- a user could enter any unused string.
This isn't quite as automated as if
we just assign some new number when we register a service, nor is it as flexible
as if the user has an unlimited string they can specify, but I wouldn't
be surprised if it were more useful than either by combining brevity and
meaning.
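
A minimal sketch of the check described above, under stated assumptions:
the 16 character limit is the figure suggested above, while the allowed
character set (lowercase letters, digits and dots) is only inferred from
the three example IDs, and `taken_ids` is a hypothetical in-memory
stand-in for whatever the universal registries would actually report.

```python
import re

MAX_ID_LENGTH = 16  # the "say, 16 character" limit suggested above

# Assumption: dot-separated runs of lowercase letters and digits,
# modelled on the example IDs in this message.
ID_PATTERN = re.compile(r"[a-z0-9]+(\.[a-z0-9]+)*")


def is_well_formed(service_id):
    """Return True if the ID is short enough and matches the assumed pattern."""
    return (
        len(service_id) <= MAX_ID_LENGTH
        and ID_PATTERN.fullmatch(service_id) is not None
    )


def register(service_id, taken_ids):
    """Claim service_id if it is well formed and not already in use.

    taken_ids is a hypothetical set of IDs already known to the registries.
    Returns True if the ID was claimed, False otherwise.
    """
    if not is_well_formed(service_id) or service_id in taken_ids:
        return False
    taken_ids.add(service_id)
    return True


taken = {"st.hst.arch", "hsrc.rosat.tmln"}
print(register("iowa.uhe.evt", taken))  # True: well formed and unused
print(register("st.hst.arch", taken))   # False: already taken
```

The point of the sketch is only that the uniqueness check is a simple
lookup; how the set of taken IDs is shared among registries is left open.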

The overarching thrust of my comments is that we should use registries as
services where users look for other services.  There probably should be some
kind of unique ID field included in the registry (though I don't think
it's absolutely mandatory that there be such).  However the role of the registry and
registry services is to enable users to search metadata for services of
interest, not to manage this ID field.  If we want to build an ID management
service that's great -- maybe it will even use the registries -- but define
its requirements separately and leave it up to developers as to whether they want
to combine these two functions into one service.
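
One way to picture the separation argued for above, as a sketch only:
the class names, method names and the metadata record are all
hypothetical, not part of any existing registry interface.  The ID
manager knows nothing about metadata, and the registry knows nothing
about how IDs are allocated.

```python
class IdManager:
    """Hypothetical ID management service: it only hands out unique IDs."""

    def __init__(self):
        self._ids = set()

    def reserve(self, service_id):
        """Return True if service_id was free and is now reserved."""
        if service_id in self._ids:
            return False
        self._ids.add(service_id)
        return True


class Registry:
    """Hypothetical registry: it only stores and searches metadata."""

    def __init__(self):
        self._records = []

    def add(self, metadata):
        """Store one metadata record (a plain dict in this sketch)."""
        self._records.append(metadata)

    def search(self, keyword):
        """Return records where any metadata value contains keyword."""
        keyword = keyword.lower()
        return [
            record
            for record in self._records
            if any(keyword in str(value).lower() for value in record.values())
        ]


ids = IdManager()
registry = Registry()

# The record below is made up purely for illustration.
if ids.reserve("iowa.uhe.evt"):
    registry.add({"id": "iowa.uhe.evt", "title": "Example event service"})

print(len(registry.search("event")))  # 1
```

A developer who wants both functions in one service can simply wrap the
two objects; nothing in either one requires the other.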

	Tom

Tony Linde wrote:
> Hi Tom,
> 
> (message copied to registry list - can we continue there?)
> 
> 
>>registries.  I feel that's a confusion of functionality that 
>>makes any semantically meaningful naming scheme difficult to 
>>implement properly.  If names are opaque tokens then I have 
> 
> 
> That is the key to the argument. An ID should *not* have any semantic
> meaning. All meaning should be in the rest of the metadata. I know we have
> put the AuthorityID into the ResourceID but that is simply to ensure
> uniqueness when we have no central registration authority. But basically,
> the ID must be meaningless. It is only a pointer. 
> 
> From long experience, embedding any meaning into an ID is fatal for the
> future expansion of systems.
> 
> Cheers,
> Tony. 
> 
> 
>>-----Original Message-----
>>From: Tom McGlynn [mailto:Thomas.A.McGlynn at nasa.gov] 
>>Sent: 29 May 2003 19:39
>>To: ael at star.le.ac.uk
>>Cc: metadata at us-vo.org
>>Subject: Re: On Authorities...
>>
>>
>>
>>
>>Tony Linde wrote:
>>
>>>Hi Tom,
>>>
>>>Just to butt in here, the AuthorityID in the Resource ID is the
>>>identifier of the registry. If you choose to register a resource, you
>>>register with a valid registry and the AuthorityID you get is that of
>>>the registry. It is then up to the registry how it allocates the
>>>ResourceKey: it can allow the user to specify one which it then checks
>>>for uniqueness within its own domain or it can simply assign a random
>>>number.
>>>
>>
>>That was in fact the crux of the discussion at this week's 
>>MWG telecon: Should the authority ID be associated with the 
>>identity of the registry? There was some considerable debate 
>>on that point. Doubtless this has come up in earlier 
>>discussions, but I had not picked up on the idea that naming 
>>authorities were tied one-to-one with actual realized 
>>registries.  I feel that's a confusion of functionality that 
>>makes any semantically meaningful naming scheme difficult to 
>>implement properly.  If names are opaque tokens then I have 
>>fewer problems, but then there is no real need for distinct 
>>Authorities and Keys.
>>
>>My feeling is that names need to be under the control of
>>the service publisher not the registries.  We should not 
>>require users to build a registry solely to be able to 
>>control their own namespace.
>>
>>Were we to continue down
>>this path, I would expect someone (quite possibly me) to 
>>build 'virtual' registry services which emulate a registry on 
>>behalf of a user who doesn't want to build an explicit 
>>registry, but does want to control their name space.
>>
>>Analogously my personal e-mail address is tom at mcglynns.org -- 
>>a name that I control. That used to point to a real mail 
>>account at mcglynn at home.net.  Alas @Home went bankrupt and 
>>any mail sent to that real address -- the address associated 
>>with a physical mail server -- is now lost to me.  Mail sent 
>>to the address at the name that I control, is properly 
>>forwarded to my current account mcglynn at comcast.net.
>>
>>We should not tie persistent names to potentially transient services.
>>
>>	Regards,
>>	Tom
>>
> 
> 
>