amendments to Identifier framework
Ray Plante
rplante at poplar.ncsa.uiuc.edu
Mon Sep 15 10:07:32 PDT 2003
Hi Tony,
Thanks for your comments. I'm happy to hear more from others.
One general comment: The suggestions I called "amendments" are all
intended to be specific amendments to the Identifier WD. I tried to
look at the overall Identifier and Registry framework to understand
how we might address the issues that have been under discussion. Much
of whatever comes out of this discussion may appear only as an
appendix to illustrate how identifiers are intended to fit into the
overall framework (and thus are not a binding part of the spec).
On Mon, 15 Sep 2003, Tony Linde wrote:
> Part A
> ======
>
> AuthorityIDs are not necessarily owned by organisations. This should be an
> optional part of the authority metadata.
I think we need to be able to trace an AuthorityID to a naming
authority, which ultimately has people behind it. Perhaps you are
thinking that a registry might "own" an authority ID. I am assuming
that an organization runs the registry, and thus is ultimately in
control of its use. Is this consistent with your thinking? If not,
can you suggest an alternate statement regarding who/what controls the
use of authority IDs?
> Where does the 'global IVOA Name-granting registry' live? Who is going to
> maintain this and ensure it is always available? Can we not leave it up to
> the replication process to ensure non-duplication of AuthorityIDs?
I'm essentially suggesting a component of the interface (perhaps
optional) of a global registry that is capable of granting AuthorityID
requests because that registry is capable of determining whether the ID
is already in use. That means anyone who runs a global registry can
grant such names; there does not need to be a single one. Thus, it
does depend on the replication process employed by global registries.
The important point is, a local registry should not have to replicate
the world to ensure uniqueness of AuthorityIDs; it just needs to go to
a global resource.
(Sorry if my explanation was more complicated than necessary.)
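To make the idea a bit more concrete, here is a minimal Python sketch
of the granting step. It is only an illustration, not part of the WD:
the class name GlobalRegistry, the method grant_authority_id, and the
example AuthorityID and registry URLs are all hypothetical, and a real
service would sit behind whatever interface we agree on.

```python
class GlobalRegistry:
    """Hypothetical global registry able to grant AuthorityIDs."""

    def __init__(self):
        # Maps each granted AuthorityID to the registry that requested it.
        self.authority_to_registry = {}

    def grant_authority_id(self, authority_id, requesting_registry):
        """Grant the ID only if it is not already in use.

        Returns True when the ID was granted, False when it was taken.
        """
        if authority_id in self.authority_to_registry:
            return False
        self.authority_to_registry[authority_id] = requesting_registry
        return True


# A local (publishing) registry asks the global one instead of
# replicating the world to check uniqueness for itself.
global_registry = GlobalRegistry()
first = global_registry.grant_authority_id(
    "ncsa.example", "http://registry-a.example/")
second = global_registry.grant_authority_id(
    "ncsa.example", "http://registry-b.example/")
print(first, second)
```

The second request is refused because the ID is already recorded
against the first registry. With several global registries, this
table is what the replication process would need to keep in sync.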
> What is a 'publishing registry'?
My apologies: this is equivalent to what we've called a "local
registry". We've migrated to this name here in the NVO, I think,
because it is more indicative of what it does: it is for publishing,
not necessarily searching. At the global end, we've slipped into the
terminology of "searchable" registry.
> The rules for adding an AuthorityID to a registry should be left up to that
> registry. Why should IVOA set rules about this that it cannot police?
These are not meant to be rules for policing, but a recipe for
ensuring unique ownership and control.
> What are 'first-class resources'?
Sorry, poor choice of words. I just mean we should define a class of
resource called "Registry".
> Surely the only steps needed to resolve an identifier are:
>
> 1. Registry looks within itself for a matching identifier. If none is found
> and registry is *not* a 'full' (see http://ivoa.net/forum/registry/0193.htm)
> registry then the query is sent to a full registry.
>
> 2. If a full registry cannot find the identifier then 'resource is not
> found' message is returned.
>
> If an application (or, more likely, user) wants to try to dig further then
> they can do so by searching on AuthorityID, organisation ResourceID etc.
The recipe I spelled out is simply an extension of this (that is, our
steps 1 and 2 are the same). Using a "full" registry to trace an
AuthorityID to the local registry where it originated means that that
registry need not really be "full"--that is, be a strict,
comprehensive registry for the entire VO. There may be local
registries that are locally targeted and don't necessarily want to be
harvested, yet want to request unique AuthorityIDs. This just
helps keep things distributed.
The main requirement proposed here is having the global registry
associate AuthorityIDs with the registry that originates them.
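As a rough sketch of how the extended recipe might look, assuming
identifiers of the form ivo://AuthorityID/ResourceKey and using plain
dictionaries to stand in for registries (every name and record below
is made up for illustration):

```python
def authority_of(identifier):
    """Return the AuthorityID of an identifier assumed to look like
    ivo://AuthorityID/ResourceKey."""
    rest = identifier.split("://", 1)[1]
    return rest.split("/", 1)[0]


def resolve(identifier, local, global_records, origin_of, registries):
    """Resolve an identifier, returning its record or None."""
    # Step 1: the local registry looks within itself.
    if identifier in local:
        return local[identifier]
    # Step 2: the query goes to the global ("full") registry.
    if identifier in global_records:
        return global_records[identifier]
    # Extension: the global registry traces the AuthorityID to the
    # registry that originated it and asks that registry instead.
    origin = origin_of.get(authority_of(identifier))
    if origin is not None:
        return registries.get(origin, {}).get(identifier)
    # Otherwise: "resource is not found".
    return None


# Hypothetical data: registry-s is a local registry that is not harvested.
local = {"ivo://here.example/cat": "local record"}
global_records = {"ivo://big.example/archive": "global record"}
origin_of = {"small.example": "registry-s"}
registries = {"registry-s": {"ivo://small.example/survey": "origin record"}}

found = resolve("ivo://small.example/survey",
                local, global_records, origin_of, registries)
missing = resolve("ivo://unknown.example/x",
                  local, global_records, origin_of, registries)
print(found, missing)
```

The only extra state the global registry carries is the origin_of
table, which is the AuthorityID-to-registry association proposed above.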
> Part B
> ======
>
> Surely unregistered ResourceIDs are nothing to do with the registry. If the
> owner of an AuthorityID wants to create unregistered identifiers for their
> own purposes then it is up to them to maintain them and ensure that they are
> not reused.
I think this is essentially what I'm saying in this proposed
amendment. Previously, the WD said that IVOA ResourceIDs are
guaranteed to be registered somewhere. This amendment simply loosens
this requirement by saying that an ID can be given to a registry for
resolution, which may or may not be successful.
> Part C
> ======
>
> What is the point of the attribute 'persistent' when nothing can be done
> with it?
Currently, in the WD, identifiers are not necessarily persistent.
Resources may come and go, either deliberately or through neglect.
Setting persistent="false" would be a way for a resource to indicate
explicitly that the resource is transitory (and, for example, should
not be mirrored).
Perhaps persistent="false" is not needed. Should we just say that all
IVOA identifiers are persistent (thus, disallowing recycling of
identifiers)? I'm cool with this, too.
> The 'status' attribute makes sense as I can see people wanting to not search
> "inactive" resources. I'm not sure I agree about "deleted" status. I can see
> your reasoning but the registry is primarily a device for discovering
> resources; listing dead ones does not make sense unless we say that
> "deleted" resources are normally excluded from a search unless specifically
> included.
We plan to add relationships that include a reference to other
resources. Those other resources may disappear. If their metadata
descriptions are completely deleted from the VO environment, then we
will have dangling references.
In practice, the global registry may choose to remove a deleted
resource description from its database; however, the registry of
origin could continue to hold the deleted record. With the recipe I
outlined in part A, one still has the ability to track it down. My
example of a moved resource makes use of status="deleted"; it retains
its status as a separate copy from the original which has been
deleted.
Note that this practice of marking resources as "deleted" is not new
to digital library practice. This functionality is also supported by
OAI. Such records can be hidden from the view of users while still
maintaining unambiguous bookkeeping. We'll still end up with some
dangling references, I'm sure, and we'll have to live with that;
nevertheless, this provides a mechanism to avoid them.
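A small Python sketch of that bookkeeping, under the assumption that
each record carries a status attribute with values such as "active"
and "deleted" (the record layout and function names are hypothetical):

```python
# Hypothetical registry contents; only the status values come from
# the discussion above.
records = [
    {"identifier": "ivo://a.example/catalog", "status": "active"},
    {"identifier": "ivo://a.example/old-catalog", "status": "deleted"},
]


def search(records, include_deleted=False):
    """Return records for discovery, hiding deleted ones by default."""
    return [r for r in records
            if include_deleted or r["status"] != "deleted"]


def lookup(records, identifier):
    """Resolve a reference by identifier regardless of status, so that
    a relationship pointing at a deleted resource does not dangle."""
    for r in records:
        if r["identifier"] == identifier:
            return r
    return None


visible = [r["identifier"] for r in search(records)]
old = lookup(records, "ivo://a.example/old-catalog")
print(visible, old["status"])
```

Users searching the registry never see the deleted record, but
anything that references it can still learn that it once existed and
has been deleted.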
> 'LogicalIdentifier' also sounds okay although I think the issue of mirrors
> is more complex than this can cope with.
>
> I don't think we need add more elements to the schema for a resource for
> mirrors etc. A repeating 'Relationship' element using namespaced values
> would be more efficient and extendable?
I think this would do the trick.
> Part D
> ======
>
> Changing the standard to ensure compliance with one other service could open
> the floodgates to other services. If we want to specify how ADEC can
> interoperate with VO Registries then it should be a separate document, a
> Note probably.
This part is mainly a recommendation to data providers in choosing
ADEC identifiers that interoperate well. To assist this
interoperation, I'm suggesting one minor change to the Identifier WD
specification of the URI form: the use of # to set off a component
that is to be ignored by registries.
Dataset resolution is going to be pretty important; referencing
datasets in journals is only one example likely to be common. The use
of the # in the URI form provides a means to refer to datasets that
are known not to be registered individually.
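For illustration, a registry (or client) could split such an
identifier as in the Python sketch below. The function name and the
example identifiers are hypothetical; the only thing taken from the
proposal is that everything after the # is ignored by registries.

```python
def split_dataset_identifier(identifier):
    """Split an identifier at the first '#'.

    Returns (registered_part, dataset_part); dataset_part is None
    when the identifier contains no '#'.
    """
    registered_part, sep, dataset_part = identifier.partition("#")
    return registered_part, (dataset_part if sep else None)


# The registry resolves only the part before the '#'; the remainder
# is left for something like a dataset resolver to interpret.
with_dataset = split_dataset_identifier("ivo://adec.example/archive#obs-1234")
without_dataset = split_dataset_identifier("ivo://adec.example/archive")
print(with_dataset, without_dataset)
```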
I also suggest to the ADEC that they develop the DataResolver service
as an open IVOA standard service. This would be a great asset to
all.
> The IVOA is not a policing organisation.
This is certainly not the intention.
> 1. attributes 'Status' and 'LogicalIdentifier'.
> 2. repeating element 'Relationship' as I've previously defined (point H in
> http://ivoa.net/forum/registry/0462.htm).
Would it help to go through what I've written and segregate the
components a bit?
* changes for the Identifier WD
* changes to the RM and extended schema
* requirements for registry service interfaces
* recipes for handling identifiers (that might appear in a Note).
cheers,
Ray