RofR

Mon Apr 11 10:38:00 PDT 2005

Hi all,

> I daresay I've misunderstood the RofR concept so could someone step through
> the registration/harvesting scenarios for me.

(I really like direct questions like this!)

The problem an RofR is intended to solve is two-fold:
  1) how does a new publishing registry let the world know it has records 
       to harvest? 
  2) how does a registry attempting to be "full" know whom to harvest?

I'll start with the first.  Using a browser, the administrator of a new 
publishing registry goes to a web page at www.ivoa.net and enters the OAI 
base URL of that registry into a very simple form.  When the form is 
submitted, the server uses the OAI Identify request to pull over 
the publishing registry's Registry record.  (It could also do a quick OAI 
compliance check.)  The server adds the Registry to the RofR.
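
To make the registration step concrete, here is a rough sketch in Python of 
what the server might do with the submitted base URL.  The function names, 
the example.org URL, the in-memory dictionary standing in for the RofR, and 
the canned response are all made up for illustration; only the 
"verb=Identify" request and the OAI-PMH 2.0 namespace and element names come 
from the OAI protocol itself.  A real server would fetch the URL over HTTP 
rather than parse a sample string.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Namespace defined by OAI-PMH 2.0 for all response elements.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def identify_url(base_url):
    """Build the OAI Identify request for a registry's base URL."""
    return base_url + "?" + urlencode({"verb": "Identify"})


def parse_identify(xml_text):
    """Pull the repository name and base URL out of an Identify response."""
    root = ET.fromstring(xml_text)
    ident = root.find("oai:Identify", OAI_NS)
    if ident is None:
        raise ValueError("not an OAI Identify response")
    return {
        "name": ident.findtext("oai:repositoryName", namespaces=OAI_NS),
        "base_url": ident.findtext("oai:baseURL", namespaces=OAI_NS),
    }


def register(rofr, xml_text):
    """Add the responding registry to the (hypothetical, in-memory) RofR."""
    entry = parse_identify(xml_text)
    rofr[entry["base_url"]] = entry
    return entry


# Canned Identify response (hypothetical registry) used instead of a live fetch.
SAMPLE_IDENTIFY = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <Identify>
    <repositoryName>Example Publishing Registry</repositoryName>
    <baseURL>http://registry.example.org/oai</baseURL>
    <protocolVersion>2.0</protocolVersion>
  </Identify>
</OAI-PMH>"""

rofr = {}
print(identify_url("http://registry.example.org/oai"))
print(register(rofr, SAMPLE_IDENTIFY))
```

A response that fails to parse, or that lacks an Identify element, is 
rejected here, which is about as far as a "quick compliance check" could go 
in a sketch this small.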

Now the second.  The RofR features an OAI interface which "Full" 
registries query regularly.  When a new publishing registry appears, all 
full ones find out about it and begin harvesting from it.  
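
The polling side might look something like the sketch below.  It assumes the 
full registry asks the RofR for record headers with the OAI ListIdentifiers 
request and compares them with the registries it already knows about.  The 
function names, the rofr.example.org URL, the "oai_dc" metadata prefix (a 
VO-specific format could be substituted), and the sample identifiers are 
placeholders of mine, not anything agreed upon.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# OAI-PMH 2.0 namespace in ElementTree's "{uri}tag" form.
OAI = "{http://www.openarchives.org/OAI/2.0/}"


def list_identifiers_url(rofr_base_url, since=None, metadata_prefix="oai_dc"):
    """Build a ListIdentifiers request, optionally limited to recent changes."""
    params = {"verb": "ListIdentifiers", "metadataPrefix": metadata_prefix}
    if since:
        params["from"] = since  # OAI selective harvesting by datestamp
    return rofr_base_url + "?" + urlencode(params)


def new_registries(xml_text, known):
    """Return identifiers listed by the RofR that we do not harvest yet."""
    root = ET.fromstring(xml_text)
    found = []
    for header in root.iter(OAI + "header"):
        if header.get("status") == "deleted":
            continue  # skip registries the RofR reports as withdrawn
        found.append(header.findtext(OAI + "identifier"))
    return [ident for ident in found if ident not in known]


# Canned ListIdentifiers response (hypothetical registries), not a live fetch.
SAMPLE_LIST = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListIdentifiers>
    <header>
      <identifier>ivo://example.a/registry</identifier>
      <datestamp>2005-04-01</datestamp>
    </header>
    <header>
      <identifier>ivo://example.b/registry</identifier>
      <datestamp>2005-04-10</datestamp>
    </header>
  </ListIdentifiers>
</OAI-PMH>"""

known = {"ivo://example.a/registry"}
print(list_identifiers_url("http://rofr.example.org/oai", since="2005-04-01"))
print(new_registries(SAMPLE_LIST, known))
```

Each identifier that comes back is a registry the full registry has not seen 
before; it would then look up that registry's base URL and start harvesting.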

Very simple, eh?

Now to comment on some of the issues that have come up in response.  

On Mon, 11 Apr 2005, Tony Linde wrote:
> Let's distinguish between full and publishing registries. A full registry is
> expected to contain all known resources, kept up to date via some harvesting
> mechanism; it exposes one or more query mechanisms. A publishing registry
> only contains a few local resources, kept up to date via some local
> mechanism; 

As I think others have pointed out, in reality, the two different types of 
registries are really two different roles.  That is, we have full 
registries that also publish records and, thus, can be harvested.  

> The original idea behind the owned/managed authIDs was that a publishing
> registry would have one or more owned authIDs and resources with identifiers
> under those authIDs. A full registry would also have owned authIDs but would
> 'manage' the authIDs for one or more publishing registries. The publishing
> registries would only have to support a simple push (update) or pull
> (harvest) interface with one full registry. Noone else would ever know about
> the publishing registry. The full registry would ensure that noone else
> registered either the owned or managed authIDs. Other full registries would
> only ever harvest from full registries, gathering resources under both the
> owned and managed authIDs.

This original mechanism was complicated at best, and more complicated than 
necessary, as the RofR idea suggests.  The details are in the 
discussion of "ownedRegistry"; I might summarize them as follows:
  o  We're keeping track of who we harvest on a per-authority-ID basis, 
       but harvesting actually takes place on a per-registry basis. 
  o  The book-keeping of authority IDs is prone to error.
  o  It requires human, "out-of-band" coordination to set up the 
       publishing-full relationship.
  o  It depends on a single-parent hierarchy, which has political and 
       availability issues associated with it.  

An RofR gets rid of all these complications.  

> Now the first problem with the RofR is that it contains access information
> about every publishing registry, which previously was hidden from world
> view. What is this going to be used for?

I don't recall a need to keep publishing registries hidden from the world.  
(The point of using OAI is to make this stuff more visible.)  The owned/managed 
scheme was meant as a way of reducing the number of places a full registry 
needs to harvest from.  This might be advantageous if it were not 
simple to figure out where all the publishing registries are.  RofR addresses 
this.  It assumes that the number of places to harvest from is not large.  
So what other reason is there to aggregate the harvesting?  

> Are all the full registries expected to harvest from *all* the publishing
> registries?  This may be a problem for sites which do not want wide-open
> access to their registry.

Because of the overhead of supporting the world?  Again, the argument 
behind RofR is that this is not a big deal today.  I would suggest that 
if/when it becomes a problem, let's adapt then.  I think it would be a lot 
easier to add aggregation of harvesting once we have the basics in place.  

On Mon, 11 Apr 2005, Roy Williams wrote:
> The problem then is that the owner of the RofR is perceived as being the
> center of the IVO, which is politically unattractive, unless you are
> part of the project that runs the central registry, in which case it is
> very attractive.

I'm a fan of de-centralization, but the simplicity of this solution 
overrides any misgivings I might have.  (There really is very little to this 
RofR.  And note, end users won't see it.)  Again, I think we can add a 
decentralizing mechanism later.  

cheers,
Ray