RofR documentation, status

KevinBenson kmb at mssl.ucl.ac.uk
Wed Jun 6 12:10:18 PDT 2007


Yes you are correct about point c.)  I did not think about that.

I will ponder a little more about point b.).  From what I can see, you get 
the same results with or without the 'set' parameter; the only 
difference is that when you don't use 'set' you might get back a few more 
'Standard' type Resources that are not vg:Registry.  Unless I am missing 
something.

One more thing. The xpath you use in the document:
capability[xsi:type='vg:Registry']/interface[role='std' and 
xsi:type='vg:OAIHTTPGet']/accessURL

I think the xsi:type should be 'vg:Harvest' for capability and 
'vg:OAIHTTP' for the interface.
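
To make the suggested correction concrete, here is a minimal Python sketch that pulls the harvest accessURL out of a Registry record using the 'vg:Harvest' and 'vg:OAIHTTP' types proposed above. The sample record, its namespace URIs and the example.org URLs are hypothetical and heavily trimmed; they are assumptions for illustration, not taken from the document. The sketch compares the literal prefixed strings, as the xpath does, rather than resolving the prefixes.

```python
import xml.etree.ElementTree as ET

XSI = "http://www.w3.org/2001/XMLSchema-instance"

# Hypothetical, heavily trimmed Registry record (illustration only).
SAMPLE_RECORD = """\
<ri:Resource xmlns:ri="http://www.ivoa.net/xml/RegistryInterface/v1.0"
             xmlns:vg="http://www.ivoa.net/xml/VORegistry/v1.0"
             xmlns:vr="http://www.ivoa.net/xml/VOResource/v1.0"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:type="vg:Registry">
  <capability xsi:type="vg:Harvest">
    <interface xsi:type="vg:OAIHTTP" role="std">
      <accessURL>http://registry.example.org/oai</accessURL>
    </interface>
  </capability>
  <capability xsi:type="vg:Search">
    <interface xsi:type="vr:WebService" role="std">
      <accessURL>http://registry.example.org/search</accessURL>
    </interface>
  </capability>
</ri:Resource>
"""


def harvest_access_urls(xml_text):
    """Return accessURLs of standard OAI interfaces on harvest capabilities."""
    root = ET.fromstring(xml_text)
    type_attr = f"{{{XSI}}}type"  # xsi:type in ElementTree's {uri}name form
    urls = []
    for capability in root.iter("capability"):
        # Keep only the harvesting capability (literal prefix comparison).
        if capability.get(type_attr) != "vg:Harvest":
            continue
        for interface in capability.findall("interface"):
            is_oai = interface.get(type_attr) == "vg:OAIHTTP"
            is_std = interface.get("role") == "std"
            if is_oai and is_std:
                for access_url in interface.findall("accessURL"):
                    if access_url.text:
                        urls.append(access_url.text.strip())
    return urls


if __name__ == "__main__":
    print(harvest_access_urls(SAMPLE_RECORD))
```

The search capability in the sample is skipped because its xsi:type is not 'vg:Harvest', so only the OAI endpoint is returned.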

cheers,
Kevin
Ray Plante wrote:
> Hey Kevin,
>
> Thanks!  I was hoping someone could read it over before I posted it ;-)
>
> On Wed, 6 Jun 2007, KevinBenson wrote:
>> a.)
>> In section 3: does OAI have a 'since'?  I have typically always used 
>> 'from' or 'until'.
>
> You are correct--thanks.
>
>> b.) Section 3 step 1.  Wondering do you really care to harvest with 
>> the 'set' parameter?  The primary purpose of RofR is for other 
>> Registries (such as Full Registries) to discover publishing 
>> Registries to do harvests but a Full Registry will contain all 
>> Resources, as stated in section 4.1 there might be other 'standard' 
>> type Resources.  Why not just do a ListRecords with no 'set' 
>> parameter and get everything (then as you say in your steps after the 
>> first time use the 'from' with no set parameter)?
>
> First off, you are welcome to do the harvest however you like; there 
> is certainly more than one way to achieve the same end.  However, the 
> recipe described in the Note ensures that you get everything without 
> having to sort through the results, particularly for duplicates.
>
> If you do not use set=, then you might subsequently, for example, 
> ingest all of these records into your registry.  Then you select from 
> your registry all of the harvestable registries and proceed. This has 
> three problems:
>
> 1)  The RofR does not manage those Registry records.  You should get
>     these from the registry they describe if you want to guarantee
>     that you have the latest version.  The RofR will harvest these
>     records at some interval, but it could be out of date.  This is
>     why you harvest CDS records from CDS and not the NVO.
>
> 2)  When you harvest from each publishing registry, you will get its
>     Registry record again.  You will need to manage this somehow
>     (presumably by just overwriting the one you ingested from the
>     RofR).
>
> 3)  Unless you encode as a special case to skip the RofR when you
>     harvest from all of the publishing registries, then you will get
>     the RofR records twice.
>
> Getting around these issues means either sorting through results to
> avoid duplicates or ingesting records multiple times.  If you find
> this easier to do (while avoiding errors) then that is fine.
>
>> c.)
>> Section 4.3  I prefer to set up smaller publishing registries that 
>> contain just the Resources they manage; those will certainly go into 
>> the RofR.  I had hoped that a Full Registry that has an OAI interface 
>> but does not really manage any Resources (and so would not 
>> necessarily need the 'vg:Harvest' interface) would also find its way 
>> into the RofR.  From this section, I will need to make sure it has a 
>> 'vg:Harvest' interface to place it into the RofR.  Correct?
>
> The point of the RofR is to only register publishing registries.
> Thus, if you have a full registry that does *not* manage any of its own
> resources, then it does *not* need to be in the RofR and it does not
> *need* an OAI interface.  (Note that the validator does not check the
> search interface.)  You can throw on a vg:Harvest interface for
> kicks, if you like, but you shouldn't register it with the RofR as
> that would be a waste of other harvesters' time.
>
> Note, however, that if the full registry is not a publishing registry,
> then you will need to register it with a publishing registry (one of
> your small ones) if you want its record to appear in other
> registries ;-).
>
>> You have a sentence: "Second, any full registry can serve this same 
>> role, since it knows about all other searchable registries."
>> How does a full Registry know about these other searchable Registries 
>> if they never made it into the RofR?
>
> As noted above, your full registry will need to be registered with a
> publishing registry.  (Just place the record into one of your little
> registries.)  The RofR was not intended as a way to find searchable
> registries.
>
>> I am for your counter-argument.  My thought was that a client 
>> application that can find all the searchable Registries (such as via 
>> RofR) might ask/tell the user 'Found a closer Registry. Would you 
>> like this to be your main Registry?'  But maybe we can tackle this at 
>> a later date and time.
>
> Assuming that you have done the above, then you can set the AstroGrid
> full registry as the client's default.  The client uses that one to
> find other searchable registries.
>
> thanks again,
> Ray
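
As a rough illustration of the harvesting options discussed under point b.), the Python sketch below builds OAI ListRecords request URLs with and without the 'set' and 'from' parameters, and shows the "just overwrite" handling of a Registry record that arrives twice. The base URL, the set name, the metadata prefix value and the identifiers are placeholders assumed for the example; the actual values should come from the Note and from each registry's own record. Whether 'set' is kept on incremental harvests is left to the caller.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix, set_spec=None, from_date=None):
    """Build an OAI-PMH ListRecords URL; 'set' and 'from' are optional."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec is not None:
        params["set"] = set_spec  # restrict the harvest to one set
    if from_date is not None:
        params["from"] = from_date  # incremental harvest since this date
    return base_url + "?" + urlencode(params)


def ingest(store, records):
    """Insert (identifier, record) pairs; a later copy overwrites an earlier one."""
    for identifier, record in records:
        store[identifier] = record
    return store


if __name__ == "__main__":
    base = "http://rofr.example.org/oai"  # placeholder endpoint
    # First harvest, restricted to a (hypothetical) set name.
    print(list_records_url(base, "ivo_vor", set_spec="example_set"))
    # Later harvest: only records changed since the given date.
    print(list_records_url(base, "ivo_vor", set_spec="example_set",
                           from_date="2007-06-06"))

    store = {}
    # Copy of a Registry record as obtained from the RofR ...
    ingest(store, [("ivo://example/registry", "copy held by the RofR")])
    # ... replaced by the copy harvested from the registry it describes.
    ingest(store, [("ivo://example/registry", "copy from the registry itself")])
    print(store)
```

Keying the store on the record identifier is what turns the duplicate in problem 2) into a simple overwrite; it does nothing about problem 1) unless the copy from the registry itself is ingested last.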


