RofR documentation, status
KevinBenson
kmb at mssl.ucl.ac.uk
Wed Jun 6 12:10:18 PDT 2007
Yes, you are correct about point c.); I had not thought about that.
I will ponder point b.) a little more. From what I can see, you get
the same results with or without the 'set' parameter; the only
difference is that without 'set' you might get back a few more
'Standard' type Resources that are not vg:Registry. Unless I am missing
something.
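
To make the comparison in point b.) concrete, here is a rough sketch of the two kinds of ListRecords request. The verb and the 'metadataPrefix', 'set' and 'from' arguments are standard OAI-PMH; the base URL and the set name are made-up placeholders, and 'oai_dc' is used only because every OAI-PMH repository must support it (a registry harvester would presumably ask for a VOResource prefix instead).

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None, from_date=None):
    """Build an OAI-PMH ListRecords request URL.

    base_url, set_spec and from_date are supplied by the caller; nothing
    here is specific to the RofR.
    """
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec is not None:
        params["set"] = set_spec      # restrict the harvest to one set
    if from_date is not None:
        params["from"] = from_date    # incremental harvest since this date
    return base_url + "?" + urlencode(params)


# Hypothetical endpoint and set name, for illustration only.
BASE = "http://example.org/oai"
print(list_records_url(BASE))                          # everything
print(list_records_url(BASE, set_spec="example_set"))  # one set only
print(list_records_url(BASE, set_spec="example_set", from_date="2007-06-06"))
```

The first request is the "no set" harvest I was asking about; the other two are the set-restricted first harvest and a later incremental one.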
One more thing. The XPath you use in the document:
capability[xsi:type='vg:Registry']/interface[role='std' and
xsi:type='vg:OAIHTTPGet']/accessURL
I think the xsi:type should be 'vg:Harvest' for capability and
'vg:OAIHTTP' for the interface.
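
As a quick check of what I mean, a minimal sketch of that selection using the corrected type names. The sample record is invented for illustration, and the code compares the xsi:type values as plain strings, which assumes the record uses the 'vg' prefix literally; a robust harvester would resolve the prefix to its namespace first.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes namespaced attributes as "{namespace-uri}localname".
XSI_TYPE = "{http://www.w3.org/2001/XMLSchema-instance}type"

# Invented sample record; URLs are placeholders.
SAMPLE = """<resource xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xmlns:vr="http://www.ivoa.net/xml/VOResource/v1.0"
          xmlns:vg="http://www.ivoa.net/xml/VORegistry/v1.0">
  <capability xsi:type="vg:Search">
    <interface role="std" xsi:type="vr:WebService">
      <accessURL>http://example.org/search</accessURL>
    </interface>
  </capability>
  <capability xsi:type="vg:Harvest">
    <interface role="std" xsi:type="vg:OAIHTTP">
      <accessURL>http://example.org/oai</accessURL>
    </interface>
    <interface xsi:type="vg:OAISOAP">
      <accessURL>http://example.org/soap</accessURL>
    </interface>
  </capability>
</resource>"""


def harvest_urls(resource):
    """Return accessURLs of standard OAIHTTP interfaces under Harvest capabilities."""
    urls = []
    for cap in resource.iter("capability"):
        if cap.get(XSI_TYPE) != "vg:Harvest":
            continue
        for iface in cap.findall("interface"):
            if iface.get("role") == "std" and iface.get(XSI_TYPE) == "vg:OAIHTTP":
                urls.extend(u.text.strip() for u in iface.findall("accessURL") if u.text)
    return urls


print(harvest_urls(ET.fromstring(SAMPLE)))
```

Only the standard OAIHTTP interface of the Harvest capability is picked up; the search capability and the SOAP interface are skipped.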
cheers,
Kevin
Ray Plante wrote:
> Hey Kevin,
>
> Thanks! I was hoping someone could read it over before I posted it ;-)
>
> On Wed, 6 Jun 2007, KevinBenson wrote:
>> a.)
>> In section 3, does OAI have a 'since'? I have typically always used
>> 'from' or 'until'.
>
> You are correct--thanks.
>
>> b.) Section 3, step 1: I wonder whether you really need to harvest
>> with the 'set' parameter. The primary purpose of the RofR is for
>> other Registries (such as Full Registries) to discover publishing
>> Registries to harvest from, but a Full Registry will contain all
>> Resources; as stated in section 4.1, there might be other 'standard'
>> type Resources. Why not just do a ListRecords with no 'set'
>> parameter and get everything (then, as you say in your steps, after
>> the first time use 'from' with no 'set' parameter)?
>
> First off, you are welcome to do the harvest however you like; there
> is certainly more than one way to achieve the same end. However, the
> recipe described in the Note ensures that you get everything without
> having to sort through the results, particularly for duplicates.
>
> If you do not use set=, then you might subsequently, for example,
> ingest all of these records into your registry. Then you select from
> your registry all of the harvestable registries and proceed. This has
> three problems:
>
> 1) The RofR does not manage those Registry records. You should get
> these from the registry they describe if you want to guarantee
> that you have the latest version. The RofR will harvest these
> records at some interval, but they could be out of date. This is
> why you harvest CDS records from CDS and not the NVO.
>
> 2) When you harvest from each publishing registry, you will get its
> Registry record again. You will need to manage this somehow
> (presumably by just overwriting the one you ingested from the
> RofR).
>
> 3) Unless you encode as a special case to skip the RofR when you
> harvest from all of the publishing registries, then you will get
> the RofR records twice.
>
> Getting around these issues means either sorting through results to
> avoid duplicates or ingesting records multiple times. If you find
> this easier to do (while avoiding errors) then that is fine.
>
>> c.)
>> Section 4.3: I prefer to set up smaller publishing registries that
>> contain only the Resources they manage; those will certainly go into
>> the RofR. I had hoped that a Full Registry that has an OAI interface
>> but does not really manage any Resources, and so would not
>> necessarily need the 'vg:Harvest' interface, would also find its way
>> into the RofR. From this section, I will need to make sure it has a
>> 'vg:Harvest' interface to place it into the RofR. Correct?
>
> The point of the RofR is to only register publishing registries.
> Thus, if you have a full registry that does *not* manage any of its own
> resources, then it does *not* need to be in the RofR and it does not
> *need* an OAI interface. (Note that the validator does not check the
> search interface.) You can throw on a vg:Harvest interface for kicks,
> if you like, but you shouldn't register it with the RofR as that would
> be a waste of other harvesters' time.
>
> Note however, that if the full registry is not a publishing registry,
> then you will need to register it with a publishing registry (one of
> your small ones) if you want its record to appear in other registries
> ;-).
>
>> You have a sentence: "Second, any full registry can serve this same
>> role, since it knows about all other searchable registries."
>> How does a full Registry know about these other searchable Registries
>> if they never made it into the RofR?
>
> As noted above, your full registry will need to be registered with a
> publishing registry. (Just place the record into one of your little
> registries.) The RofR was not intended as a way to find searchable
> registries.
>
>> I am for your counter-argument. My thought was that a client
>> application that can find all the searchable Registries (such as via
>> the RofR) might ask/tell the user 'Found a closer Registry. Would you
>> like this to be your main Registry?' But maybe we can tackle this at
>> a later date and time.
>
> Assuming that you have done the above, then you can set the AstroGrid
> full registry as the client's default. The client uses that one to
> find other searchable registries.
>
> thanks again,
> Ray
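
On the duplicate handling in points 2) and 3) above: if someone does harvest without 'set' and ends up with the same record from more than one place, one possible way to manage it is to keep only the newest copy per identifier. This is just a sketch under my own assumptions; the record layout (dicts with 'identifier', 'datestamp' and 'source' keys) and the identifiers are invented, and it assumes all datestamps are UTC strings of the same granularity so that string comparison orders them correctly.

```python
def merge_harvests(*batches):
    """Merge harvested record batches, keeping the newest copy of each record.

    Each record is a dict with 'identifier' and 'datestamp' keys
    (a hypothetical layout, not something defined by the RofR Note).
    """
    latest = {}
    for batch in batches:
        for rec in batch:
            current = latest.get(rec["identifier"])
            # Same-granularity ISO 8601 UTC strings compare chronologically.
            if current is None or rec["datestamp"] > current["datestamp"]:
                latest[rec["identifier"]] = rec
    return latest


# Invented example: the same Registry record seen via the RofR and
# again, in a newer version, from the publishing registry itself.
from_rofr = [
    {"identifier": "ivo://example.pub/registry", "datestamp": "2007-05-01", "source": "rofr"},
]
from_publisher = [
    {"identifier": "ivo://example.pub/registry", "datestamp": "2007-06-01", "source": "publisher"},
    {"identifier": "ivo://example.pub/catalogue", "datestamp": "2007-04-15", "source": "publisher"},
]

merged = merge_harvests(from_rofr, from_publisher)
for identifier, rec in sorted(merged.items()):
    print(identifier, rec["datestamp"], rec["source"])
```

This is the "overwrite the one you ingested from the RofR" approach; the recipe in the Note avoids needing it at all.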