OAI as VO Harvesting Interface

Roy Williams roy at cacr.caltech.edu
Tue Sep 9 11:04:11 PDT 2003


Ray et al

There is a lot of activity in the OAI community to build a web services
(SOAP) version.

I have copied Carl Lagoze on this -- he is a prince in the OAI world.
Perhaps he knows best what the status of the SOAP version of OAI is.

Roy

--------
Caltech Center for Advanced Computing Research
roy at cacr.caltech.edu
626 395 3670

----- Original Message -----
From: "Ray Plante" <rplante at poplar.ncsa.uiuc.edu>
To: <registry at ivoa.net>
Sent: Tuesday, September 09, 2003 10:43 AM
Subject: OAI as VO Harvesting Interface


> Hi,
>
> As promised, here are a few more words about OAI as a Harvesting
> interface.  First for those not familiar with OAI, you can find a
> discussion of the use of OAI within a VO registry framework via
> http://rai.ncsa.uiuc.edu/~rplante/VO/metadata/evaloai.html.
>
> The major advantages of using OAI:
>   1.  it is an existing, field-tested standard that we do not have to
>          reinvent.
>   2.  we can leverage existing tools for harvesting.
>   3.  it allows us to expose our records beyond the VO community.
>
> 1 and 2 are the strongest arguments for using OAI.  2 was important for
> both prototype publishing registries at Caltech and NCSA.  (The OAI
> support in our VORegistry-in-a-Box package is an existing CGI script from
> Virginia Tech; we just added the specific support for our resource
> metadata.)
>
> The primary disadvantage is that the OAI interface is not a SOAP-based Web
> Service; it's defined in terms of HTTP GET requests.
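
To make "defined in terms of HTTP GET" concrete, here is a minimal sketch of
how an OAI request is formed: a base URL plus a "verb" argument and any
further key=value arguments. The verb and argument names are taken from
OAI-PMH 2.0 as I understand it; the base URL and the helper name
oai_request_url are hypothetical and exist only for illustration.

```python
from urllib.parse import urlencode

# Hypothetical base URL; a real registry would publish its own.
BASE_URL = "http://registry.example.org/oai"


def oai_request_url(base_url, verb, **arguments):
    """Return the GET URL for a single OAI-PMH request."""
    params = {"verb": verb}
    params.update(arguments)
    return base_url + "?" + urlencode(params)


# Ask the repository to describe itself.
print(oai_request_url(BASE_URL, "Identify"))
# Ask for all records in the Dublin Core format that OAI-PMH requires.
print(oai_request_url(BASE_URL, "ListRecords", metadataPrefix="oai_dc"))
```

Nothing here is specific to SOAP or WSDL; the whole request is carried in
the query string, which is the property the paragraph above is describing.
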
>
> A possible variation that gets around the disadvantage and retains some of
> the advantages is to define a WSDL version of the OAI operations.  (I
> believe Gretchen & Wil did something like this.)  This would allow us to
> reuse the OAI design.  It would not be difficult to use a generic
> HTTP-Get-to-Web-Service adapter layer which would allow us to fully retain
> the advantages of 2 & 3.
>
> Clive brought up an interesting point about supporting the hierarchical
> nature of our resources.  In the OAI model, each resource it exposes is
> described by a node of XML data.  To the interface, there is no inherent
> support for hierarchical relationships; this is encoded, if necessary,
> within the domain-specific metadata.  The recently proposed resource
> metadata provides ways to express various relationships between resources.
> (In my opinion, I would use the "Manager" item to refer to parental
> containment, but others may disagree.)  The important point is, the OAI
> model for harvesting doesn't need to know about hierarchies; it's just
> about synchronizing the contents of one registry with another.
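
The point that harvesting is only about synchronizing can be sketched as
follows, assuming an OAI-PMH 2.0 style response in which each record header
carries an identifier, a datestamp, and optionally status="deleted". The
sample XML, the identifiers, and the function name apply_headers are made up
for illustration; this is not the code used by either prototype registry.

```python
import xml.etree.ElementTree as ET

# Namespace used by OAI-PMH 2.0 responses.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

# Hand-written sample response; the identifiers are hypothetical.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListIdentifiers>
    <header>
      <identifier>oai:registry.example.org:catalog-a</identifier>
      <datestamp>2003-09-08</datestamp>
    </header>
    <header status="deleted">
      <identifier>oai:registry.example.org:catalog-b</identifier>
      <datestamp>2003-09-09</datestamp>
    </header>
  </ListIdentifiers>
</OAI-PMH>
""".strip()


def apply_headers(local, response_xml):
    """Update a local {identifier: datestamp} map from harvested headers."""
    root = ET.fromstring(response_xml)
    for header in root.iterfind(".//oai:header", OAI_NS):
        identifier = header.findtext("oai:identifier", namespaces=OAI_NS)
        datestamp = header.findtext("oai:datestamp", namespaces=OAI_NS)
        if header.get("status") == "deleted":
            # The remote registry withdrew this record; drop our copy.
            local.pop(identifier, None)
        else:
            # New or changed record; remember its latest datestamp.
            local[identifier] = datestamp
    return local


local_registry = {"oai:registry.example.org:catalog-b": "2003-09-01"}
print(apply_headers(local_registry, SAMPLE_RESPONSE))
```

The update treats records as a flat collection keyed by identifier. Any
parent/child relationship would live inside the record metadata, which this
step never inspects.
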
>
> Does the Harvesting interface need to know about hierarchies?  You might
> answer yes if you want to control harvesting based on where resources
> exist within a hierarchy (e.g. only harvest three levels deep).  I suspect
> that in practice, this will not be so important.  Given that OAI can
> control harvesting based on registry-defined categories, date, and
> metadata format, it is likely that sufficiently similar filtering can be
> done through another mechanism.  I'd be interested to hear ideas to the
> contrary.
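
The filtering mentioned above (category, date, metadata format) corresponds,
as far as I can tell, to the OAI-PMH 2.0 arguments "set", "from"/"until",
and "metadataPrefix". A minimal sketch of a selective harvest request
follows; the base URL, the set name "optical", and the function name
list_records_url are hypothetical.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix, set_spec=None,
                     from_date=None, until_date=None):
    """Build a ListRecords URL restricted by set and/or date range."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    # Only include the optional filters the caller actually supplied.
    if set_spec is not None:
        params["set"] = set_spec
    if from_date is not None:
        params["from"] = from_date
    if until_date is not None:
        params["until"] = until_date
    return base_url + "?" + urlencode(params)


# Harvest only records in the (hypothetical) "optical" set that changed
# on or after 1 September 2003.
print(list_records_url("http://registry.example.org/oai", "oai_dc",
                       set_spec="optical", from_date="2003-09-01"))
```

A depth limit such as "only three levels deep" has no direct argument here,
so it would have to be approximated with registry-defined sets or applied by
the harvester after the records arrive.
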
>
> All in all, I see tremendous advantage to adopting OAI for Harvesting in
> some form.  If we don't adopt it outright, we should at least borrow
> heavily from it as it addresses the common issues associated with
> efficient synching of information systems.
>
> cheers,
> Ray


