Harvesting stylesheet

Ray Plante rplante at poplar.ncsa.uiuc.edu
Thu Feb 28 03:26:03 PST 2008


Hey

On Thu, 28 Feb 2008, Matthew Graham wrote:
> Partially to prove that it can be done, I've written an XSLT stylesheet that 
> will harvest from all registries in the Registry of Registries and then write 
> each resource record out to a separate file with each registry having its own 
> directory. Oh and it's only 81 lines long!

This is great!  Would you be willing to post this on the IVOA twiki page,
RegUpgradeToV10?

As you know, I'm a tremendous fan of XSL, and this is indeed the approach 
I initially took in my own software; however, I ultimately ran into a 
problem harvesting, coincidentally, from Carnivore.  The problem was that 
some namespaces needed to resolve xsi:type values (e.g. 
sia:SimpleImageAccess) were being declared in the OAI header section.  As a 
result, these namespace declarations were not passed on to the output.
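
To illustrate the issue, here is a minimal Python sketch of one way around 
it: copy every in-scope xmlns declaration from the ancestor elements onto 
the extracted record.  This is not the code in IVOAHarvester or in the 
stylesheet; the sample OAI response, the function names, and the choice of 
xml.dom.minidom are all assumptions made for the example.

```python
from xml.dom import minidom

# Hypothetical, heavily trimmed OAI response: the sia and xsi prefixes are
# declared on the outer OAI element, not on the VOResource record itself.
OAI_RESPONSE = """<oai:OAI-PMH xmlns:oai="http://www.openarchives.org/OAI/2.0/"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:sia="http://www.ivoa.net/xml/SIA/v1.0">
  <oai:record>
    <oai:metadata>
      <ri:Resource xmlns:ri="http://www.ivoa.net/xml/RegistryInterface/v1.0">
        <capability xsi:type="sia:SimpleImageAccess"/>
      </ri:Resource>
    </oai:metadata>
  </oai:record>
</oai:OAI-PMH>"""


def in_scope_namespaces(element):
    """Collect xmlns declarations from element and all of its ancestors.

    The nearest declaration of a prefix wins, as in XML namespace scoping.
    """
    declarations = {}
    node = element
    while node is not None and node.nodeType == node.ELEMENT_NODE:
        for name, value in node.attributes.items():
            if name == "xmlns" or name.startswith("xmlns:"):
                # setdefault keeps the declaration found closest to element
                declarations.setdefault(name, value)
        node = node.parentNode
    return declarations


def extract_record(xml_text, tag_name):
    """Return the first tag_name element as a standalone XML string."""
    doc = minidom.parseString(xml_text)
    record = doc.getElementsByTagName(tag_name)[0]
    # Re-declare inherited prefixes so that QName-valued attributes such as
    # xsi:type="sia:SimpleImageAccess" still resolve outside the OAI wrapper.
    for name, value in in_scope_namespaces(record).items():
        if not record.hasAttribute(name):
            record.setAttribute(name, value)
    return record.toxml()


if __name__ == "__main__":
    print(extract_record(OAI_RESPONSE, "ri:Resource"))
```

Copying all in-scope declarations is deliberately conservative: a prefix 
that appears only inside an attribute value cannot be detected by looking 
at element and attribute names alone.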

So does this work with Carnivore?

I've been developing a reference harvester, IVOAHarvester, available via 
http://trac.us-vo.org/nvo/wiki/IVOAHarvester.  Not all functionality is 
complete, so the package is currently only available via svn check-out. 
This package forms the basis of the RofR's validator, which harvests and 
checks each record from a registry.  It ultimately does the extraction of 
the VOResource records in Java (VOResourceExtractor) and is meticulous 
about (1) getting all of the namespaces correct, and (2) preserving the 
spacing and namespace prefixes used in the original OAI record.

Personally, if the stylesheet approach works, I would certainly recommend 
using it, as I think it should be a simpler approach.  However, if you 
have problems with it, there is also this IVOAHarvester package.

cheers,
Ray


