Featherweight Publishing Registries

Guy Rixon gtr at ast.cam.ac.uk
Sat Oct 22 11:48:03 CEST 2016


Markus,

the old AstroGrid registry implementation does roughly what you propose for purx. Data providers enter the Dublin Core metadata for their resources in a web form and then supply a URL from which the registry obtains the capabilities via VOSI. It doesn’t poll for changes in capabilities; instead, it relies on the resource owners going to the web-browser interface and poking it when they want to upload new capabilities. It can also manage multiple publishing authorities.

This software is rather old, of course, and has had limited maintenance since AstroGrid. However, VAMDC is using it and performing some maintenance. I’d say that it is adequate for running a publishing registry, but not a central registry (it doesn’t support the modern query language for IVOA registries).

If anybody wants to try to run an instance, the legacy code for the service is available. Alternatively, I might be persuaded to run an instance at Cambridge in purx-like mode if there is interest.

Regards,
Guy

> On 22 Oct 2016, at 06:58, Markus Demleitner <msdemlei at ari.uni-heidelberg.de> wrote:
> 
> Dear Registry folks,
> 
> On Fri, Oct 21, 2016 at 11:42:59PM +0200, Accomazzi, Alberto wrote:
>> Even if we decided that OAI-PMH is now getting a bit long in the tooth, I
>> would suggest considering more modern and widespread standards designed for
>> this purpose rather than reinventing the wheel.  Many of us (I think)
>> already support web crawlers by using the sitemap protocol (
>> http://www.sitemaps.org/protocol.html).  We could go even further and use
>> ResourceSync which simply builds on top of sitemap and is expected to be
>> the successor to OAI-PMH: http://www.openarchives.org/rs/toc
> 
> Given that OAI-PMH has at least one trap many implementors regularly
> step into (the dateUpdated in the record vs. the date a record
> actually appears in whatever the OAI-PMH service), I'd be interested
> in following the upstream developments, and I'd be happy if someone
> from the VO community could participate in whatever group pursues
> these (and perhaps report on them in our WG meetings during the
> interops).
> 
> Having said that, I don't believe this will scratch Walter's itch.
> While I believe that for a place the size of IRSA, a working OAI-PMH
> interface is highly desirable, I have indeed been planning for a
> while to offer a more lightweight process, in particular to
> operators that only run very few services but still want to keep
> the resource records on their local systems (rather than in the web
> interfaces offered by ESAVO and STScI): a proxy publishing registry
> (let's call it purx for now).
> 
> Essentially, the data providers would submit a URL purx pulls a
> resource record from, and after validation, purx puts this record
> into its ivo_managed resource records, so regular OAI-PMH harvesters
> will find it.  Purx then will, once a day or so, check if anything
> has changed on the remote side, and if so, re-download the record and
> push it out to incremental harvesters.  If the record becomes
> invalid, mails will be sent to the contact person in the registry
> record; if it vanishes, a deleted record will be pushed out by purx.
> 
> In that way, the harvesters still only need to talk to OAI-PMH
> services and need not hit hundreds or thousands of machines with a
> handful of records each, while small data centers don't have to
> bother with OAI-PMH and can still programmatically generate their
> resource records.  There is, of course, a small downside: The ivoids
> of the services will have to be managed by purx, and all of these
> records will be under one authority.  That would actually be by
> design: Registering and properly managing the authority record is
> another chore I'd rather spare the small data centers.
> 
> There are two reasons why this doesn't exist:
> 
> (1) I've always hoped someone else would build a service like this
> (any takers?)
> 
> (2) I never had a concrete candidate who'd be using it (although I
> strongly suspect that once it's there and properly documented,
> there'd be quite a few)
> 
> So -- if you can do something about (1) or (2) -- by all means do
> speak up.
> 
>        -- Markus
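
For anyone who wants to experiment, the daily purx cycle described in the quoted message (pull, validate, republish on change, mail the contact on an invalid record, push a deleted record when the source vanishes) could be sketched roughly as follows. This is only an illustration under stated assumptions, not an existing implementation: the names Entry, poll, fetch and is_valid are hypothetical, change detection by SHA-256 digest is my own choice, and the actual downloading, validation, mailing and OAI-PMH publishing are left to the caller.

```python
import hashlib
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Entry:
    """State kept per registered URL (hypothetical structure)."""
    url: str
    digest: Optional[str] = None  # SHA-256 of the last published record
    deleted: bool = False         # True once a deleted record was pushed


def poll(entry: Entry,
         fetch: Callable[[str], Optional[bytes]],
         is_valid: Callable[[bytes], bool]) -> str:
    """Run one polling cycle and return the action the caller should take.

    fetch returns the record bytes, or None if the record has vanished.
    Returned actions: "publish", "unchanged", "notify-contact", "delete".
    """
    record = fetch(entry.url)
    if record is None:
        # Nothing to delete if we never published or already deleted.
        if entry.deleted or entry.digest is None:
            return "unchanged"
        entry.deleted = True
        entry.digest = None
        return "delete"
    if not is_valid(record):
        # Keep the previously published record; just tell the contact.
        return "notify-contact"
    digest = hashlib.sha256(record).hexdigest()
    if digest == entry.digest and not entry.deleted:
        return "unchanged"
    # New or changed record: remember it and push it to harvesters.
    entry.digest = digest
    entry.deleted = False
    return "publish"
```

A scheduler would call poll once a day for every registered entry and act on the returned string. Note that, as written, an invalid record triggers a notification on every cycle until it is fixed; a real service would probably want to throttle that.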


