Featherweight Publishing Registries
Walter Landry
wlandry at caltech.edu
Sat Nov 5 18:09:39 CET 2016
Markus Demleitner wrote:
> On Fri, Oct 28, 2016 at 09:09:55PM -0700, Walter Landry wrote:
> > Just to be clear, Atom feeds are described by an IETF RFC [1], so it
> > is just as standardized as OAI-PMH. In addition, Atom feed clients
> > are ubiquitous, there are a wide variety of Atom tools, and, of
> > course, Atom has far, far larger adoption than OAI-PMH.
>
> ...but then it does something rather different. Unless we completely
> overturn the way the Registry has worked, we need both full and
> incremental harvesting, and I can't see how either is possible with
> Atom (where the originating server determines what records it puts
> into its feed, and the harvester has no way of selecting "all",
> "yesterday's", "last week's", or whatever -- right?)
Atom supports this by requiring an <updated> element. Fetching things
by time is a principal use case of Atom. So I am confused by your
statement.
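
To make that concrete, here is a minimal sketch of an incremental
harvest driven by <updated>. The feed content, the identifiers, and the
function names are made up for illustration; this is not anything a
registry currently serves. It only filters on the client side, and it
assumes timestamps in the plain "Z"-suffixed form shown.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

# Namespace defined by the Atom syndication format.
ATOM = "{http://www.w3.org/2005/Atom}"

# Hypothetical feed: ids, titles, and dates are invented for this sketch.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example registry feed</title>
  <id>urn:example:registry</id>
  <updated>2016-11-04T12:00:00Z</updated>
  <entry>
    <id>ivo://example/resource-a</id>
    <title>Resource A</title>
    <updated>2016-10-20T08:30:00Z</updated>
  </entry>
  <entry>
    <id>ivo://example/resource-b</id>
    <title>Resource B</title>
    <updated>2016-11-04T12:00:00Z</updated>
  </entry>
</feed>
"""


def parse_timestamp(text):
    """Parse a 'Z'-suffixed timestamp into a timezone-aware datetime."""
    return datetime.fromisoformat(text.strip().replace("Z", "+00:00"))


def entries_updated_since(feed_xml, since):
    """Return the ids of entries whose <updated> is at or after `since`."""
    root = ET.fromstring(feed_xml)
    ids = []
    for entry in root.findall(ATOM + "entry"):
        updated = parse_timestamp(entry.findtext(ATOM + "updated"))
        if updated >= since:
            ids.append(entry.findtext(ATOM + "id"))
    return ids


# "Everything changed since 1 November" picks up only the newer record.
cutoff = datetime(2016, 11, 1, tzinfo=timezone.utc)
recent = entries_updated_since(SAMPLE_FEED, cutoff)
```

A full harvest is the same call with a cutoff older than every record.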
> From your other, Fri, 28 Oct 2016 07:37:35 -0700 (PDT), mail:
>
> > Harvesting Vizier's records takes more than a day. That does not fit
> > my definition of "works well". IRSA's implementation is also
>
> Nah, not at all. The whole VO Registry, including VizieR, can
> be fully re-harvested in a good deal less than an hour (ok, it takes a bit
> longer if you don't use sets=ivo_managed, but few components would
> have a reason to do that). Incremental harvesting takes minutes at
> worst. As a registry operator (both ends, publishing and harvesting)
> I'd maintain that it does work well.
Theresa Dower told me in Stellenbosch that it takes a day to fully
harvest VizieR. As another data point, we are doing some iterations
on our registry right now, and it takes 20 minutes for the RofR to
create a report.
I do not doubt that these times could be improved. Alberto
Accomazzi's experiences with arXiv show that it can be done. I am
sure that I could make a service that returns null updates in less
than 100 milliseconds. I would rather spend that effort on something
easier for publishing operators to implement [1].
> Anyway, we can talk here all day: It seems, Walter, that OAI-PMH is
> an itch that mainly you feel.
Apparently :(
Cheers,
Walter Landry
[1] As a not-so-random sample, there is exactly one package in Debian
stable that deals with OAI-PMH. In contrast, there are more than
50 packages that deal with Atom.