Featherweight Publishing Registries
Walter Landry
wlandry at caltech.edu
Fri Oct 21 21:28:07 CEST 2016
Sarah Weissman <sweissman at stsci.edu> wrote:
> Do you know why the script is so slow? Is it because of an implementation
> flaw or is it because of the self-imposed Retry-after wait period that is
> built into the protocol? Or if you are storing all of your records as
> files on disk is it because of an IO bottleneck? I agree that the protocol
> is complicated, but it seems like there is no reason that transferring
> data via OAI-PMH should be much slower than any other protocol for passing
> data as XML records.
I believe it is slow because, for each request, the Perl service has
to read all of the XML files. But I have not dug into the
implementation because I swore off Perl long ago.
> If you are proposing to switch to a model where each registry returns a
> feed of all its entries, without operations for subselecting based on
> dates for example, then I would suggest looking into using Atom
> syndication https://validator.w3.org/feed/docs/atom.html, which seems to
> be designed for exactly this purpose and is already an accepted and widely
> used standard on the web.
That is a little heavier than what I am suggesting. It requires
'title', 'updated' and 'id' fields for each entry and for the
document as a whole. A minimal Atom feed would look like
<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>NASA/IPAC IRSA Publishing Registry</title>
  <link href="http://irsa.ipac.caltech.edu/"/>
  <updated>2003-12-13T18:30:02Z</updated>
  <id>urn:uuid:60a76c80-d399-11d9-b93C-0003939e0af6</id>
  <entry>
    <title></title>
    <link href="http://irsa.ipac.caltech.edu/registry/2MASS/Catalog/CalMPSIT"/>
    <id>urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a</id>
    <updated>2003-12-13T18:30:02Z</updated>
  </entry>
</feed>
I cannot say that I would be ecstatic about making sure that I do not
mess up the 'id' and 'updated' elements, but it would sure beat the
current situation.
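
One way the 'id' and 'updated' bookkeeping might be automated is
sketched below in Python. This is only an illustration under stated
assumptions: the function names (atom_id, atom_time, build_feed) are
hypothetical, the ids are derived from each record's URL with a
name-based UUID so they stay stable between runs, and the timestamps
are assumed to be Unix modification times of the record files. None
of this reflects existing registry code.

```python
import uuid
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

ATOM_NS = "http://www.w3.org/2005/Atom"
# Serialize Atom elements with a default namespace (no prefix).
ET.register_namespace("", ATOM_NS)


def atom_id(url):
    """Stable 'urn:uuid:...' id: the same URL always maps to the same id."""
    return uuid.uuid5(uuid.NAMESPACE_URL, url).urn


def atom_time(timestamp):
    """Format a Unix timestamp as the UTC date-time string Atom expects."""
    moment = datetime.fromtimestamp(timestamp, tz=timezone.utc)
    return moment.strftime("%Y-%m-%dT%H:%M:%SZ")


def _child(parent, name, text=None, **attrs):
    """Append a namespaced Atom child element to parent."""
    element = ET.SubElement(parent, f"{{{ATOM_NS}}}{name}", attrs)
    element.text = text
    return element


def build_feed(title, home_url, records):
    """Build an Atom feed from (title, url, mtime) tuples.

    Assumes records is non-empty; the feed-level 'updated' value is
    taken to be the newest record timestamp.
    """
    records = list(records)
    feed = ET.Element(f"{{{ATOM_NS}}}feed")
    _child(feed, "title", title)
    _child(feed, "link", href=home_url)
    _child(feed, "updated", atom_time(max(mtime for _, _, mtime in records)))
    _child(feed, "id", atom_id(home_url))
    for record_title, url, mtime in records:
        entry = _child(feed, "entry")
        _child(entry, "title", record_title)
        _child(entry, "link", href=url)
        _child(entry, "id", atom_id(url))
        _child(entry, "updated", atom_time(mtime))
    return feed


if __name__ == "__main__":
    feed = build_feed(
        "NASA/IPAC IRSA Publishing Registry",
        "http://irsa.ipac.caltech.edu/",
        [("CalMPSIT",
          "http://irsa.ipac.caltech.edu/registry/2MASS/Catalog/CalMPSIT",
          1071340202)],
    )
    print(ET.tostring(feed, encoding="unicode"))
```

Deriving the id from the URL means nothing has to be stored between
runs, at the cost of the id changing if a record's URL ever changes.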
Cheers,
Walter Landry