Registry Interfaces 1.1 RFC

Markus Demleitner msdemlei at ari.uni-heidelberg.de
Wed Feb 8 14:55:03 CET 2017


Hi Walter,

On Tue, Feb 07, 2017 at 10:39:23AM -0800, Walter Landry wrote:
> Theresa Dower <dower at stsci.edu> wrote:
> I guess I missed this last time.
> 
>   In its Identify response, an OAI-PMH-compliant registry must declare
>   its support for deleted records. This can be one of
> 
>   no
>     - the registry will never notify harvesters of records that have
      become unavailable. In an environment like the VO, where
      searchable registries frequently harvest publishing registries,
>       this is severely discouraged, as without deleted records,
>       harvesters need to perform full harvests every time or risk
>       delivering stale records.
> 
> The total amount of data that is being transferred is tiny.  Is there
> really any need to discourage full harvests?  Retaining deleted
> records increases implementation effort.

Even if the amount of data isn't that great (I'd not call some 100s
of megabytes "tiny", though), processing and ingesting this stuff is
a non-trivial effort, so sure, even at the current size of the VO,
it's great if we *can* do incremental harvests.

The passage you quote says, however, that you don't *have to*
implement them if you don't mind the consequences; for a smallish
registry, that might be a perfectly valid decision.

I grant you that the language could be toned down a little, perhaps
to "...frequently harvest publishing registries, this should be
avoided, as it leaves harvesting registries the choice of doing
frequent full harvests or risking serving stale records."

So, while I'm usually the first to speak out against optional
features, in this particular instance I feel there is a good case
for optionalism: If "safe" incrementals are worth it for you,
implement them, otherwise don't.  Your clients can easily figure out
what you do.
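
To illustrate how a client could figure that out: OAI-PMH Identify
responses carry a deletedRecord element whose value is one of "no",
"transient", or "persistent".  The following is only a minimal sketch
in Python; the function names and the sample document are made up for
illustration, and treating "transient" as good enough for incremental
harvesting is a simplifying assumption, not something the standard
promises.

```python
import xml.etree.ElementTree as ET

# Namespace of OAI-PMH 2.0 response documents.
OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"


def deleted_record_policy(identify_xml: str) -> str:
    """Return the deletedRecord value declared in an Identify response."""
    root = ET.fromstring(identify_xml)
    node = root.find(f"{OAI_NS}Identify/{OAI_NS}deletedRecord")
    if node is None or not (node.text or "").strip():
        raise ValueError("Identify response lacks a deletedRecord element")
    return node.text.strip()


def incrementals_are_safe(policy: str) -> bool:
    """Hypothetical helper: can we rely on incremental harvests?

    Assumption: anything other than "no" is treated as safe.  A stricter
    harvester might only accept "persistent".
    """
    return policy != "no"


# Made-up sample response, trimmed to the elements used here.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <Identify>
    <repositoryName>Example Registry</repositoryName>
    <deletedRecord>no</deletedRecord>
  </Identify>
</OAI-PMH>"""

if __name__ == "__main__":
    policy = deleted_record_policy(SAMPLE)
    print(policy, incrementals_are_safe(policy))
```

A harvester would fetch the real document with verb=Identify from the
registry's OAI-PMH endpoint instead of using SAMPLE.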

In practice, I think all searchable registries do full re-harvests
regularly (me, every 6 months), by the way, so even if you don't
implement incrementals, records you drop will eventually disappear.
I could be swayed to lower that interval for registries without
deleted records.
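
Such a policy could be sketched roughly as follows.  This is an
illustration only: the 182 days stand in for the "every 6 months"
above, the 30 days for registries without deleted records are a
made-up value, and the names are hypothetical.

```python
from datetime import datetime, timedelta

# Interval between full re-harvests, keyed by the registry's declared
# deletedRecord policy.  The "no" value is an assumed example, not an
# agreed-upon number.
FULL_HARVEST_INTERVAL = {
    "persistent": timedelta(days=182),  # roughly 6 months
    "transient": timedelta(days=182),
    "no": timedelta(days=30),           # hypothetical shorter interval
}


def full_harvest_due(policy: str, last_full: datetime, now: datetime) -> bool:
    """Return True if a full re-harvest of the registry is due."""
    return now - last_full >= FULL_HARVEST_INTERVAL[policy]


if __name__ == "__main__":
    last = datetime(2017, 1, 1)
    now = datetime(2017, 2, 8)
    print(full_harvest_due("no", last, now))
    print(full_harvest_due("persistent", last, now))
```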

        -- Markus

