Registry harvesting frequency

Sebastien Derriere sebastien.derriere at astro.unistra.fr
Tue May 29 06:08:28 PDT 2012


   Hi Petr,

   I take the opportunity of your message to raise a possible problem
that I discussed at the last interop, and that would get even worse
if harvesting frequency increased too much.
   With the OAI-PMH protocol, you can ask for resources "from=" a given
timestamp. A registry harvesting every day will usually ask for resources
"from=" present-harvesting-time-minus-24h, with a small extra margin
to account for possible clock differences.
   A problem could appear if a resource's metadata is updated, hence its
"updated" attribute is changed, but this change is not committed (for some
reason) to the registry OAI interface until some later time. Another
registry might then simply miss this resource when harvesting, unless a
complete harvest (with from=1900-01-01 or so) is done periodically
to check overall consistency.
   This is probably an error on the part of the publishing registry, in that
the "updated" value is not automatically synchronized with the publishing
interface, but I suspect this might happen, so it's better if people are
aware of it.
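
   To make the "from=" arithmetic concrete, here is a minimal sketch in
Python of how a harvester might build its request. It is only an
illustration: the function name, the endpoint URL, the 5-minute margin and
the choice of the "oai_dc" metadata prefix are assumptions of mine, not
what any particular registry does. The "verb", "metadataPrefix" and "from"
arguments are standard OAI-PMH. Leaving out "from" asks for everything,
which is one way to do the periodic complete harvest.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlencode


def incremental_harvest_url(base_url, now,
                            interval=timedelta(hours=24),
                            margin=timedelta(minutes=5),
                            metadata_prefix="oai_dc",
                            full=False):
    """Build an OAI-PMH ListRecords URL (hypothetical helper).

    `now` must be a timezone-aware datetime. With full=True the "from"
    argument is omitted, so the whole registry content is requested.
    """
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if not full:
        # Harvest window: one interval back, plus a margin for clock skew.
        since = (now - interval - margin).astimezone(timezone.utc)
        params["from"] = since.strftime("%Y-%m-%dT%H:%M:%SZ")
    return base_url + "?" + urlencode(params)


# Assumed endpoint, for illustration only.
now = datetime(2012, 5, 29, 13, 8, 28, tzinfo=timezone.utc)
print(incremental_harvest_url("http://registry.example.org/oai", now))
print(incremental_harvest_url("http://registry.example.org/oai", now, full=True))
```

   Note that a wider margin does not help with the late-commit case above:
only a full harvest (or a publisher that stamps records at commit time)
catches a record whose "updated" value is older than the window.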

   A related situation is the following:

Pub Registry A creates a new resource at time 13:00:00.
Full Registry B harvests every hour, accounting for 1 minute
clock difference.
Hence, at 13:30:00, B will harvest from A for all resources created
or updated since 12:29:00.
But a temporary network error causes the harvesting to fail.
At 14:30:00, B harvests from A resources from 13:29:00. This time it
works but the resource is not within the time range. And registry B
might never see the new resource, if error handling is not done
carefully...
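
The scenario above can be replayed in a few lines of Python. This is a
sketch under my own assumptions (the function names and the idea of
storing a "last successful harvest" checkpoint are illustrative, not a
description of any existing registry code). It compares a harvester that
always asks "from=" now-minus-interval with one that asks "from=" the time
of its last successful harvest.

```python
from datetime import datetime, timedelta

MARGIN = timedelta(minutes=1)      # allowance for clock differences
INTERVAL = timedelta(hours=1)      # B harvests every hour


def simulate(use_checkpoint):
    """Return True if registry B ends up seeing A's new resource."""
    created = datetime(2012, 5, 29, 13, 0, 0)       # A creates the resource
    last_success = datetime(2012, 5, 29, 12, 30, 0)  # B's previous good harvest
    # (harvest time, did the network work?)
    runs = [(datetime(2012, 5, 29, 13, 30, 0), False),
            (datetime(2012, 5, 29, 14, 30, 0), True)]
    seen = False
    for now, network_ok in runs:
        if use_checkpoint:
            since = last_success - MARGIN   # based on last successful harvest
        else:
            since = now - INTERVAL - MARGIN  # based on the current time only
        if not network_ok:
            continue                         # failed: checkpoint is NOT advanced
        if created >= since:
            seen = True
        last_success = now
    return seen


print("time-based window sees the resource:", simulate(use_checkpoint=False))
print("checkpoint-based window sees it:   ", simulate(use_checkpoint=True))
```

With the time-based window the 14:30:00 harvest starts at 13:29:00 and the
resource is missed; with the checkpoint it starts again at 12:29:00 and the
resource is picked up.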

Sebastien

On 05/29/2012 02:25 PM, Mark Taylor wrote:
> My opinion: harvesting once a day is quite sufficient, once a week is
> not too bad.  Yes it would be even nicer if all changes showed up in
> all registries instantly, and if particular registry implementations
> find it convenient to harvest multiple times a day let them do so,
> but it doesn't seem to me a serious problem if publishers have to
> wait a bit for resource propagation.  Given the pull-model
> of harvesting it is not feasible that propagation is instant, so
> publishers who check a foreign registry as soon as they have
> published on a local one are bound to be disappointed anyway
> (and I certainly don't think this issue warrants a major architectural
> change).  On the timescale of preparing and registering a data service,
> an extra day or two, even a week, is not a long time.
>
> I am neither a data publisher nor a registry operator, so I don't have
> any special insight into this, but in my opinion it is not one of the
> more pressing issues facing VO registries.
>
> Mark
>
> --
> Mark Taylor   Astronomical Programmer   Physics, Bristol University, UK
> m.b.taylor at bris.ac.uk +44-117-928-8776 http://www.star.bris.ac.uk/~mbt/
>
> On Mon, 28 May 2012, Petr Skoda wrote:
>
>>
>>
>> Dear all
>>
>> Thank you very much for the prompt action of Christophe - the record is now
>> the same in all registries seen in VOSPEC and SPLAT. So my personal needs are
>> fulfilled for the moment.
>>
>> However, with a little doubt, I suggest the following for consideration by
>> the whole registry WG:
>>
>> Do you think that even an update every day is enough for the DYNAMIC service
>> on which all the VO functionality depends?
>>
>> If I take the analogy with DNS - after I connect a new computer to the
>> internet, do you think that people would like to wait several hours (or days)
>> before seeing their computer on-line?
>>
>> Why could the server not just check the changes in other registries and
>> update the records at a much shorter interval?
>>
>> I assume the enthusiastic model of VO publishing ;-)
>> When someone goes through filling in the form (and succeeds) to publish the
>> service, he will be eager to see his fresh new service everywhere, to check
>> it immediately in different applications etc ... He will probably need to
>> show the service to other people (with great pride - look at my new archive ;-)
>>
>> And IF he is forced to wait several days (even hours) to see the impact,
>> it is very discouraging and does not help to advertise the VO, in the eyes of
>> advanced astronomers (who will proudly register their small collections), as
>> the modern dynamic global infrastructure.
>>
>> Maybe I am too naive - but in my opinion the current registry handling is
>> insufficient given the role it plays in the success of the whole VO.
>>
>> That said, I would like to encourage people to publish small data sets (mainly
>> spectra from smaller telescopes) in the VO by emphasising the "world-wide
>> visibility" of their observatory - i.e. "if you make your archive VO
>> compatible and register it, in a few minutes all the world is going to query
>> your archive"
>>
>> I think the success of the penetration of the VO into everyday astronomy (our
>> common goal) is made up of small issues with a huge synergistic effect. And
>> the seamless functionality (and ease) of the registration process is one of
>> those.
>>
>> Please reply as well to cc: to skoda at sunstel.asu.cas.cz
>> I am not subscribed to the registry WG list, and so far I was not interested
>> in registries as I supposed they work seamlessly, but to be honest I was
>> shocked by the disclosure of the real state ;-)
>>
>>
>> This is the reply from Christophe:
>>
>>> Dear all
>>
>>> Following Petr's comment on Thursday, I confirm that the Euro-VO Registry is
>>> harvesting the VAO Registry once a week. We've re-run the harvesting
>>> procedure this (European) morning, so Petr's services that he registered in
>>> the VAO Registry should appear in VOSpec now (which is looking at the
>>> Euro-VO Registry).
>>
>>> Harvesting once a week was felt sufficient so far but we're now looking
>>> at the possibility to run it every day
>>
>>
>> *************************************************************************
>> *  Petr Skoda                         Phone : +420-323-649201, ext. 361 *
>> *  Stellar Department                         +420-323-620361           *
>> *  Astronomical Institute AS CR       Fax   : +420-323-620250           *
>> *  251 65 Ondrejov                    e-mail: skoda at sunstel.asu.cas.cz  *
>> *  Czech Republic                                                       *
>> *************************************************************************
>>


-- 

    (((    Sebastien Derriere     sebastien.derriere at astro.unistra.fr
   (. .)   Observatoire de Strasbourg       Phone +33 (0) 368 852 444
  (( v ))  11, rue de l'universite        Telefax +33 (0) 368 852 417
---m-m--- F-67000 Strasbourg  France

