Scope of registry
Giaretta, DL (David)
D.L.Giaretta at rl.ac.uk
Thu Feb 6 07:05:51 PST 2003
I agree with Tom up to a point.
We surely need to define what services the Registry should provide -
and recognise that we don't need to make an exhaustive list now because we
should be able to extend it.
All registries need to sign up to this set of services - or at least be
able to specify which sub-set they adhere to. If a registry does not know
the answer for one of its advertised services, then it should "know an
archive/registry who does", and delegate the provision of the answer to it.
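
A minimal sketch of that delegation idea, in Python. Everything here is
hypothetical and only for illustration: the class name `Registry`, the
`ask` method, the service name "coverage" and the toy answers are my
assumptions, not an agreed IVOA interface.

```python
class Registry:
    """Hypothetical registry: answers what it can, delegates the rest."""

    def __init__(self, name, services, delegates=()):
        self.name = name
        self.services = dict(services)    # service name -> callable(query)
        self.delegates = list(delegates)  # registries/archives it "knows"

    def advertised(self):
        # The sub-set of the agreed services this registry adheres to.
        return set(self.services)

    def ask(self, service, query, _seen=None):
        seen = _seen if _seen is not None else set()
        if self.name in seen:             # guard against delegation loops
            return None
        seen.add(self.name)
        handler = self.services.get(service)
        if handler is not None:
            answer = handler(query)       # None means "I don't know"
            if answer is not None:
                return answer
        for other in self.delegates:      # "know a registry who does"
            answer = other.ask(service, query, seen)
            if answer is not None:
                return answer
        return None


# Toy data: an archive that knows one region, and a front registry
# that knows nothing itself but knows the archive.
archive = Registry("archive", {"coverage": lambda q: ["obs-1"] if q == "M31" else None})
front = Registry("front", {"coverage": lambda q: None}, delegates=[archive])
print(front.ask("coverage", "M31"))
```

The caller only ever talks to `front`; whether the answer came from a
local table or from a delegate is invisible to it.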
Any particular registry could adopt a caching policy with greater or
lesser intelligence. For example, if it knows that a dataset is not being
extended, i.e. is frozen because no new observations are being made, then
it can safely cache info about it for a long time. We can no doubt come up
with a list of archive metadata which helps with this sort of optimisation.
Of course some information may not be cacheable, e.g. compute power or
network bandwidth available right now, which will vary from moment to
moment - here the answer could be delegated to an appropriate LDAP server.
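
To make the caching policy concrete, here is a rough Python sketch. The
metadata fields (`frozen`, `volatile`) and the actual lifetimes are
assumptions of mine, chosen only to illustrate the idea; the real list of
archive metadata is still to be defined.

```python
from dataclasses import dataclass


@dataclass
class ArchiveMetadata:
    """Hypothetical metadata an archive might publish to help caching."""
    frozen: bool = False    # no new observations are being made
    volatile: bool = False  # e.g. compute power or bandwidth "right now"


# Illustrative lifetimes only; real values would be a policy decision.
NEVER_CACHE = 0
DEFAULT_TTL = 60 * 60            # one hour
FROZEN_TTL = 30 * 24 * 60 * 60   # thirty days


def cache_ttl_seconds(meta: ArchiveMetadata) -> int:
    """Pick how long a registry may cache information about a resource."""
    if meta.volatile:
        return NEVER_CACHE   # always delegate, e.g. to a live status server
    if meta.frozen:
        return FROZEN_TTL    # dataset will not change, cache for a long time
    return DEFAULT_TTL


print(cache_ttl_seconds(ArchiveMetadata(frozen=True)))
```

A registry with "lesser intelligence" could simply ignore the metadata and
use the default lifetime everywhere.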
Where I differ from Tom is the view on the ease of implementation. He is
almost certainly right in that an implementation _could_ be done very
easily. However, in order to obtain acceptable performance I would guess
that a large number of optimisations would be utilised, and this would
bring us to Clive's view, in that any project that wants a really, really
popular registry would need to work hard - and here an analogy with search
engines and Google would be useful. Which project will end up providing
the Google of the IVOA?
The implication from this is that we should define the initial set of
services we need and the information (data and metadata) needed from
archives, and perhaps as a separate activity define archive metadata which
will help in optimisations.
Cheers
...David
-----Original Message-----
From: Tom McGlynn [mailto:Thomas.A.McGlynn at nasa.gov]
Sent: 06 February 2003 14:32
To: Clive Page
Cc: Arnold Rots; registry at ivoa.net; metadata at us-vo.org
Subject: Re: Scope of registry
One thing that has come to me in thinking about this
issue is that there is potentially a difference between
the granularity of a registry and that of a registry service.
Consider a registry as being
a table having only high-level (low-granularity) information
about services. The services themselves provide some protocol
that gives fine-grained information. To give a concrete
example, the registry might contain a reference to the
Chandra archive, the NTT archive, and so forth. Part of the
information the registry has about the Chandra archive
is its coverage service, which a user can invoke to get
fine grained information about the position of Chandra observations.
In some sense we might think of this as a registry hierarchy:
an observation catalog is a 'registry' of the observations
described.
However, there is no reason why a registry service that a user
(or other software) might invoke, couldn't take advantage of
both the registry and the coverage services. I could see this working
something like DNS services on the Web. When a domain name server
is queried about some name it goes and queries a chain of services
until it resolves the name. When it's queried a second time for
the same name, it uses a local cache. Users tend to communicate
with only a subset of internet nodes, so the relatively small local
cache gives a local user almost the same benefit as if it had the
full listing of all X billion web addresses.
Similarly, when a registry service
is queried about observations in a given region, the first
time it looks at its coarse information to determine possible services,
and based upon that and other user criteria it goes off
and gets fine-grained information from the appropriate services.
Since this information doesn't change rapidly, and people tend
to be interested in the same regions of the sky, the registry
service caches the fine-grained information for a period of
time of the order of hours or days, perhaps longer for
unchanging data sets.
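
A rough Python sketch of what I have in mind. The class name, the idea of
keying the cache on a region label such as "tile-1", and the one-hour
default lifetime are all assumptions for illustration; the coverage
services here are plain functions standing in for whatever protocol we
agree on.

```python
import time


class CachingRegistryService:
    """Hypothetical registry service: coarse lookup, then cached fine-grained queries."""

    def __init__(self, coarse_entries, ttl_seconds=3600, clock=time.monotonic):
        # coarse_entries: list of (may_cover(region) -> bool,
        #                          coverage_service(region) -> list)
        self.coarse_entries = list(coarse_entries)
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        self.cache = {}  # region -> (expiry time, fine-grained results)

    def query(self, region):
        now = self.clock()
        hit = self.cache.get(region)
        if hit is not None and hit[0] > now:
            return list(hit[1])           # answered from the local cache
        results = []
        for may_cover, coverage_service in self.coarse_entries:
            if may_cover(region):         # coarse, registry-level filter
                results.extend(coverage_service(region))  # fine-grained call
        self.cache[region] = (now + self.ttl_seconds, results)
        return list(results)


# Toy usage: one archive whose coarse entry says it may cover "tile-1".
def toy_coverage(region):
    return ["obs-A", "obs-B"]

service = CachingRegistryService([(lambda r: r == "tile-1", toy_coverage)])
print(service.query("tile-1"))  # first call goes out to the coverage service
print(service.query("tile-1"))  # second call is served from the cache
```

The registry table itself never holds the fine-grained data; it is only
held temporarily in the service's cache and is refetched once it expires.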
We don't get static coupling of the various services with
the registry, which we'd have if the registry itself contained
the fine grained information, but the user is likely to get
most of the speed advantage of having the data all in one place.
I'd envisage the particular case of registries and coverage
services as being a specialization of some more generic support
for registry hierarchies.
If we can agree upon a standard protocol by which an archive
gives the detailed information, then I think this approach
will be easier for all sides: the data providers who provide only
overview information to the registry, the registry
builders who don't need to worry about synchronization of data
and the users who get the latest information soonest. I don't
even think it will be very hard to implement in the registry services.
Tom McGlynn
Clive Page wrote:
> On Wed, 5 Feb 2003, Arnold Rots wrote:
>
>
>>So, it becomes a matter of degree.
>
>
> Yes. And I think the question of the granularity of information in the
> Registry is a matter of debate. It could be that different registries
> have different policies. The AstroGrid project has decided in principle
> that a fine-grained registry is something to aim for (but maybe not in
> version 1.0). Others may have different aims.
>
> It is clear that any query can be answered more definitively by firing
> actual queries to each resource around the world, but the number of such
> resources is getting quite large. And we already know what happens when
> you try that even on a limited scale: just use Astrobrowse to query the
> set of sites they currently have listed and you find that even after a
> minute or so not all the replies have come in. A fine-grained registry
> could, in principle, reduce the number of queries you need to send out
> by quite a considerable factor (few observatories have observed more than
> a tiny fraction of the sky, unless they have done systematic surveys). I
> think that would be nice to have, but I fully accept that it is not easy
> to provide, so must be a matter for debate. At least the debate has now
> started.
>
>