Scope of registry

Tony Linde ael at star.le.ac.uk
Thu Feb 6 07:47:45 PST 2003


> We surely need to define what services the Registry should provide - 

This is key. In particular, we should define the boundary between a
registry and the archive/dataset/catalog/...  resources; between the
services that a registry provides and those of the data etc resources.

A registry should not be seen as a facade to the resources but as a way
of looking up those resources that will/may have the answer you want. You
will then invoke the appropriate resource service to perform your task
(e.g. querying a catalog for objects meeting some criteria).
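
A rough sketch of that split, for illustration only: the class names,
the lookup() method and the example.org URLs are all invented here and
are not any agreed IVOA interface. The registry only finds candidate
resources; the caller then talks to each resource's own service.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResourceEntry:
    """What the registry knows about a resource: just enough to find it."""
    name: str
    resource_type: str   # e.g. 'catalog', 'archive'
    service_url: str     # where the resource's own service lives


class Registry:
    """Hypothetical registry: looks resources up, never queries them."""

    def __init__(self, entries):
        self._entries = list(entries)

    def lookup(self, resource_type):
        # Return the resources that will/may have the answer.
        return [e for e in self._entries if e.resource_type == resource_type]


registry = Registry([
    ResourceEntry("Example Catalog", "catalog", "http://example.org/catalog"),
    ResourceEntry("Example Archive", "archive", "http://example.org/archive"),
])

candidates = registry.lookup("catalog")
# The caller, not the registry, would now invoke each candidate's own
# service (at entry.service_url) to run the actual catalog query.
```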

To do this we need to define, within IVOA, a *standard* list of resource
types and the *standard* metadata associated with each type of resource.
So, each resource of type 'catalog' which is listed will have a set of
metadata meeting some catalog-metadata schema, each resource of type
'archive' which is listed will have a set of metadata meeting some
archive-metadata schema, etc. (This may fit in with Roy's set of
different OAI formats - but I've not fully understood that).
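
To make the idea of per-type metadata schemas concrete, here is a
minimal sketch. The field names below are placeholders made up for the
example, not a proposed standard; the point is only that each resource
type maps to a required set of metadata that a listing can be checked
against.

```python
# Hypothetical minimal schemas: resource type -> required metadata keys.
# The field names are placeholders, not an agreed standard.
MINIMAL_SCHEMAS = {
    "catalog": {"title", "publisher", "service_url", "sky_coverage"},
    "archive": {"title", "publisher", "service_url", "instruments"},
}


def missing_metadata(resource_type, metadata):
    """Return the required keys that `metadata` lacks for this type."""
    if resource_type not in MINIMAL_SCHEMAS:
        raise ValueError(f"unknown resource type: {resource_type!r}")
    required = MINIMAL_SCHEMAS[resource_type]
    return sorted(required - set(metadata))


entry = {"title": "Example Catalog", "publisher": "Example Observatory",
         "service_url": "http://example.org/catalog"}
gaps = missing_metadata("catalog", entry)
```

Extra keys beyond the required set are deliberately ignored, so a
registry holding more metadata than the minimum would still pass.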

These metadata standards will be a *minimal* set for each resource type.
A specific registry may hold more metadata on specific resource types -
thus having finer granularity than the standard. Each resource will
therefore also have to be able to answer some questions itself ('what
type of data do you hold?', 'do you cover X patch of sky?', ...) to cope
with the case where it has been found by someone using a limited
registry. The extended metadata is likely to be used by service
resources offered by the same people who developed that registry, but
would also be available to other service providers.

Of course, a registry may 'pretend' to offer an extended set of metadata
on a resource type but, in implementation, will query the resource
directly when questioned about the extended metadata. This would only
work if, say, it was co-located with the resources and only listed those
resources. 
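
One way such a 'pretending' registry might look, as a sketch under
stated assumptions: the class, its methods and the stand-in resource
function are all hypothetical, and a real registry would call the
resource's actual service rather than a local function.

```python
class CoLocatedRegistry:
    """Hypothetical registry that stores only the minimal metadata and
    forwards questions about extended metadata to the resource itself."""

    def __init__(self):
        self._minimal = {}     # resource name -> stored minimal metadata
        self._resources = {}   # resource name -> callable answering a key

    def register(self, name, minimal_metadata, ask_resource):
        self._minimal[name] = dict(minimal_metadata)
        self._resources[name] = ask_resource

    def get(self, name, key):
        stored = self._minimal[name]
        if key in stored:
            return stored[key]             # answered from the registry
        return self._resources[name](key)  # answered by the resource


# Stand-in for a real resource service; the answer is made-up example data.
def example_catalog_service(key):
    answers = {"covers_north_pole": True}
    return answers.get(key)


registry = CoLocatedRegistry()
registry.register("Example Catalog", {"title": "Example Catalog"},
                  example_catalog_service)
```

From the caller's side, registry.get(name, key) looks the same whether
the answer was stored or fetched, which is the 'pretence' in question.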

But all implementation details are up to the individual registry
developers.

Perhaps this is where some confusion comes in - some of us are talking
about the definition of IVOA standards and others about the
implementation of the US-VO specific registry.

Cheers,
Tony.

On Thu, 6 Feb 2003 15:05:51 -0000 , "Giaretta, DL (David) "
<D.L.Giaretta at rl.ac.uk> said:
> I agree with Tom up to a point.
> 
> We surely need to define what services the Registry should provide -
> and recognise that we don't need to make an exhaustive list now because
> we should be able to extend it.
> 
> All registries need to sign up to this set of services - or at least be
> able to specify what sub-set it adheres to. If for any of its advertised
> services it does not know the answer then it should "know an
> archive/registry who does", and delegate the provision of the answer
> to it.
> 
> Any particular registry could adopt a caching policy with greater or
> lesser intelligence.
> For example if it knows that a dataset is not being extended i.e. is
> frozen because no new observations are being made, then it can safely
> cache info about it for a long time. We can no doubt come up with a
> list of archive metadata which helps with this sort of optimisation.
> Of course some information may not be cacheable e.g. compute power or
> network bandwidth available right now, which will vary from moment to
> moment - here the answer could be delegated to an appropriate LDAP
> server.
> 
> Where I differ from Tom is the view on the ease of implementation. He is
> almost certainly right in that an implementation _could_ be done very
> easily. However in order to obtain acceptable performance I would guess
> that a large number of optimisations would be utilised, and this would
> bring us to Clive's view in that any project that wants a really really
> popular registry would need to work hard - and here an analogy with
> search engines and Google would be useful. Which project will end up
> providing the Google of the IVOA?
> 
> The implication from this is that we should define the initial set of
> services we need and the
> information (data and metadata) needed from archives, and perhaps as a
> separate activity 
> define archive metadata which will help in optimisations.
> 
> Cheers
> 
> ...David
> 
> -----Original Message-----
> From: Tom McGlynn [mailto:Thomas.A.McGlynn at nasa.gov]
> Sent: 06 February 2003 14:32
> To: Clive Page
> Cc: Arnold Rots; registry at ivoa.net; metadata at us-vo.org
> Subject: Re: Scope of registry
> 
> 
> One thing that has come to me in thinking about this
> issue is that there is potentially a difference between
> the granularity of a registry and that of a registry service.
> 
> Consider a registry as being
> a table having only high-level (low-granularity) information
> about services.  The services themselves provide some protocol
> that gives fine-grained information.  To give a concrete
> example, the registry might contain a reference to the
> Chandra archive, the NTT archive, and so forth.  Part of the
> information the registry has about the Chandra archive
> is its coverage service, which a user can invoke to get
> fine grained information about the position of Chandra observations.
> In some sense we might think of this as a registry hierarchy:
> an observation catalog is a 'registry' of the observations
> described.
> 
> However, there is no reason why a registry service that a user
> (or other software) might invoke, couldn't take advantage of
> both the registry and the coverage services.  I could see this working
> something like DNS services on the Web.  When a domain name server
> is queried about some name it goes and queries a chain of services
> until it resolves the name.  When it's queried a second time for
> the same name, it uses a local cache.  Users tend to communicate
> with only a subset of internet nodes, so the relatively small local
> cache gives a local user almost the same benefit as if it had the
> full listing of all X billion web addresses.
> 
> Similarly, when a registry service
> is queried about observations in a given region for the first
> time, it looks at its coarse information to determine possible services
> and, based upon that and other user criteria, it goes off
> and gets fine-grained information from the appropriate services.
> Since this information doesn't change rapidly, and people tend
> to be interested in the same regions of the sky, the registry
> service caches the fine-grained information for a period of
> time of the order of hours or days, perhaps longer for
> unchanging data sets.
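
[A minimal sketch of the time-limited caching described in the quoted
paragraph above. Everything named here is an assumption made for the
illustration: the class, the fetch_coverage callable and the ttl_seconds
parameter are hypothetical and not part of any proposed protocol.]

```python
import time


class CachingRegistryService:
    """Hypothetical registry service that caches fine-grained answers."""

    def __init__(self, fetch_coverage, ttl_seconds, clock=time.monotonic):
        self._fetch = fetch_coverage   # callable: (service, region) -> answer
        self._ttl = ttl_seconds
        self._clock = clock            # injectable so expiry can be tested
        self._cache = {}               # (service, region) -> (expiry, answer)

    def coverage(self, service, region):
        key = (service, region)
        now = self._clock()
        hit = self._cache.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]                      # still fresh: no remote call
        answer = self._fetch(service, region)  # ask the coverage service
        self._cache[key] = (now + self._ttl, answer)
        return answer
```

[A longer ttl_seconds could be used for data sets known to be frozen.]
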
> 
> We don't get static coupling of the various services with
> the registry, which we'd have if the registry itself contained
> the fine grained information, but the user is likely to get
> most of the speed advantage of having the data all in one place.
> I'd envisage the particular case of registries and coverage
> services as being a specialization of some more generic support
> for registry hierarchies.
> 
> If we can agree upon a standard protocol by which an archive
> gives the detailed information, then I think this approach
> will be easier for all sides: the data providers who provide only
> overview information to the registry, the registry
> builders who don't need to worry about synchronization of data
> and the users who get the latest information soonest.  I don't
> even think it will be very hard to implement in the registry services.
> 
> 	Tom McGlynn
> 
> Clive Page wrote:
> > On Wed, 5 Feb 2003, Arnold Rots wrote:
> > 
> > 
> >>So, it becomes a matter of degree.
> > 
> > 
> > Yes.  And I think the question of the granularity of information in the
> > Registry is a matter of debate.  It could be that different registries
> > have different policies.  The AstroGrid project has decided in principle
> > that a fine-grained registry is something to aim for (but maybe not in
> > version 1.0).  Others may have different aims.
> > 
> > It is clear that any query can be answered more definitively by firing
> > actual queries to each resource around the world, but the number of such
> > resources is getting quite large.  And we already know what happens when
> > you try that even on a limited scale: just use Astrobrowse to query the
> > set of sites they currently have listed and you find that even after a
> > minute or so not all the replies have come in.   A fine-grained registry
> > could, in principle, reduce the number of queries you need to send out
> > by quite a considerable factor (few observatories have observed more than
> > a tiny fraction of the sky, unless they have done systematic surveys).  I
> > think that would be nice to have, but I fully accept that it is not easy
> > to provide, so must be a matter for debate.  At least the debate has now
> > started.
> > 
> > 
> 
> 
__
Tony Linde                       Phone:  +44 (0)116 223 1292
AstroGrid Project Manager        Fax:    +44 (0)116 252 3311
Dept of Physics & Astronomy      Mobile: +44 (0)7753 603356
University of Leicester          Email:  ael at star.le.ac.uk
Leicester, UK   LE1 7RH          Web:    http://www.astrogrid.org
