The IVOA in 2006: Assessment and Future Roadmap - Registries

Paul Harrison pharriso at eso.org
Fri Jun 9 02:42:17 PDT 2006


I have been informed that this message, the original in this
particular thread, did not make it to the list.

Paul Harrison
On 07.06.2006, at 17:24, Roy Williams wrote:

> (5) Registry Implementation (Registry): As with many IVOA  
> standards, it is time to finalize the schema for the Registry to  
> enable a clear path to implementation. A new plan has been agreed  
> at the May 2006 Interop, that elaborates the idea of Service into a  
> family: the parent Service contains Interfaces and Capabilities.
> ·       We recommend that this change in registry schema should be  
> the last for a long time – at least the last schema change that  
> would invalidate old records.

Allowing schema extensions has always been part of the philosophy
behind the registry, though, so as to facilitate innovation. Up
until the Victoria meeting this caused problems, as the XML-based
registries could easily support extensions and the RDBMS-based
registries could not. However, an important decision was made
at Victoria that *all* registries should return a complete registry
entry, including any extensions, even if they allow searching only on
elements of the core schema.
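
To make the Victoria decision concrete, here is a minimal sketch of how
an RDBMS-based registry might search only on core elements and still
return the complete entry. The table layout, the element names and the
"ext" namespace are all made up for illustration; none of them come
from the registry schema.

```python
import sqlite3

# Hypothetical record: <identifier> and <title> stand in for core-schema
# elements, <ext:instrument> for an extension the registry cannot search on.
RECORD_XML = (
    '<Resource xmlns:ext="http://example.org/ext">'
    "<identifier>ivo://example.org/spectra</identifier>"
    "<title>Example spectral archive</title>"
    "<ext:instrument>made-up spectrograph</ext:instrument>"
    "</Resource>"
)


def make_registry():
    """Build an in-memory registry: core columns plus the untouched full record."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE resource "
        "(identifier TEXT PRIMARY KEY, title TEXT, full_xml TEXT)"
    )
    conn.execute(
        "INSERT INTO resource VALUES (?, ?, ?)",
        ("ivo://example.org/spectra", "Example spectral archive", RECORD_XML),
    )
    return conn


def search_by_title(conn, pattern):
    """Search on a core column only, but return the complete stored entry."""
    rows = conn.execute(
        "SELECT full_xml FROM resource WHERE title LIKE ?", (pattern,)
    )
    return [row[0] for row in rows]


matches = search_by_title(make_registry(), "%spectral%")
```

The extension element is never indexed, yet it comes back intact with
every match, which is all that the decision asks of such a registry.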

> ·       We also recommend that the registry WG define and reach  
> agreement on the scope of the registry in terms of the variety and  
> granularity of metadata. Registries can cache detailed metadata on  
> a regular basis, or maintain limited (but valid) metadata and fetch  
> detail only when required.

The recent NVO usability report from STScI highlighted the
disappointment of astronomer users with the registry when trying
to use it as a "Google" to find information about a particular
source. This is not (nor should it be) the role of the registry. A
higher-level tool is needed, such as AstroScope or DataScope, that
uses the registry as a starting point to discover resources that
might provide more information about the desired astronomical object.
Registries are there primarily to be an information source for the
co-ordination between other services and resources. End-user astronomers
should not need to be aware that the registry exists....

As a centralized information source the registry can act as a cache
for the fine-grained information required to call the service, which
can greatly speed up the end-user experience.
The typical use case is a user who wants to make a query that
uses some non-mandatory selection criterion on an SSAP service,
for instance. With a coarse-grained registry the user tool then needs to:
1. query the registry to get a list of candidate services
2. query each service *in turn* to determine whether it supports the  
extra selection criterion

This can lead to a significant delay for the user, and obviously does  
not scale well as the number of deployed SSAP services increases. For  
a fine-grained registry the user-tool need only make a single query  
to determine which of the SSAP services can be successfully queried.
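
The difference in round trips can be sketched as follows. This is only
a toy model under my own assumptions: the Service class, the
identifiers and the optional parameter names are hypothetical, and
supports() stands in for a network call to the service itself.

```python
class Service:
    """Stand-in for a deployed service; supports() models a network call."""

    def __init__(self, optional_params):
        self.optional_params = set(optional_params)
        self.metadata_calls = 0  # round trips made to this service

    def supports(self, param):
        self.metadata_calls += 1
        return param in self.optional_params


# Hypothetical deployment: three SSAP services and one SIAP service.
SERVICES = {
    "ivo://example/ssap-a": ("SSAP", Service({"REDSHIFT"})),
    "ivo://example/ssap-b": ("SSAP", Service(set())),
    "ivo://example/ssap-c": ("SSAP", Service({"REDSHIFT", "SNR"})),
    "ivo://example/siap-a": ("SIAP", Service(set())),
}

# The fine-grained registry holds a cached copy of each service's
# optional parameters alongside the core metadata.
FINE_REGISTRY = {
    sid: (std, frozenset(svc.optional_params))
    for sid, (std, svc) in SERVICES.items()
}


def coarse_search(standard, param):
    # Step 1: one registry query for the candidate services.
    candidates = [sid for sid, (std, _) in SERVICES.items() if std == standard]
    # Step 2: one round trip to every candidate, in turn.
    return [sid for sid in candidates if SERVICES[sid][1].supports(param)]


def fine_search(standard, param):
    # A single registry query; no service is contacted.
    return [
        sid
        for sid, (std, params) in FINE_REGISTRY.items()
        if std == standard and param in params
    ]


coarse = coarse_search("SSAP", "REDSHIFT")
round_trips = sum(svc.metadata_calls for _, svc in SERVICES.values())
fine = fine_search("SSAP", "REDSHIFT")
```

Both searches give the same answer, but the coarse-grained one costs one
round trip per candidate SSAP service, so its cost grows with the
number of deployed services.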

This does not necessarily impose a greater burden in creating the
registry entry, as in fact only the core registry metadata need be
entered by hand; the registry itself could then query the service
to fill in the missing metadata automatically. It is part of the
GridWG "standard interface" for web services that a service returns
its own registry metadata, and older standards like SIAP also
return this metadata, in a different format. An implementor of a
service will probably already have to maintain some sort of mapping
between his internal data model and that of the relevant IVOA
standard, so it is very little extra work for him to provide metadata
about this mapping in a standard way to the registry. In addition, if
he changes the facilities offered by his service (e.g. adds extra
parameters to a SIAP query) he need only update the service, and the
registry entry will be updated the next time that the registry does an
automatic update. This provides local curation of the metadata, which
is the most natural place to do it.
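
As a rough sketch of that automatic update, assuming the service can be
asked for its own metadata: the function names, the record fields and
the extra "BAND" parameter are hypothetical, and fake_fetch() stands in
for whatever call the "standard interface" would really define.

```python
def harvest(core_record, fetch_metadata, today):
    """Merge service-supplied detail into the hand-entered core record."""
    detail = fetch_metadata(core_record["access_url"])
    record = dict(detail)
    # Assumption: where both supply a field, the hand-entered core wins.
    record.update(core_record)
    record["updated"] = today  # datestamp of this automatic update
    return record


# Stand-in for a live service whose facilities can change over time.
service_state = {"standard": "SIAP", "params": ["POS", "SIZE"]}


def fake_fetch(access_url):
    # A real registry would call the service at access_url here.
    return {
        "standard": service_state["standard"],
        "params": list(service_state["params"]),
    }


core = {
    "identifier": "ivo://example.org/images",
    "title": "Example image archive",
    "access_url": "http://example.org/siap",
}

first = harvest(core, fake_fetch, "2006-06-09")
service_state["params"].append("BAND")  # implementor adds a parameter
second = harvest(core, fake_fetch, "2006-06-10")
```

Nobody edits the registry entry by hand after the initial registration;
the new parameter appears at the next automatic update.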

The process of "web-crawling" the registered services to check whether
their metadata is up to date can be combined with regular service
validation, which adds further value to the fine-grained registry
approach. There is always the issue that the registry metadata may
not be up to date, but there is a datestamp on the registry entry that
can be used to judge the likelihood of a stale entry. I suspect
that even a single registry would take only of the order of hours to
trawl all of the currently registered services, so this could be done
daily. The problem is also already naturally divided between the
various registry deployments if each checks only the services for
which it is the publishing registry.
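
A minimal sketch of how a registry might pick the entries to re-check,
assuming each record carries its datestamp and the identifier of its
publishing registry. The field names and the one-day threshold are my
own choices for illustration.

```python
from datetime import datetime, timedelta


def due_for_recheck(records, my_registry, now, max_age=timedelta(days=1)):
    """Return the entries this registry publishes that look stale."""
    return [
        r["identifier"]
        for r in records
        if r["publisher"] == my_registry and now - r["updated"] > max_age
    ]


NOW = datetime(2006, 6, 9, 12, 0)
RECORDS = [
    {   # ours, checked three hours ago: still fresh
        "identifier": "ivo://example.org/a",
        "publisher": "ivo://example.org/registry",
        "updated": NOW - timedelta(hours=3),
    },
    {   # ours, two days old: due for a re-check
        "identifier": "ivo://example.org/b",
        "publisher": "ivo://example.org/registry",
        "updated": NOW - timedelta(days=2),
    },
    {   # very old, but published elsewhere: not our job
        "identifier": "ivo://other.example/c",
        "publisher": "ivo://other.example/registry",
        "updated": NOW - timedelta(days=30),
    },
]

stale = due_for_recheck(RECORDS, "ivo://example.org/registry", NOW)
```

Filtering on the publishing registry is what divides the crawl between
deployments.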

We can do better in the case of the required initial registration of
resources as well. Each of the registries has its own
implementation of a "maintenance portal" for manually entering
registry records, and these are of considerably different quality.
This is a waste of effort, as in principle a single portal could be
built that could talk to any of the registry implementations. I
would suggest that the best of the existing portals be refactored so
that it is packaged in a standalone fashion; effort could then be
concentrated on that one tool to make it easy to use and to provide
all of the validation and searching needed to aid in creating good
quality registry entries.

> ·       We also recommend that the “Registry of Registries” should  
> be created immediately and/or advertised on the IVOA website, even  
> if it is informal (a web page), so that information can be gathered  
> at the same time as the formal specification is built.
> ·       We hope to clarify and define closely the idea of  
> annotation/augmentation of existing registry records by an entity  
> that is not the author. We recommend that the Registry group  
> provide use-cases for this concept.
>
> (6) Registry Query Language (Registry, VOQL): Querying a registry  
> of services is rather different, semantically, from querying a star  
> catalog. The former may involve small data in complex schemas, and  
> the latter large data in simple schema. The star catalog query is  
> helped by specific language constructs (eg. Region of the sky) that  
> may mean nothing in the context of the registry query. We recommend  
> a sub-committee of the Registry and VOQL groups should examine the  
> case for and against a separate query language for registry, that  
> would be customized for registry queries and independent of future  
> development of the catalog query language.

Standardization is good, but "one size fits all" can take things too
far, and I think this is the case with trying to use ADQL for
registry querying. I argued this a while back, see the thread starting at
http://www.ivoa.net/forum/registry/0504/1300.htm - basically the aims
of the query are too dissimilar between catalogues and the registry,
and in fact different customised extensions/modifications of the
underlying SQL are required in each case. I do think that
it is now time to define a separate registry query language,
particularly as it appears that there is a schism opening even within
the VOQL community on what exactly should be part of the query
language. The problem here, in my opinion, is again that the language
and interfaces that services need to formulate queries
amongst themselves are not necessarily the same as the interface and
language in which the astronomer user wants to formulate a query. There
should be translation layers between these two levels that keep the
interface definitions separate, but related.

The registry data model (for better or worse) has always been defined
in terms of the XML Schema language, and there is a very natural
candidate for a query language for XML, namely XQuery. However,
although this is easy for the XML-based registries to implement, it
is difficult for the RDBMS-based registries, and at the Kyoto meeting
it was agreed to make XQuery an optional query language. It does have
most of the richness required: it allows complex search relations
between different parts of the data model, and allows only the
desired portions of the registry record to be returned. It does not
have any specific "cone search" or "cross match" operators for doing
astrometric selections on registry records, but as I argued above I
think that it is probably up to a higher-level facility to do this
sort of thing. In any case, most XQuery/XPath implementations do allow
you to add custom functions that could support this sort of operation
on STC coverage information.
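
To give a flavour of the two properties mentioned above (relations
between parts of the record, and returning only the wanted portions),
here is a small sketch. Python's standard library has no XQuery
engine, so this uses the limited XPath subset of
xml.etree.ElementTree as a stand-in, and the element names are invented
rather than taken from the registry schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical registry content; not the real registry schema.
REGISTRY_XML = """
<registry>
  <resource>
    <identifier>ivo://example.org/ssap-a</identifier>
    <title>Example spectra A</title>
    <capability><standard>SSAP</standard><param>REDSHIFT</param></capability>
  </resource>
  <resource>
    <identifier>ivo://example.org/ssap-b</identifier>
    <title>Example spectra B</title>
    <capability><standard>SSAP</standard></capability>
  </resource>
  <resource>
    <identifier>ivo://example.org/siap-a</identifier>
    <title>Example images</title>
    <capability><standard>SIAP</standard><param>REDSHIFT</param></capability>
  </resource>
</registry>
""".strip()


def find_services(root, standard, param):
    """Relate two parts of a record, return only identifier and title."""
    # The same capability must carry both the standard and the parameter.
    path = f"capability[standard='{standard}'][param='{param}']"
    hits = []
    for resource in root.findall("resource"):
        if resource.find(path) is not None:
            hits.append(
                (resource.findtext("identifier"), resource.findtext("title"))
            )
    return hits


root = ET.fromstring(REGISTRY_XML)
hits = find_services(root, "SSAP", "REDSHIFT")
```

In real XQuery the selection and the projection would be a single
expression; the Python loop is only there because of the restricted
XPath support.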

The adoption of XQuery as *the* registry query language would be
problematic because of the RDBMS-based registries. I do not have an
easy solution - perhaps some very simplified form of XQuery would do,
one that could be easily translated into SQL. A principal requirement,
though, would be that the query itself is NOT expressed as XML....
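
Purely as an illustration of what "easily translated into SQL" might
mean, here is a sketch of a translator for a very small, made-up query
form: path = 'value' terms joined by "and", written as plain text
rather than XML. The syntax, the path names and the column mapping are
all assumptions of mine, not a proposal for the actual language.

```python
import re

# Hypothetical mapping from record paths to RDBMS columns.
COLUMNS = {
    "title": "title",
    "capability/standard": "standard",
    "capability/param": "param",
}

# One term of the toy language: path = 'value'
TERM = re.compile(r"\s*([\w/]+)\s*=\s*'([^']*)'\s*")


def to_sql(query):
    """Translate the toy query into parameterised SQL plus its arguments."""
    clauses, args = [], []
    # Limitation of this sketch: values must not contain " and ".
    for term in query.split(" and "):
        match = TERM.fullmatch(term)
        if match is None or match.group(1) not in COLUMNS:
            raise ValueError(f"unsupported term: {term!r}")
        clauses.append(f"{COLUMNS[match.group(1)]} = ?")
        args.append(match.group(2))  # values are passed separately, not inlined
    sql = "SELECT identifier FROM resource WHERE " + " AND ".join(clauses)
    return sql, args


sql, args = to_sql("capability/standard = 'SSAP' and capability/param = 'REDSHIFT'")
```

Anything outside the small supported subset is rejected rather than
guessed at, which is what keeps the translation simple for the
RDBMS-based registries.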

Paul Harrison


