Draft Registries Requirements for Cambridge Meeting

Mark G.Allen allen at newb6.u-strasbg.fr
Fri May 9 05:29:07 PDT 2003


Hi Everyone,

Here is the draft Registries Requirements document intended as
a starting point for discussions at next week's Registries WG session
at the Cambridge interoperability meeting.

It tries to cover most of the current issues, with a very preliminary
version of the requirements, including the draft NVO registries
requirements and use cases.  Discussion in Cambridge should fill this
out into a proper draft requirements document.

It is included in full below, and is also available on
http://www.ivoa.net/twiki/bin/view/IVOA/IVOARegWp02


-Mark.

-------------------------------

Mark G. Allen
Observatoire de Strasbourg
11 Rue de l'Universite
Strasbourg 67000
FRANCE

email: allen at astro.u-strasbg.fr
phone: +33 3 90 24 24 87
fax:   +33 3 90 24 24 17
-------------------------------


Rwp02 Toward Registry Requirements
===================================
May 09, 2003.


Intro
------
The Registry kick-off meeting, London 19-20, 2003 set up a
number of working groups for development of IVOA registries.

Rwp02 - Requirements, Science Cases, Use Cases, Test Cases
has the charter to focus on science case scenarios and use cases
related specifically to VO registries, and in particular to develop
use cases that will help define requirements for VO registries.
( webpage: http://www.ivoa.net/twiki/bin/view/IVOA/IVOARegWp02 )

This document represents the current status of Rwp02, including:

+ Work Plan
+ Scope
+ Working Conceptual Definition of a Registry for Science
+ Broad Operational Requirements
+ Key Science Case Candidates
+ Requirements Imposed by Science Cases
+ Use Cases extracted from Science Cases

The intention is that this draft provides a starting point for
discussion at the Registry Working Group session at the Cambridge
Interoperability Meeting, May 12-16, 2003.
( Registry WG session is Wed, May 14, 0900-1300, 1400-1700)


Work Plan
---------

- Define Scope draft v0.1 11 April        - posted to IVOARegWp02
- Key Science Cases draft v0.1 11 April   - posted to IVOARegWp02
  ( includes a brief review of VO
    Science Cases in relation to
    registries )

- Use Cases draft v0.1 25 April           - not done
- Requirements draft v0.1 09 May          - this draft document

(Cambridge Meeting May 12-16)

- Requirements drafts v0.2 28 June
  Science Cases
  Use Cases
  Test cases


Scope of Work Package
---------------------
1. Identify Key Science Cases to be used as drivers for defining
   registry requirements. These should be illustrative of the range
   of requests that may be sent to a registry.

2. Describe a set of Use Cases, which are representative of envisaged
   registry usage, and include the use cases required to execute the
   Key Science Cases.

3. Define a set of actual queries against which registry implementations
   may be tested.

4. Requirements. Define the set of requirements that are necessary to
   execute the Use Cases and that will be useful in the actual
   development of registries.



Working Definition(s) of Registries
-----------------------------------

A registry is a queryable service/resource that responds with a structured
description of other services/resources.

Registered services/resources are described by various types of metadata
as outlined in RSM v6.

Registries are a way of narrowing down the search for resources to a
manageable subset. A registry is not intended to replicate all data in
a central repository, but rather to hold a condensed set of metadata.

A registry is a dynamic database of metadata describing a set of
Internet-available resources.

A registry is used to identify and locate resources satisfying
user-specified criteria, and to direct more detailed information requests to
the relevant services.
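
As an illustration only, the working definition above can be sketched in
a few lines of Python. The class and field names (ResourceRecord,
Registry, access_url) and the example.org entries are assumptions made
for this sketch; they are not taken from RSM v6 or from any agreed
registry interface.

```python
from dataclasses import dataclass, field


@dataclass
class ResourceRecord:
    """Condensed metadata about one resource; the data stay with the provider."""
    identifier: str      # hypothetical identifier scheme
    title: str
    resource_type: str   # e.g. "catalog", "archive", "service"
    access_url: str      # where more detailed requests should be directed
    keywords: set = field(default_factory=set)


class Registry:
    """A queryable collection of resource descriptions."""

    def __init__(self):
        self._records = {}

    def add(self, record):
        self._records[record.identifier] = record

    def find(self, predicate):
        """Return the records that satisfy a user-specified criterion."""
        return [r for r in self._records.values() if predicate(r)]


registry = Registry()
registry.add(ResourceRecord("ex/cat1", "Example dwarf galaxy catalog", "catalog",
                            "http://example.org/cat1",
                            {"dwarf galaxies", "photometry"}))
registry.add(ResourceRecord("ex/img1", "Example image archive", "archive",
                            "http://example.org/img1", {"images", "optical"}))

# The registry only narrows the search; the detailed query would then be
# sent to each returned access URL.
matches = registry.find(lambda r: "dwarf galaxies" in r.keywords)
urls = [r.access_url for r in matches]
```

The registry here holds only descriptions and pointers, which is the
"condensed set of metadata" idea: the actual catalog rows never enter it.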



Broad Operational Requirements
------------------------------
(Adopted from NVO draft registries requirements - R. Plante)


A.  Registration Contents:
    1.  Descriptions of Resources (Resource metadata as described in RSM v6)

    2.  Descriptions of Services:
        +  how to invoke the service, including inputs and outputs
        +  other characteristics of the service; in particular, metadata
           about the kind of data returned by the service.
        +  compliance details when the service is meant to be an
           implementation of a standard service.

    3.  The metadata structures supported by the registry should be
        consistent with those used by IVOA at large. 

    4.  Need to support a hierarchical notion of resources in order to
        describe sites that manage multiple collections, missions,
        services, etc.
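
One possible way to picture the hierarchical notion in item 4 is a
parent link on each registered entry. The sketch below is only that: the
identifiers, the parent-link representation and the descendants()
helper are all invented for illustration.

```python
from collections import defaultdict

# (identifier, parent identifier or None, description); all entries invented
ENTRIES = [
    ("site", None, "Example data centre"),
    ("site/missionA", "site", "Example mission"),
    ("site/missionA/images", "site/missionA", "Image collection"),
    ("site/missionA/sia", "site/missionA", "Image access service"),
    ("site/catalogs", "site", "Catalog collection"),
]

# Index the entries by parent so that children can be listed directly
children = defaultdict(list)
for ident, parent, _description in ENTRIES:
    if parent is not None:
        children[parent].append(ident)


def descendants(ident):
    """Return every entry managed, directly or indirectly, under ident."""
    found = []
    for child in children[ident]:
        found.append(child)
        found.extend(descendants(child))
    return found


everything_at_site = descendants("site")
```

With this layout a site is described once, and each collection, mission
or service under it only needs to name its parent.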


B.  Registry Queries
    1.  Resources and Services can be searched for based on characteristics.

    2.  Query results returned in machine-interpretable form

    3.  Can search for:
     +  resources
     +  services of particular types

    4.  It should be possible to uniquely retrieve a description of a
        resource or service either via a unique ID or via a small and
        predictable set of metadata. 
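
The query requirements in B can be illustrated with a small sketch,
assuming the registry entries are simple key/value records. The record
layout, the search() and get_by_id() helpers and the choice of JSON as
the machine-interpretable form are assumptions of this example, not
proposals.

```python
import json

# Hypothetical registry entries; identifiers and values are invented
RECORDS = [
    {"id": "ex/cone1", "kind": "service", "service_type": "cone search",
     "waveband": "optical"},
    {"id": "ex/sia1", "kind": "service", "service_type": "SIA",
     "waveband": "optical"},
    {"id": "ex/survey1", "kind": "resource", "resource_type": "survey",
     "waveband": "radio"},
]


def search(**criteria):
    """Return the records whose metadata match every given key/value pair."""
    return [r for r in RECORDS
            if all(r.get(key) == value for key, value in criteria.items())]


def get_by_id(identifier):
    """Return exactly one record for a unique ID, or None if it is unknown."""
    hits = [r for r in RECORDS if r["id"] == identifier]
    return hits[0] if len(hits) == 1 else None


# B.1 and B.3: search by characteristics, here services of a particular type.
# B.2: the result is serialised in a machine-interpretable form.
result = json.dumps(search(kind="service", service_type="SIA"))

# B.4: unique retrieval through an identifier
survey = get_by_id("ex/survey1")
```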

C.  Registration Process and Registry Evolution

    1.  It should be easy for data/service provider to register a new
        service.  It should not require one to resubmit information
        about the resource that was registered before for another
        service.

    2.  It should be easy or automatic to update the metadata associated
        with a resource or service. 

    3.  It should be easy to unregister a resource or service.

    4.  The registry should expect that registered services may
        become temporarily unavailable. 

    5.  The registry should account for the possibility that a service
        will become permanently unavailable without it being explicitly
        unregistered.

    6.  The registry system shall support classes of services sharing
        common characteristics where these shared characteristics can
        be specified once and then referred to from the implementing
        services.  Updates to shared characteristics can also be made
        using a single update request to the registry.
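
To make items C.1 and C.3 to C.6 more concrete, here is a minimal sketch
of one way they could fit together. Everything in it is hypothetical:
the Registry class, its method names, the three-failure threshold and
the example entries are assumptions chosen for illustration.

```python
class Registry:
    def __init__(self, max_failures=3):
        self.resources = {}        # resource id -> metadata
        self.service_classes = {}  # class name -> shared characteristics
        self.services = {}         # service id -> entry
        self.max_failures = max_failures  # assumed threshold, not from the draft

    def register_resource(self, rid, metadata):
        self.resources[rid] = dict(metadata)

    def define_class(self, name, shared):
        # C.6: calling this again updates every implementing service at once
        self.service_classes[name] = dict(shared)

    def register_service(self, sid, rid, class_name, access_url):
        # C.1: the service refers to the resource by ID, nothing is resubmitted
        if rid not in self.resources:
            raise KeyError(f"unknown resource {rid}")
        self.services[sid] = {"resource": rid, "class": class_name,
                              "access_url": access_url, "failures": 0}

    def unregister_service(self, sid):
        # C.3: removal is a single call
        self.services.pop(sid, None)

    def describe(self, sid):
        entry = self.services[sid]
        description = dict(self.service_classes[entry["class"]])
        description.update(self.resources[entry["resource"]])
        description["access_url"] = entry["access_url"]
        return description

    def record_check(self, sid, reachable):
        """Record one availability check and return the resulting status."""
        entry = self.services[sid]
        entry["failures"] = 0 if reachable else entry["failures"] + 1
        if entry["failures"] == 0:
            return "available"
        if entry["failures"] < self.max_failures:
            return "temporarily unavailable"   # C.4
        return "presumed dead"                 # C.5


reg = Registry()
reg.register_resource("ex/archive", {"publisher": "Example Observatory"})
reg.define_class("cone search", {"inputs": ["RA", "DEC", "SR"], "output": "table"})
reg.register_service("ex/cone-a", "ex/archive", "cone search", "http://example.org/a")
reg.register_service("ex/cone-b", "ex/archive", "cone search", "http://example.org/b")

# One update to the shared class changes the description of both services
reg.define_class("cone search", {"inputs": ["RA", "DEC", "SR"], "output": "VOTable"})
statuses = [reg.record_check("ex/cone-b", False) for _ in range(3)]
```

Whether a service that is presumed dead should be hidden, flagged or
removed is left open here; the sketch only shows that the registry can
tell the temporary case from the persistent one.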


This is based on the idea that registries are used to support coarse
searching for resources that might have what the user wants; a definitive
search would be accomplished by querying the candidate resources directly.


Key Science Cases
-----------------
Key science cases to be used as drivers for defining registry
requirements. These are intended to be illustrative of the range
of requests that may be sent to a registry.

Following a review of the current science cases across the
various VO initiatives,
(http://www.ivoa.net/internal/IVOA/IVOARegWp02/KeySciCaseReview.txt)
a 1st draft set of science cases has been selected.

Selected Science Cases:

1. AstroGrid: Brown Dwarf Selection
   NVO: Select Dwarf Galaxies by Colour for Observational Follow-up

   Chosen as representative of parameter-constrained catalog search
   scenarios.

2. AstroGrid: Deep Field Surveys

   Chosen as representative of a data search scenario, with use of
   coverage (spatial and temporal) constraints.

3. NVO: Gamma Ray Burst +

   Chosen because NVO is planning registry-specific developments
   for this science case to turn their GRB demo into a more
   general "show me the sky" tool.

4. NVO: Find Super Novae Pre-Burst Observation

   Chosen as representative of 'find all data at this point'
   type scenarios. (Perhaps this is the same as the GRB general tool.)

5. AstroVirtel: Luminosity functions of Star Clusters in Nearby Galaxies

   Chosen because of detailed requirements for registry
   functions.

See: http://www.ivoa.net/internal/IVOA/IVOARegWp02/astrovirtel_use_case.pdf


Requirements Imposed by Key Science Cases
-----------------------------------------

At present the requirements imposed by the science cases
are only in the form of a list of registry requests
drawn from the science cases. This is meant as a
representative list, not in any way complete.

It is envisaged that this should lead to a hierarchical list
of science metadata that needs to be in a registry. Most
importantly, what is the minimal set?

Working list of example registry requests:

 Catalogs relevant to Galaxy Clusters
 Catalogs with coverage of I,K,R at given locations
 Catalogs of dwarf galaxies with color or magnitude measurements

 Identify resources relevant to Galactic Clusters
 Identify Deep Field Survey Resources

 Identify which parameters can be queried for a given resource

 Data Provider Services requests on
  - coverage - space, time, energy
  - resolution, field of view, pixel scale, limiting
  - calibration, data quality
 
  - requests for data of unique provenance (not multiple datasets
    or catalogs derived from the same observations)



...need help here to expand list and make into requirements...


Interpretive capabilities.
--------------------------
A number of the science case scenarios imply a kind of interpretive
capability for registries. Some of these are familiar, but others
are much more challenging, and it is not clear if they belong as
a registry requirement. ( Rather the requirement on the registry might
be that the metadata is listed in such a way as to allow construction
of interpretive capabilities as registered services )
 

The most familiar of these are coordinate transformations and
object name resolving. These are often the starting points for
searches. Current archives/catalog browsers handle this fairly
well, so it is assumed that a registry will be able to do the same.
(In the registry framework it is often assumed this will be
 handled by a coordinate/astronomical name resolver registered
 service.)

Higher level interpretive capabilities assumed in some of the science
case scenarios include the ability to, for example, search by REDSHIFT,
not only returning results where redshift is explicit, but also results
where REDSHIFT may be recognised and calculated using explicit VELOCITY
values. This implies that astronomical concepts and their relationships
are somehow encoded within a registry.
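
A rough sketch of what "encoding a relationship between concepts" could
mean in practice is given below. It assumes that each catalog entry
lists the quantities it provides and that the registry holds a table of
derivations. The table layout, the catalog names and the use of the
low-velocity approximation z = v/c are all assumptions of this example;
the science cases do not say which conversion is intended.

```python
C_KM_S = 299792.458  # speed of light in km/s

# concept -> list of (source concept, conversion function).
# The single relationship below uses the low-velocity approximation
# z = v / c, chosen here only to keep the example short.
DERIVATIONS = {
    "REDSHIFT": [("VELOCITY", lambda v_km_s: v_km_s / C_KM_S)],
}

# Hypothetical registry entries listing the quantities each catalog provides
CATALOGS = {
    "cat-a": {"REDSHIFT", "MAGNITUDE"},
    "cat-b": {"VELOCITY"},
    "cat-c": {"MAGNITUDE"},
}


def find_catalogs(concept):
    """Map each matching catalog to 'explicit' or to the concept it derives from."""
    hits = {}
    for name, quantities in CATALOGS.items():
        if concept in quantities:
            hits[name] = "explicit"
            continue
        for source, _convert in DERIVATIONS.get(concept, []):
            if source in quantities:
                hits[name] = source
                break
    return hits


def derive(concept, source, value):
    """Convert a value of the source concept into the requested concept."""
    for candidate, convert in DERIVATIONS[concept]:
        if candidate == source:
            return convert(value)
    raise KeyError(f"{concept} cannot be derived from {source}")


redshift_catalogs = find_catalogs("REDSHIFT")
z_from_velocity = derive("REDSHIFT", "VELOCITY", 3000.0)
```

A search for REDSHIFT then returns cat-a, where it is explicit, and
cat-b, where it can be calculated from VELOCITY, while cat-c is left
out.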

Further, there has been a suggestion to build "Expert Knowledge
Registries" into the registry so that a registered service "knows":
"what is a galaxy", "what is a de Vaucouleurs profile", "what is an
elliptical galaxy", etc.

The issue of international language support has also been mentioned.


 - Having interpretive capabilities actually within a registry
   seems to imply the need for a library of functions for the registry.
 - The complexity could easily get out of hand, so suggest a minimal
   set of functions to do Coordinates and Name Resolving.
   (As implemented in GLU for example)

   or ... only have interpretive capabilities as calls to registered
          services. 
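
The second option, interpretive capabilities only as calls to registered
services, might look roughly like the sketch below. The resolver is a
local stand-in for a remote service, and the object name, coordinates
and function names are invented for the example.

```python
from typing import Callable, Dict, Tuple

Resolver = Callable[[str], Tuple[float, float]]


def stub_resolver(name: str) -> Tuple[float, float]:
    """Stand-in for a remote name resolver; the lookup table is invented."""
    table = {"EXAMPLE-1": (150.0, 2.0)}  # name -> (RA, Dec) in degrees
    return table[name.upper()]


# The registry records only which registered service offers a capability;
# it contains no resolving logic of its own.
SERVICES: Dict[str, Resolver] = {"name-resolver": stub_resolver}


def resolve_position(target: str) -> Tuple[float, float]:
    """Look up a resolver in the registry and delegate the work to it."""
    resolver = SERVICES["name-resolver"]
    return resolver(target)


position = resolve_position("example-1")
```

The registry stays simple in this arrangement, at the cost of one extra
call to an external service for every name or coordinate that needs
interpreting.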



Use Cases
---------

 + Use cases extracted from Key Science Cases

      - requirements for registries of different granularity
   ...

 + Categorized Use Cases
   (adapted from R. Plante use cases)
 

 1)  Locating Data Collections

    Find collections that may contain desired data.

    Return possibilities
      *  the type of resource (e.g. archive, survey, catalog )
      *  Collection's Home Page
      *  a description of supported (access) services

    Possible Search criteria
      *  the type of resource
      *  type of data included (images, spectra, catalog, ...)
      *  frequency waveband
      *  sky coverage
      *  time coverage
      *  the types of services supported (SIA, Cone search, etc.)
      *  descriptive keywords
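
The sky coverage and time coverage criteria can be illustrated with a
coarse filter such as the one below. It assumes, purely for the sake of
the sketch, that coverage is stored as a simple RA/Dec box that does not
wrap through RA = 0 and that time coverage is a range of years; the
collection names and values are invented.

```python
from dataclasses import dataclass


@dataclass
class Collection:
    name: str
    ra_range: tuple    # (min, max) in degrees; assumed not to wrap through RA = 0
    dec_range: tuple   # (min, max) in degrees
    years: tuple       # (first, last) year of observation


def covers_position(c, ra, dec):
    """True if the position lies inside the collection's coverage box."""
    return (c.ra_range[0] <= ra <= c.ra_range[1]
            and c.dec_range[0] <= dec <= c.dec_range[1])


def overlaps_years(c, start, end):
    """Two closed intervals overlap when each starts before the other ends."""
    return c.years[0] <= end and start <= c.years[1]


COLLECTIONS = [
    Collection("survey-north", (0.0, 360.0), (0.0, 90.0), (1995, 2000)),
    Collection("deep-field", (180.0, 190.0), (-5.0, 5.0), (2001, 2002)),
    Collection("survey-south", (0.0, 360.0), (-90.0, 0.0), (1990, 1998)),
]


def candidates(ra, dec, start, end):
    """Names of collections that may contain data at this position and epoch."""
    return [c.name for c in COLLECTIONS
            if covers_position(c, ra, dec) and overlaps_years(c, start, end)]


north_hits = candidates(185.0, 2.0, 1999, 2003)
south_hits = candidates(185.0, -2.0, 1999, 2003)
```

A match only means the collection may contain the desired data; the
definitive answer would still come from querying the collection itself.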

2) Locating Services

   A. Data Access Services

      Examples: Find all optical, ground-based image archives that
                support SIA.
                Find all mosaicing services that cover the north
                galactic pole.

      Return:
        *  service description (describing supported inputs, outputs,
           and access URL)

      Possible Search criteria:
        *  type of service
        *  type of data included (images, spectra, catalog, ...)
        *  frequency waveband
        *  sky coverage
        *  time coverage
        *  descriptive keywords
        *  access type (i.e. what level of authorization required;
           e.g. public vs. authenticated or proprietary)

   B. Generic/Processing Services

      Examples: Find a name-resolver
                Find a coordinate converter

      Return
        *  service description

      Possible Search criteria:
        *  type of service
        *  input/output objects
        *  descriptive keywords
        *  access type (i.e. what level of authorization;
           e.g. public vs. authenticated or proprietary)

3) Locating Resources

   Return
     * description of resource
     * contact information
     * URL to Resource's home page

   Possible Search criteria:
     * resource name or identifier
     * descriptive keywords



Registry Science Metadata
-------------------------
The proposals for resource metadata in RSM v6 and elsewhere
have been compared against some scientific criteria.

See Anita Richards:
http://www.ivoa.net/forum/registry/0200.htm







