table metadata and the registry

Aurelien Stebe Aurelien.Stebe at sciops.esa.int
Tue May 8 07:51:18 PDT 2007


Hi Ray,

I just sent the ESAVO point of view on the whole metadata access 
question, in reply to the thread started by Guy.
Since you say the RegWG2 session will be on this, I would be interested 
in presenting our proposal there. Things are always much clearer with a 
diagram.
It seems it might have parts in common with your proposal (good thing) 
and others that diverge, but this is unavoidable since the problem has 
started to be discussed in the open only very recently.
I like your idea here of URLs pointing to the Service Metadata access 
method (could be done with a simple URL to GET, or maybe even XLink).
I will not have anything to report in RegWG1 for ESAVO unfortunately, 
but I guess we'll have enough to talk about with the STC issue raised in 
January.
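
To illustrate the URL idea above, here is a minimal sketch in Python of
how a client might pull the table-description URLs out of a resource
record before issuing a plain GET on each one. The record layout and the
<tableDescription> element name are assumptions made up for this
example; they are not taken from VOResource or any other schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical resource record. The <tableDescription> element name is an
# assumption for illustration only, not part of any published schema.
# Two URLs are listed to show that one resource may point to several
# description documents.
RECORD = """
<resource>
  <title>Example catalog service</title>
  <tableDescription>http://example.org/catalog/tables-1.xml</tableDescription>
  <tableDescription>http://example.org/catalog/tables-2.xml</tableDescription>
</resource>
"""


def table_description_urls(record_xml):
    """Return every table-description URL listed in a resource record."""
    root = ET.fromstring(record_xml.strip())
    # findall() only looks at direct children of the root element.
    return [el.text.strip() for el in root.findall("tableDescription") if el.text]
```

A client would then fetch each returned URL with an ordinary HTTP GET
(for instance with urllib.request.urlopen); the fetch is left out here
because the example.org addresses are placeholders.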

Cheers,
Aurelien

Ray Plante wrote:
> Hi WGers,
>
> I've been conversing with many of you regarding the general issue 
> referred to as fine-grained/rich vs. coarse-grained registries.  One 
> current manifestation of it is the issue of whether descriptions of 
> table columns should appear in the registry.  We see use cases 
> emerging that want to use this information for discovery and planning, 
> but handling this information in the registry raises costly curation 
> issues.  I would like to propose a solution to this issue that I 
> believe can serve as a model for handling other reputed fine-grained 
> information.  This solution will ultimately call for a standard format 
> for describing a set of tables.
>
> First, I recognize that before we can agree on whether to put 
> fine-grained information into the registry, we need a common 
> understanding of what qualifies as "fine-grained".  I have some ideas 
> on this that I will be presenting next week in the RWG session.
>
> The use cases that are driving table metadata into the registry are:
>   a)  Finding tables based on the columns.  A specific use case is to
>         build an SED from existing catalog data by searching for tables
>         that have columns described by certain UCDs.
>   b)  Automating the construction of specific queries to catalog
>         services for use within a workflow.
>
> One major reason that placing the column metadata in the registry
> is attractive is that the registry is an existing system for
> collecting the information and provides a common way to access it.
> One current problem with our existing catalog services (Cone Search,
> SIA, OpenSkyNode, and SSA) is that each has a slightly different way
> of presenting this information.  Thus, in practice, it is difficult
> to mine this information--you need 4 different methods.  For data
> collections that are described independently of any service that
> accesses them, there is no standard way of getting this information
> other than having it in the registry.
>
> I would like to propose we define a standard format for describing a
> set of tables and all their columns that can be served by a single,
> static URL.  With this, we can:
>   1)  Include this URL in the resource description of any table
>         service or data collection that includes tables.
>   2)  Define a simple GET method that can be part of a standard
>         service protocol to return this document.
>
> Implementation considerations:
>   o  More than one URL could be associated with a resource.  Thus, if
>      a service or collection serves many tables, their descriptions
>      could be distributed over several documents of manageable size.
>
>   o  While the information is packaged into individual documents,
>      a service can generate this information on the fly as necessary.
>      (For example, if TAP were to define separate "getTables" and
>      "getColumns" methods, the information could be aggregated via
>      internal calls to these methods.)
>
>   o  For existing "standards"--Cone Search, SIA, OpenSkyNode, (and if
>      necessary, SSA)--we could devise trivial HTTP GET services that
>      convert on-the-fly calls to their respective metadata methods
>      into the standard format.  These services could be provided by
>      registries.
>
> The advantages are:
>   *  The information originates at the service and is maintained by
>        the publisher.
>   *  There is a common way to get at the column information.  It is
>        not restricted to standard services but can be associated with
>        any data collection or custom service that handles catalogs.
>   *  The information can be obtained through the registry without it
>        being stored in the registry.
>   *  A registry (or other data discovery service) may harvest and
>        warehouse this information for the purposes of fine-grained
>        discovery; at the same time...
>        +  it does not require other registries to do the same, and
>        +  it does not require/encourage publishers to put this
>             information into the registry explicitly.
>
> The pressure for supporting the above use cases is large, so we need
> something quickly.  I would strongly recommend a v1.0 that is simple
> and based on existing formats.  I think either of two such options
> would work fine:
>   o  a profile on VOTable
>   o  the Catalog description model currently in the VODataService
>      extension schema used in the registry
>      (http://www.ivoa.net/xml/VODataService/v1.0).
>
> I also want to point out the Source Catalog Data Model, which some of 
> you may be familiar with.  Because of its emphasis on the astronomical 
> semantics more than table & catalog structure, it's probably not a 
> good candidate for the format itself.  However, it would be a good 
> model for annotating a table description via utypes.
>
> The point is to just support what people are already doing with the
> registry.  If we want to add more to the format (or even totally
> replace it), I recommend we save it for a subsequent version.
>
> So the general pattern for "fine-grained" information would be to have 
> VOResource records point to this information, which is primarily 
> managed at the provider's site.  Another area where I would like to 
> explore this idea is in using detailed coverage information to aid in 
> discovery.  We currently have a place in the VODataService schema (an 
> extension of VOResource) to point to a detailed footprint service.  We 
> would need to add a place to point to a table description.  Thus, 
> there is a critical time issue for putting this into place.
>
> I invite your comments, and I will raise this in Beijing during RegWG2.
>
> cheers,
> Ray
>
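
To make use case (a) in the quoted proposal more concrete, here is a
minimal sketch in Python of searching a table-set description for
tables whose columns carry certain UCDs. The XML layout (tableset,
table, column, and the name/ucd attributes) is a made-up stand-in for
whatever standard format is eventually agreed; it is neither the
VOTable profile nor the VODataService model. UCDs are compared as exact
strings, which is a simplification.

```python
import xml.etree.ElementTree as ET

# Hypothetical table-set description, as it might be returned by a GET on
# a static URL. Element and attribute names are assumptions.
SAMPLE = """
<tableset>
  <table name="photometry">
    <column name="ra" ucd="pos.eq.ra"/>
    <column name="dec" ucd="pos.eq.dec"/>
    <column name="flux_v" ucd="phot.flux;em.opt.V"/>
  </table>
  <table name="observers">
    <column name="name" ucd="meta.id"/>
  </table>
</tableset>
"""


def tables_with_ucds(xml_text, wanted):
    """Return names of tables that have a column for every UCD in `wanted`."""
    root = ET.fromstring(xml_text.strip())
    matches = []
    for table in root.iter("table"):
        # Collect the UCD strings of all columns in this table.
        ucds = {col.get("ucd") for col in table.iter("column")}
        # Keep the table only if every wanted UCD is present (exact match).
        if set(wanted) <= ucds:
            matches.append(table.get("name"))
    return matches
```

For example, tables_with_ucds(SAMPLE, ["pos.eq.ra", "pos.eq.dec"])
selects only the "photometry" table. A registry that harvests such
documents could run the same kind of filter over its warehouse without
requiring publishers to enter column metadata into the registry by hand.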