tables

Wed Sep 24 01:53:22 PDT 2003

Question: how do we handle resources which include tables? Or, more
generally, how does the registry assist with query construction?

I previously understood that a skyservice would address a data source which
consisted of (in overview) a single table with columns, each having a unique
column name and a non-unique UCD.

I now understand that a data source may be a collection of other data
sources or of addressable but unregistered datasets; that it will have one
or more services fronting it (some web services, some CGI, some other
implementations); and that it may consist of more than one table, with
columns having column names and, possibly, UCDs.

A coarse-grained registry will presumably list the data source and service
and leave it to the user to find all the rest of the information. How will
this happen? I guess the data service has the ability to return table names,
column names and UCDs - this seems to be what the SkyNode interface WD is
proposing - and the software will have to interpret these and plug them into
drop-downs for the user? Or does it just list them and leave the user to type
it all in?
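
To make the drop-down idea concrete, here is a minimal sketch in Python. It
assumes the service has already returned table names, column names and UCDs
in some form; the Column and Table classes, the dropdown_options function and
the sample data are all hypothetical and are not taken from the SkyNode WD.

```python
from dataclasses import dataclass, field


@dataclass
class Column:
    name: str      # unique within its table
    ucd: str = ""  # optional, and need not be unique


@dataclass
class Table:
    name: str
    columns: list = field(default_factory=list)


def dropdown_options(tables):
    """Flatten table metadata into (label, value) pairs for a drop-down."""
    options = []
    for table in tables:
        for col in table.columns:
            value = f"{table.name}.{col.name}"
            # Show the UCD next to the column name when one is present.
            label = f"{value} [{col.ucd}]" if col.ucd else value
            options.append((label, value))
    return options


# Hypothetical metadata; the UCD strings are illustrative only.
tables = [
    Table("photometry", [
        Column("ra", "POS_EQ_RA_MAIN"),
        Column("dec", "POS_EQ_DEC_MAIN"),
        Column("flag"),
    ]),
]
options = dropdown_options(tables)
```

The alternative, where the user types everything in, would need only the raw
listing and none of this interpretation.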

The big question for the coarse-grained registry is how much information is
held in the registry to allow the user to *find* the data source. Does it
include every UCD in the underlying tables? Do coverage and content
include the coverage and content of every data source, regardless of the
gaps? And what happens if the service is down when the user wants to
construct their query?

A fine-grained registry would hold all the above metadata, keeping it up to
date through harvesting, and any software query builder can get that
metadata for populating the query forms. How much of this information is
there going to be? Will these registries become unmanageable? I guess 'suck
it and see' is the only answer to that.

With a fine-grained registry, the user will find the exact data source to
match their requirements and can construct a query without reference to the
actual service itself.
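
As a rough sketch of what that could look like, assuming the registry holds
table and column metadata locally: the REGISTRY layout, the find_tables and
build_query functions and the SQL-like query string below are all my own
assumptions for illustration, not an agreed format or query language.

```python
# Hypothetical fine-grained registry content: source -> table -> column -> UCD.
REGISTRY = [
    {"source": "survey_a",
     "tables": {"objects": {"ra": "POS_EQ_RA_MAIN", "dec": "POS_EQ_DEC_MAIN"}}},
    {"source": "survey_b",
     "tables": {"positions": {"alpha": "POS_EQ_RA_MAIN",
                              "delta": "POS_EQ_DEC_MAIN"}}},
    {"source": "survey_c",
     "tables": {"list": {"ra": "POS_EQ_RA_MAIN"}}},
]


def find_tables(registry, required_ucds):
    """Return (source, table, {ucd: column}) for tables covering every UCD."""
    matches = []
    for entry in registry:
        for table, columns in entry["tables"].items():
            by_ucd = {}
            for column, ucd in columns.items():
                # UCDs are not unique within a table; keep the first column.
                by_ucd.setdefault(ucd, column)
            if set(required_ucds) <= set(by_ucd):
                chosen = {ucd: by_ucd[ucd] for ucd in required_ucds}
                matches.append((entry["source"], table, chosen))
    return matches


def build_query(table, columns):
    """Build a simple SQL-like query from registry metadata alone."""
    return f"SELECT {', '.join(columns)} FROM {table}"


wanted = ["POS_EQ_RA_MAIN", "POS_EQ_DEC_MAIN"]
matches = find_tables(REGISTRY, wanted)
queries = [build_query(table, [cols[u] for u in wanted])
           for _source, table, cols in matches]
```

Nothing here contacts the service, which is the point: the query can be
built even when the service is down.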

Each of the fine- and coarse-grained registries has advantages and
disadvantages. AstroGrid has chosen to follow the fine-grained route while,
from what Bob has said previously, I assume NVO will follow the
coarse-grained route. This is good as we can compare the two approaches.

One key question is how replication and harvesting are going to work. A
coarse-grained registry will want to pick up only the top-level details from
a fine-grained registry, while a fine-grained registry will need to harvest
the top-level info from coarse-grained registries and then harvest the
lower-level detail from the services themselves.
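
The two harvesting directions might be sketched as follows. This is only an
illustration under my own assumptions: the record fields in TOP_LEVEL_KEYS,
the function names and the fetch_tables callable (standing in for whatever
call a service offers to return its table metadata) are all hypothetical.

```python
# Hypothetical set of fields that count as "top-level" in a registry record.
TOP_LEVEL_KEYS = ("id", "title", "service_url")


def harvest_coarse(fine_records):
    """Coarse-grained harvest: keep the top-level fields, drop table detail."""
    return [{key: rec[key] for key in TOP_LEVEL_KEYS} for rec in fine_records]


def harvest_fine(coarse_records, fetch_tables):
    """Fine-grained harvest: copy top-level fields, then ask each service
    for its table metadata via the supplied fetch_tables(service_url)."""
    harvested = []
    for rec in coarse_records:
        record = dict(rec)
        try:
            record["tables"] = fetch_tables(rec["service_url"])
        except ConnectionError:
            # Service is down: keep the entry and leave the detail unknown.
            record["tables"] = None
        harvested.append(record)
    return harvested
```

Passing fetch_tables in as a parameter keeps the sketch independent of how
a service is actually reached (web service, CGI or otherwise).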

One thing we will need to do is extend the schema to define the metadata
structures and content for tables with their column names and UCDs. This is
not just to cater for the fine-grained registry but to ensure the metadata
returned from the services which hold this information is standardised.
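
Purely to illustrate the kind of structure I mean, here is a sketch that
serialises table metadata to XML using Python's standard library. The
element and attribute names (tableset, table, column, name, ucd) are
placeholders I have made up, not a proposed schema.

```python
import xml.etree.ElementTree as ET


def tables_to_xml(tables):
    """Serialise {table_name: [(column_name, ucd_or_None), ...]} to XML."""
    root = ET.Element("tableset")
    for table_name, columns in tables.items():
        table_el = ET.SubElement(root, "table", name=table_name)
        for column_name, ucd in columns:
            column_el = ET.SubElement(table_el, "column", name=column_name)
            if ucd:
                # The UCD is optional, so only emit it when present.
                column_el.set("ucd", ucd)
    return ET.tostring(root, encoding="unicode")


# Hypothetical metadata; the UCD strings are illustrative only.
xml_text = tables_to_xml({
    "objects": [("ra", "POS_EQ_RA_MAIN"), ("dec", "POS_EQ_DEC_MAIN"),
                ("flag", None)],
})
```

The same document could then be returned by a service and stored by a
fine-grained registry without translation.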

Cheers,
Tony. 

__
Tony Linde                       Phone:  +44 (0)116 223 1292
AstroGrid Project Manager        Fax:    +44 (0)116 252 3311
Dept of Physics & Astronomy      Mobile: +44 (0)7753 603356
University of Leicester          Email:  ael at star.le.ac.uk
Leicester, UK   LE1 7RH          Web:    http://www.astrogrid.org