table metadata and the registry

Ray Plante rplante at poplar.ncsa.uiuc.edu
Tue May 8 05:14:08 PDT 2007


Hi Tony,

On Tue, 8 May 2007, Tony Linde wrote:
>> I don't understand this.  Anybody who wants the fine-grained
>> information
>> can get it by following the URLs.  Anybody who doesn't want this
>
> But this is an enormous waste of time. I thought the VO was supposed to make
> things better. What you propose will mean that anyone who wants to provide a
> general query builder will have to query the registry for resources and,
> when the user selects a resource, find the service and query it for table
> information, then, for each table, query the service for column information:
> all while the user patiently stares at a spinning hourglass. This is not,
> IMO, an improvement on existing services.

This is not at all what I am proposing.  First of all, what I would like 
to see is that all of the table information be retrievable in one URL. 
(Multiple URLs might be allowed only as a means to enable very large 
collections of tables.)

If you want the AstroGrid registry's search interface to return an 
"expanded" VOResource that includes the table metadata for the benefit of 
your query builder, I think that is fine.  Others will probably argue that 
once you have located the resource, you should go to the service for the 
table metadata; even in this case, it is one extra call to access the URL 
that retrieves it directly from the service.

The important thing for registries is the form of the records we share
through the harvesting interface.  If these records simply have pointers 
to table metadata, then those registries that do not wish to manage this 
information don't have to.  The AstroGrid registry can pull the table 
metadata via the URL when the record is harvested.
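
A minimal sketch of that harvest-time step, in Python.  Everything named 
here is hypothetical and only for illustration: the dict-shaped record, 
the `tableMetadataURL` key, the `tables` key, and the `fetch` callable 
are assumptions, not part of VOResource or any registry interface.

```python
from typing import Callable

def expand_record(record: dict,
                  fetch: Callable[[str], list],
                  wants_tables: bool = True) -> dict:
    """Return a copy of a harvested record, optionally with table metadata.

    `record` is a hypothetical dict form of a harvested resource record.
    `fetch` is any callable that retrieves table descriptions from a URL.
    """
    expanded = dict(record)  # never modify the harvested record itself
    url = record.get("tableMetadataURL")  # hypothetical pointer field
    if wants_tables and url is not None:
        # A registry that manages table metadata follows the pointer once,
        # at harvest time, and keeps the result in its internal copy.
        expanded["tables"] = fetch(url)
    # A registry that does not manage it keeps only the pointer.
    return expanded
```

A registry that wants the expanded record calls this with its own 
`fetch`; one that does not passes `wants_tables=False` and stores the 
record as harvested.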

This idea was conceived to fit well into what you are already doing.  For 
example, if we choose to use the table model from VOResource as the 
standard format, then it is trivial to pull the table metadata from the 
service and insert it into your internal copy of the VOResource record. 
You have to give me a little credit here :-)

>> (getRegistration() or getMetadata()) to include it in is the service
>
> I thought there was only one method to get metadata and it returned the
> VOResource record. I cannot see the need for more than one such.

See the latest VOSI doc from Guy.  The motivation is to address the 
fine-grained metadata issue:  getRegistration() provides information 
intended for the registry and getMetadata() returns a fatter record.  So 
while I see the reason, I don't think it will accomplish its goal.

>> Furthermore, with no guideline as to what information should go in
>
> I would certainly mandate that the full VOResource record be returned with
> all the optional bits of that made mandatory.

How does this help the provider?  One of the reasons metadata are optional 
is that they won't necessarily apply to all resources.  And what is the 
point of making them optional in one place when you force the user to 
provide them anyway in another?  This is not a recipe for quality metadata.

>> is all or nothing.  Not only does the URL solution allow a registry to
>> choose what fine-grained information it collects, but also it does not
>> require that that information fit into the VOResource format.
>
> Why would the registry care about non-VOResource information? And what use
> is a registry which cannot supply the information a calling service
> requires?

As a discovery service.  Some will argue that a client should be getting 
information like table data directly from the service when it plans its 
queries, but I don't want to prevent you from getting it all from your 
registry.

The general idea is that over time, we can develop discovery services that 
leverage more and more information about resources.  Not all of this 
information need be expressible in a VOResource schema.

> Do we now have to specify all the levels of metadata that a
> registry can and cannot supply?

While we may not agree at the moment on the best way to address the 
fine-grained issue, I hope we can at least agree on what the problem is.

To put it in concrete terms, the AstroGrid registry effectively pressures 
the NVO registries into supporting fine-grained table metadata needed to 
support your query builder but which we feel should be handled in a 
different way.  This pressure comes in two forms.  First, you encourage 
your publishers to provide table metadata.  We harvest these records, which 
in turn go out to our users as a result of queries.  We then have to help 
users make sense of this information.  When there are problems with the 
information, it reflects poorly on us, not you.  Second, your application 
in effect encourages our publishers to provide table metadata to our 
registry if they are to be used in your application, because your 
application only gets this information from the registry.

We need to find a way that allows a registry like AstroGrid to innovate 
and provide new discovery and automated retrieval techniques that do not 
force other registries to follow suit.

> Do we now have to specify all the levels of metadata that a
> registry can and cannot supply?

We do need to have a common understanding of what qualifies as 
"fine-grained" information and develop mechanisms of exposing it only when 
desired.  I don't think we have this, yet, but I will offer my strawman at 
the meeting.

>> metadata.  A simple service (provided by a registry) can translate that
>> information into a standard format, so off the bat you get good
>
> How can the registry do that? None of the catalog services have *standard*
> ways of providing metadata: a registry will have to implement separate code
> for every potential service unless we specify new standards for these
> URL-based metadata retrieval methods.

SIA has a *standard* way of getting the table metadata: FORMAT=METADATA. 
A simple service that takes only an SIA base URL as a GET input can apply 
a stylesheet to return this information in a standard format.  The others 
have *standard* ways but they are all different.  A converter for each one 
provides a single way to get the table metadata from all of them.
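
A rough sketch of the SIA case in Python, under stated assumptions.  It 
swaps the stylesheet for a plain ElementTree walk, since the Python 
standard library has no XSLT processor.  The function names and the 
output layout (a list of dicts) are hypothetical; only FORMAT=METADATA 
and the VOTable FIELD element come from the standards.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

def metadata_url(base_url: str) -> str:
    """Build the FORMAT=METADATA request for an SIA base URL."""
    # Base URLs may or may not already end in '?' or '&'.
    if not base_url.endswith(("?", "&")):
        base_url += "&" if "?" in base_url else "?"
    return base_url + urlencode({"FORMAT": "METADATA"})

def extract_columns(votable_xml: str) -> list:
    """Reduce a metadata VOTable to a simple list of column descriptions.

    Only FIELD elements (the output columns) are collected; PARAM
    elements describing input parameters are ignored in this sketch.
    """
    root = ET.fromstring(votable_xml)
    columns = []
    for elem in root.iter():
        tag = elem.tag.rsplit("}", 1)[-1]  # drop any XML namespace prefix
        if tag == "FIELD":
            columns.append({
                "name": elem.get("name"),
                "datatype": elem.get("datatype"),
                "ucd": elem.get("ucd"),
                "unit": elem.get("unit"),
            })
    return columns
```

A converter service would fetch `metadata_url(base)` and return the 
result of `extract_columns` in whatever standard format is agreed on; a 
similar small function per protocol would cover the others.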

> Bottom-line, Ray. I think what you are proposing is a radical change to the
> way the VO works. This turns the registry into a simple pointer to resources
> and puts the onus on VO applications to do all the searching for metadata,

I hope I have clarified that this is not what I am proposing.

cheers,
Ray


