table metadata and the registry
Ray Plante
rplante at poplar.ncsa.uiuc.edu
Tue May 8 05:14:08 PDT 2007
Hi Tony,
On Tue, 8 May 2007, Tony Linde wrote:
>> I don't understand this. Anybody who wants the fine-grained information
>> can get it by following the URLs. Anybody who doesn't want this
>
> But this is an enormous waste of time. I thought the VO was supposed to make
> things better. What you propose will mean that anyone who wants to provide a
> general query builder will have to query the registry for resources and,
> when the user selects a resource, find the service and query it for table
> information, then, for each table, query the service for column information:
> all while the user patiently stares at a spinning hourglass. This is not,
> IMO, an improvement on existing services.
This is not at all what I am proposing. First of all, what I would like
to see is that all of the table information be retrievable in one URL.
(Multiple URLs might be allowed only as a means to enable very large
collections of tables.)
If you want the AstroGrid registry's search interface to return an
"expanded" VOResource that includes the table metadata for the benefit of
your query builder, I think that is fine. Others will probably argue that
once you have located the resource, you should go to the service for the
table metadata; even in this case, it is one extra call to access the URL
that retrieves it directly from the service.
The important thing for registries is the form of the records we share
through the harvesting interface. If these records simply have pointers
to table metadata, then those registries that do not wish to manage this
information don't have to. The AstroGrid registry can pull the table
metadata via the URL when the record is harvested.
This idea was conceived to fit well into what you are already doing. For
example, if we choose to use the table model from VOResource as the
standard format, then it is trivial to pull the table metadata from the
service and insert it into your internal copy of the VOResource record.
You have to give me a little credit here :-)
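To make the harvest-time step concrete, here is a minimal sketch of a registry expanding a harvested record by following its pointer. It is only an illustration under assumptions: the <tableMetadataURL> element, the <tableset>/<table> layout, and the function names are all hypothetical, and namespaces are left out for brevity.

```python
import xml.etree.ElementTree as ET
from urllib.request import urlopen


def fetch_url(url):
    """Default fetcher: return the body found at url as text."""
    with urlopen(url) as response:
        return response.read().decode("utf-8")


def expand_record(record_xml, fetch=fetch_url):
    """Return the record with table metadata pulled in from its pointer.

    Assumption (hypothetical): the harvested record carries a
    <tableMetadataURL> element, and that URL returns a <tableset>
    document whose <table> elements use the VOResource table model.
    """
    record = ET.fromstring(record_xml)
    pointer = record.find("tableMetadataURL")
    if pointer is None or not (pointer.text or "").strip():
        # No pointer: keep the record exactly as it was harvested.
        return record_xml
    tableset = ET.fromstring(fetch(pointer.text.strip()))
    for table in tableset.iter("table"):
        # Copy each table description into the internal copy of the record.
        record.append(table)
    return ET.tostring(record, encoding="unicode")
```

A registry that does not want to manage table metadata simply never calls expand_record() and stores the record with the pointer alone.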
>> (getRegistration() or getMetadata()) to include it in is the service
>
> I thought there was only one method to get metadata and it returned the
> VOResource record. I cannot see the need for more than one such.
See the latest VOSI doc from Guy. The motivation is to address the
fine-grained metadata issue: the former provides information intended for
the registry and the latter is a fatter record. So while I see the
reason, I don't think it will accomplish its goal.
>> Furthermore, with no guideline as to what information should go in
>
> I would certainly mandate that the full VOResource record be returned with
> all the optional bits of that made mandatory.
How does this help the provider? One of the reasons metadata are optional
is that they won't necessarily apply to all resources. And what is the
point of making them optional in one place when you force the user to
provide them anyway in another? This is not a recipe for quality metadata.
>> is all or nothing. Not only does the URL solution allow a registry to
>> choose what fine-grained information it collects, but also it does not
>> require that that information fit into the VOResource format.
>
> Why would the registry care about non-VOResource information? And what use
> is a registry which cannot supply the information a calling service
> requires?
As a discovery service. Some will argue that a client should be getting
information like table data directly from the service when it plans its
queries, but I don't want to prevent you from getting it all from your
registry.
The general idea is that over time, we can develop discovery services that
leverage more and more information about resources. Not all of this
information need be expressible in a VOResource schema.
> Do we now have to specify all the levels of metadata that a
> registry can and cannot supply?
While we may not agree at the moment on the best way to address the
fine-grained issue, I hope we can at least agree on what the problem is.
To put it in concrete terms, the AstroGrid registry effectively pressures
the NVO registries into supporting fine-grained table metadata needed to
support your query builder but which we feel should be handled in a
different way. This pressure comes in two forms. First, you encourage
your publishers to provide table metadata. We harvest these records, which
in turn go out to our users in query results. We then have to help
users make sense of this information. When there are problems with the
information, it reflects poorly on us, not you. Second, your application
in effect encourages our publishers to provide table metadata to our
registry if they are to be used in your application, because your
application only gets this information from the registry.
We need to find a way that allows a registry like AstroGrid to innovate
and provide new discovery and automated retrieval techniques that do not
force other registries to follow suit.
> Do we now have to specify all the levels of metadata that a
> registry can and cannot supply?
We do need to have a common understanding of what qualifies as
"fine-grained" information and develop mechanisms of exposing it only when
desired. I don't think we have this, yet, but I will offer my strawman at
the meeting.
>> metadata. A simple service (provided by a registry) can translate that
>> information into a standard format, so off the bat you get good
>
> How can the registry do that? None of the catalog services have *standard*
> ways of providing metadata: a registry will have to implement separate code
> for every potential service unless we specify new standards for these
> URL-based metadata retrieval methods.
SIA has a *standard* way of getting the table metadata: FORMAT=METADATA.
A simple service that takes only an SIA base URL as a GET input can apply
a stylesheet to return this information in a standard format. The others
have *standard* ways but they are all different. A converter for each one
provides a single way to get the table metadata from all of them.
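As a rough sketch of the converter idea: the code below builds the FORMAT=METADATA URL from an SIA base URL and reduces the returned VOTable to a plain list of columns. It swaps the stylesheet for ElementTree parsing to stay within the Python standard library. The function names, the output dictionary layout, and the CONVERTERS table are hypothetical, and some services may expect additional query parameters.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode


def metadata_url(base_url):
    """Append FORMAT=METADATA to an SIA base URL."""
    if base_url.endswith(("?", "&")):
        sep = ""
    elif "?" in base_url:
        sep = "&"
    else:
        sep = "?"
    return base_url + sep + urlencode({"FORMAT": "METADATA"})


def local_name(tag):
    """Strip any '{namespace}' prefix from an element tag."""
    return tag.rsplit("}", 1)[-1]


def sia_columns(votable_xml):
    """Turn the FIELD elements of a VOTable into a common column list."""
    root = ET.fromstring(votable_xml)
    columns = []
    for elem in root.iter():
        if local_name(elem.tag) == "FIELD":
            columns.append({
                "name": elem.get("name"),
                "datatype": elem.get("datatype"),
                "ucd": elem.get("ucd"),
            })
    return columns


# One converter per service type; only SIA is sketched here.
CONVERTERS = {"sia": sia_columns}


def table_metadata(service_type, response_xml):
    """Single entry point: dispatch to the converter for the service type."""
    return CONVERTERS[service_type](response_xml)
```

Adding another service type means writing one more function and registering it in CONVERTERS; callers keep using table_metadata() unchanged.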
> Bottom-line, Ray. I think what you are proposing is a radical change to the
> way the VO works. This turns the registry into a simple pointer to resources
> and puts the onus on VO applications to do all the searching for metadata,
I hope I have clarified that this is not what I am proposing.
cheers,
Ray