More on getCapabilities and the general issue of metadata queries

Doug Tody dtody at nrao.edu
Thu May 10 11:05:29 PDT 2007



---------- Forwarded message ----------
Date: Thu, 10 May 2007 12:01:21 -0600 (MDT)
From: Doug Tody <dtody at nrao.edu>
To: Tony Linde <Tony.Linde at leicester.ac.uk>
Cc: 'IVOA Registry WG' <registry at ivoa.net>
Subject: Re: Alternative proposal

Hi Tony -

This sounds promising.

I would go one step further and suggest that the registry has primary
responsibility for defining, curating, and using high level resource
metadata.  So for example if we want to relate two or more resources,
or augment the resource metadata for a service with information such
as the service validation level (which should not come from the
service itself), a status up/down marker, etc., this is best done at
the registry level.  For many types of resource the registry may be
the only way available to manage high level resource metadata, so
such capabilities will have to be there in any case.

While in some cases we might want to permit service developers to
remotely control or upload resource metadata via some programmatic
interface to the registry, I don't think this is the right way to
do this in general.  For service and data-related metadata, yes: this
could come only from the service.  But not for high level resource
metadata.

For finding resources, metadata describing service capabilities or
data characteristics such as table/column information may be useful
for selection.  The key thing I want to stress here is that selection
is only one way to use such information.  A client application talking
directly to a service may have other uses for such metadata, and
may require different or more detailed metadata than is required at
the registry level for selection.  Hence while we want to be able to
support registry-based selection using such information, or related
use-cases such as caching of metadata for efficient workflows, this
should not drive the specification of the content of this metadata,
or how it is obtained from the service.  At the service level we need
to take all these different use-cases into account in specifying the
metadata, and in deciding how best to make it accessible.

> Since a lot of the discussion has been about tabular data, I'd suggest that
> (4) the first selection spec be based on the needs of tabular data services 
> and be developed in conjunction with DAL.

This would be a good place to start.  Let's also not forget the need
to agree upon the scope of getCapabilities, as this is required now
to complete certain DAL interfaces.  We would like getCapabilities
to be useful both for direct introspection of a service by a client
application and as a way to get service metadata into the registry
to enable registry-based selection (science data-related metadata
will almost certainly require a separate interface).

 	- Doug


On Thu, 10 May 2007, Tony Linde wrote:

> (this affects DAL and GWS but I assume relevant people are here anyway)
> 
> Ray's proposal attempts to minimise the amount of fine-grained metadata in 
> the registry and needing to be harvested. My proposal is a little more 
> off-beam and not really thought all the way through (or fully checked with AG 
> colleagues <gulp>).
> 
> Basically, I reckon the registry is about finding resources. So the registry 
> needs to contain both the location of resources and the info necessary to 
> determine the right resources. In addition, it needs housekeeping info such 
> as the curator, publisher etc. Let's call these location, selection and 
> curation metadata.
> 
> I don't think there is any argument that we need the location metadata and 
> little that we need the curation metadata. The dispute is about the selection 
> metadata. I would propose therefore that
> (1) registries only contain and harvest location and curation metadata: what 
> I will call core metadata.
> 
> Obviously this would make registries of little use to scientists or 
> applications (ie, clients generally). So, I propose that
> (2) the Registry WG form a subgroup to get started on creating a Resource 
> Selection specification: this to cover:
> (2.1) selection store: types of information needed per resource type (details 
> worked out in conjunction with the group which writes the resource spec);
> (2.2) selection service: basic interfaces required and how/if information is 
> replicated.
> 
> In order to locate selection metadata in the first place,
> (3) the core metadata might include pointer(s) for retrieving the selection 
> metadata (if consistent with the resource type), the number and format of 
> those pointers to be determined in conjunction with the group which writes 
> the resource spec.
> 
> Since a lot of the discussion has been about tabular data, I'd suggest that
> (4) the first selection spec be based on the needs of tabular data services 
> and be developed in conjunction with DAL.
> 
> That's basically it for my proposal but I'd like to add some other thoughts 
> that I think come out of this:
> 
> (5) there will obviously be discussion about the distinction between 
> location, curation and selection metadata but if we keep those as the main 
> ideas, we can decide what works best for each type of resource.
> 
> (6) we ought not to assume that there is a one-to-one relationship between a 
> resource and a block of selection metadata. It could be that several 
> resources are described by the same block and several blocks refer to a 
> single resource. This has implications for the pointers mentioned above.
> 
> (7) the structure of the registry and its contents and how those are 
> specified (xml schemas and extensions) does not need to change: this approach 
> was worked out over a long time and we do not need to start that discussion 
> again. I'd suggest sticking with this _approach_ in determining the selection 
> store format.
> 
> (8) this approach will allow the development of a wider range of resource 
> finding services. A project which only wants to implement a registry can do 
> so with little effort. Ditto for those who offer a finding service (I can 
> think of at least one such service). But it is also possible that a service 
> offers *both* the registry interface and the selection interface and maybe 
> some enhanced combination of services. As with Ray, I've always wanted to 
> make sure projects are free to develop radically new approaches to VO-based 
> astronomy and I hope this idea helps with that.
> 
> And the most important point of all:
> 
> (9) we do not change the current way the registry works but put all our 
> energy into getting this proposal mapped out and implemented. If we push 
> hard, I see no reason why we cannot get all this complete by the autumn 
> meeting.
> 
> Okay - time for everyone to tear my ideas to shreds...
> 
> Cheers,
> Tony.
> 
>
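
To make the split in the quoted proposal concrete, here is a minimal
sketch of core (location plus curation) records that carry pointers to
separately held selection metadata, as in points (1), (3) and (6).
Everything in it is an assumption made for illustration: the class
names, field names, identifiers and URLs are hypothetical and are not
taken from any registry schema or from the proposal itself.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class CoreRecord:
    """Hypothetical core record: location and curation metadata only."""
    resource_id: str
    access_url: str   # location metadata
    curator: str      # curation metadata
    publisher: str    # curation metadata
    # Point (3): pointers used to retrieve selection metadata.
    selection_pointers: List[str] = field(default_factory=list)


@dataclass
class SelectionBlock:
    """Hypothetical block of selection metadata, held outside the core."""
    block_id: str
    describes: List[str]  # point (6): one block may describe several resources
    columns: List[str]    # e.g. column names of a tabular data service


def blocks_for(record: CoreRecord,
               store: Dict[str, SelectionBlock]) -> List[SelectionBlock]:
    """Follow a core record's pointers into the selection store."""
    return [store[p] for p in record.selection_pointers if p in store]


def resources_for(block: SelectionBlock,
                  registry: Dict[str, CoreRecord]) -> List[CoreRecord]:
    """Look up the core records of the resources a block describes."""
    return [registry[r] for r in block.describes if r in registry]


# Made-up example data: two resources, two selection blocks, many-to-many.
registry = {
    "ivo://example/cat-a": CoreRecord(
        "ivo://example/cat-a", "http://example.org/cat-a",
        "A. Curator", "Example Observatory", ["sel-1", "sel-2"]),
    "ivo://example/cat-b": CoreRecord(
        "ivo://example/cat-b", "http://example.org/cat-b",
        "A. Curator", "Example Observatory", ["sel-1"]),
}
store = {
    "sel-1": SelectionBlock(
        "sel-1", ["ivo://example/cat-a", "ivo://example/cat-b"],
        ["ra", "dec"]),
    "sel-2": SelectionBlock("sel-2", ["ivo://example/cat-a"], ["flux"]),
}
```

With this layout a registry could harvest only CoreRecord entries,
while a separate selection service resolves the pointers on demand.
Because the pointers are plain identifiers, nothing forces a one-to-one
mapping between a resource and a block of selection metadata.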


