Registry WG: Attention/Action

Tue Jun 17 09:07:37 PDT 2003

On Mon, 16 Jun 2003, Tony Linde wrote:

> Apart from 2-4 regulars, few people have been involved in recent
> discussions. We need you to get involved NOW.

OK, you can stop twisting now.  For what it's worth, my answers are these:

> 1. Should the RSM document be a Working Draft (WD), ie a document approved
> by this group as the basis for a future standard?

That seems a good way of making progress, but I think the term WD means we
are allowing ourselves to make fundamental changes later if it turns out
to be essential.

> 2. Should RSM (Resource and Service Metadata) be renamed to RM (Resource
> Metadata)?

As I tried to suggest earlier today, we appear to have agreed to set up
everything in future as a Web Service (or even a Grid Service), so the
Registry should be just a registry of services, many of which will give
access to resources. RSM should therefore stand for Registry of Service
Metadata.  Is that an acceptable compromise?

> 3. What structure should the next WD take: a flat one based on Ray's
> VOResource.xsd or a hierarchical one similar to the one's I've posted
> recently?

As Ray pointed out, his .xsd isn't flat, just flatter than Tony's.
To my mind Tony's proposed structure is a bit too deep, and I don't really
see the need for the CommunityClass or the perSpaceClass (maybe I wasn't
paying attention to the relevant earlier message).  Indeed I think we have
to keep names of actual people out of this as far as we can, otherwise
those of us in EU countries are going to fall foul of our various data
protection laws (but that's a side issue, let's ignore that for now).

> 4. Should the metadata for a resource be unambiguous and each item named for
> its purpose or should we have a basic set of metadata which is used to fit
> requirements of different types of resource?

The former, I think.

> 4a. Should the basic resource metadata be based on Dublin Core or should
> metadata items be named for their astro meanings (and transformed to DC form
> if needed for DC-tools harvesting)?

I think we should use Dublin Core as far as it can be done easily, but I
have been surprised at the number of items in RSM described as "not in
Dublin Core", so I think we will have to go our own way in many areas.

> 5. Should the group discuss the structure of resource metadata now and only
> issue a new WD when that discussion is more stable or should we issue a new
> version of RSM/RM and get people to build software based on that proposal
> and then discuss the structure?

It seemed to me that agreement on the structure wasn't too far off, so I'd
suggest having one additional attempt at getting agreement.

In Ray's recent message he said:
> Thus, I think it is important to leverage
> off of WSDL/Web/Grid service mechanisms as much as possible.

> It's the highest level information that is most difficult to fully
> automate.  The main reason is that this information, for most part does
> not currently exist in a structured form; in fact, much of it may only
> exist in people's heads.  It's not needed for local operation but is
> needed for interoperability.  Still, an approach featuring local
> creation and control is important to improve scaling.

> (Note: that WSDL does not have structures intended for capturing
> curation and capability information in a structured way--just
> interfaces.)

I take those points.  But my suggestion that we pay attention to the need
for automatic harvesting was intended to mean that we ought to invent
suitable Web Services to provide information at these higher levels, and
for providing curation and capability information.  What I had in mind
was that the Registry Robot is given a top-level URL, say something like
http://archive.stsci.edu/voservice
and that it works down the structure systematically, e.g.

1. Asks for all available services
2. For each service, asks for available service types (image-cutout
service, SIAP-service, or whatever)
3. For each of these it can ask for curational data
4. It can then delve deeper and ask for available catalogues or image
collections
5. For each catalogue it can ask for column details.

This implies that to be compatible with the VO, a site needs to set up
suitable WSDL and SOAP services at various levels. At the top these merely
return information about the middle-level services, mainly for the benefit
of the Registry Robot; at the bottom they actually do a job (such as image
cutout) that an astronomer might need. This means that the Registry Robot
can build a Registry with lots of detail if it delves deeply into each
site, or a simpler Registry if it stops only one or two levels down.
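
To make the five-step walk above concrete, here is a rough Python sketch
of how the Robot might descend one site and stop at a chosen depth.  It is
only an illustration: the SITE dictionary stands in for the real WSDL/SOAP
calls, and every name in it (harvest, max_depth, the "cutout" service, the
"survey-a" collection, its columns and publisher) is made up for the
example and not part of any agreed interface.

```python
# Hypothetical stand-in for one site's layered services.  In a real
# harvester each lookup below would be a Web Service call, not a dict read.
TOP_URL = "http://archive.stsci.edu/voservice"

SITE = {
    TOP_URL: {
        "services": {
            "cutout": {                                   # invented service
                "types": ["image-cutout"],
                "curation": {"publisher": "placeholder publisher"},
                "collections": {
                    "survey-a": {"columns": ["ra", "dec", "mag"]},
                },
            },
        },
    },
}


def harvest(site, top_url, max_depth=5):
    """Walk one site top-down, stopping after max_depth of the five levels.

    Levels: 1 services, 2 service types, 3 curation data,
    4 catalogues/image collections, 5 column details.
    """
    registry = {}
    if max_depth < 1:
        return registry
    for name, svc in site[top_url]["services"].items():   # level 1
        entry = {}
        if max_depth >= 2:                                # level 2
            entry["types"] = list(svc["types"])
        if max_depth >= 3:                                # level 3
            entry["curation"] = dict(svc["curation"])
        if max_depth >= 4:                                # level 4
            entry["collections"] = {}
            for coll, detail in svc["collections"].items():
                coll_entry = {}
                if max_depth >= 5:                        # level 5
                    coll_entry["columns"] = list(detail["columns"])
                entry["collections"][coll] = coll_entry
        registry[name] = entry
    return registry


if __name__ == "__main__":
    print(harvest(SITE, TOP_URL, max_depth=2))   # simple Registry
    print(harvest(SITE, TOP_URL, max_depth=5))   # detailed Registry
```

Calling harvest with max_depth=2 gives the simpler Registry (services and
their types only), while max_depth=5 goes all the way down to column
details.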

For this to work properly, of course, we need to define a number of
top-level Web Services.  I don't know how difficult it would be to do that.
And it may be hard to work out how the Registry can handle information
from current CGI-style interfaces, which may persist for some years.

Regards

-- 
Clive Page,
Dept of Physics & Astronomy,
University of Leicester,    Tel +44 116 252 3551
Leicester, LE1 7RH,  U.K.   Fax +44 116 252 3311