Rwp03: RSM changes

Ray Plante rplante at poplar.ncsa.uiuc.edu
Tue May 27 09:50:56 PDT 2003


Hi Tony,

> I think that's enough for people to work on over the weekend :)

It was a three-day weekend here in the States, so I didn't think about 
nuthin'!

> 2. In 2. Architecture, rather than an 'organization', I'd propose a
> 'community service'. Such a resource would point to a service which provides
> information about organizations, groups, funding bodies etc. 

I'm wondering about your use of the word 'service' "in its generic 
sense".  Do you see this generic sense as consistent with the definition 
given in the document?  Do we need to alter the definition?  

All resources are described by a ReferenceURL metadatum that points to a
human-readable document describing the resource.  For a service, this is
in addition to any URL used for accessing the interface.  For things like
organizations or data collections, these are the home pages for those
things.  Is this the sort of thing you had in mind?

I sense from your suggestion a desire to look at resources as services.  
This may be a reasonable perspective, but I don't think it is the intent of 
the architecture described in the RSM.  An Organization is a resource that 
can be described separately from any of the services it provides.  I 
prefer this simpler view of an organization.  Thus, in the architecture of 
the RSM, all services are resources, but not all resources are services.  
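
To make the relationship concrete, here is a minimal Python sketch.  The
class names, field names, identifiers, and URLs are illustrative
assumptions, not taken from the RSM or the VOResource schema.  It shows
only that every resource carries a ReferenceURL, while a service adds an
access URL on top of it.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    """Metadata shared by every registered resource (hypothetical fields)."""
    identifier: str
    title: str
    reference_url: str  # human-readable page describing the resource


@dataclass
class Organization(Resource):
    """A resource described independently of any service it provides."""


@dataclass
class Service(Resource):
    """A resource that also exposes an interface."""
    access_url: str = ""  # URL used to reach the interface itself


resources = [
    Organization("org-1", "Example Observatory", "http://example.org/"),
    Service("svc-1", "Example Image Service",
            "http://example.org/images.html",
            access_url="http://example.org/images/query"),
]

# All services are resources, but not all resources are services.
services = [r for r in resources if isinstance(r, Service)]
```

Filtering on `Service` returns only the second record, while both records
remain valid `Resource` instances.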

> I don't think
> the registry should be a 'list of everything', only of resources which
> provide a realizable service to the VO (using service in its generic sense)
> - ie, returns data, displays spectra, merges VOTables etc. Otherwise it'll
> become impossible to manage and cumbersome to search.

Do you mean impossible for the user or the registry maintainer? 

As you might recall from my presentation, the VOResource XML schema 
identifies several classes of resources--organization, project, data 
collection, and service; additional classes are expected to be defined.  
These classes, I believe, would in practice map well to the kinds of 
queries that users would put to the registry:  "Find me all data 
collections that ..." or "Find me all services that ...."  Thus, from 
the user perspective, the class (in addition to the Type metadatum) can be 
effective in filtering out irrelevant resources from a query.
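
A rough sketch of that kind of filtering, in Python, might look like the
following.  The record layout, the `find` helper, and the sample Type
values are assumptions made up for illustration; a real registry would
define its own query interface.

```python
# Hypothetical registry records: each carries its resource class plus
# a few top-level metadata values.
records = [
    {"resource_class": "DataCollection", "type": "Archive",
     "title": "Sky Survey Images"},
    {"resource_class": "Service", "type": "Archive",
     "title": "Survey Image Query"},
    {"resource_class": "Organization", "type": "Organisation",
     "title": "Example Observatory"},
]


def find(records, resource_class, **criteria):
    """Return records of one class whose metadata match every criterion."""
    return [
        r for r in records
        if r["resource_class"] == resource_class
        and all(r.get(key) == value for key, value in criteria.items())
    ]


# "Find me all services that ..."
archive_services = find(records, "Service", type="Archive")
# "Find me all data collections that ..."
collections = find(records, "DataCollection")
```

The class acts as a first cut, so organizations never appear in a search
for services unless the user asks for them.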

The class essentially defines what additional metadata (beyond the 
top-level resource metadata) are used to describe the resource.  (Again, 
see my presentation, 
http://www.ivoa.net/internal/IVOA/InterOpMay2003ResReg/VOResource.ppt).  
The fact that the metadata employed depend on the resource type does 
make it harder to store them in a relational database--you can't use a 
single flat table very well.  Nevertheless, I believe this heterogeneity is 
inescapable; just looking at the different services we have to 
support bears this out.  Each standard service will have its own set of 
metadata for describing its capabilities.  I think, though, the VOResource 
structure supports this variation well.

Getting back to the issue of listing "non-services", I would guess that 
direct searches for Organizations by users would be rare.  Still, I think 
allowing the registering of Organizations will play an important role in 
tracking provenance and integrity.  The PublisherID associated with all 
resources, including services, would allow one to find out more 
information about the data collection(s) serviced by a particular service.  

Another "non-service" I think would be good to register is a standard 
service interface definition (e.g. SIA).  This means that implementations 
can reference the definition in their service descriptions, and the ID can 
be used to recognize services of a particular type.  The registry record 
describing the definition would include pointers to the specification 
document (via the ReferenceURL), the generic WSDL document (if 
applicable), and the service-specific schema used to describe 
implementations.  This idea is actually alluded to in the definition of the 
ServiceStandardURI metadatum.  
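
As a hedged illustration, the Python sketch below registers a standard
definition alongside two services and uses the standard's ID to pick out
its implementations.  Apart from the ServiceStandardURI idea itself, every
name here (the "ServiceStandard" class label, the dictionary keys, the
identifiers, and the example.org URLs) is hypothetical.

```python
# Hypothetical registry keyed by resource ID.
registry = {
    "std-sia": {
        "resource_class": "ServiceStandard",   # assumed class name
        "title": "Simple Image Access",
        "reference_url": "http://example.org/sia/spec.html",
        "wsdl_url": None,                       # not applicable here
        "description_schema": "VOSIA",
    },
    "svc-a": {
        "resource_class": "Service",
        "title": "Archive A images",
        "service_standard_uri": "std-sia",
    },
    "svc-b": {
        "resource_class": "Service",
        "title": "Archive B lookup",
        "service_standard_uri": "std-other",    # standard not registered
    },
}


def services_implementing(registry, standard_id):
    """IDs of all services that reference the given standard definition."""
    return sorted(
        rid for rid, rec in registry.items()
        if rec.get("service_standard_uri") == standard_id
    )


def specification_for(registry, service_id):
    """ReferenceURL of the standard a service implements, if registered."""
    standard_id = registry[service_id]["service_standard_uri"]
    standard = registry.get(standard_id)
    return standard["reference_url"] if standard else None
```

Because the definition is itself a registry record, a client can go from
any implementation to the specification document without hard-coding its
location.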

> 5. The only other top level item, included in the Content Metadata should be
> Interface metadata (currently section 4.1). This should describe the mode of
> invocation: how to access and utilise the resource.

Again, I get the sense that you wish to view all resources as services.  
If Services are only a type of Resource, then this should not be part of 
the top-level metadata.  

> 6. Finally, each resource should provide a list of 'Supported Metadata
> Formats' (SMFs). This will be a list of namespaces, each of which refers
> (but doesn't point) to a metadata schema which that resource supports, ie
> the resource describes itself using those schema.
> 
> 7. The rest of the metadata held on a given resource fulfils its contract to
> support the list of metadata formats. So a data conversion would provide
> metadata about input format and output format; a data query resource would
> provide metadata about query format, data content and output format; etc.
> [Should only be a standardised list of metadata formats - if we allow anyone
> to generate their own it'd be difficult for user services to cope with them.
> Perhaps the Type field in the content metadata should indicate a set list of
> SMFs which are mandatory and the resource can choose to support others???]

This SMF idea is very interesting (it is one of the features of the OAI 
protocol); however, I think I need a clearer idea of your motivation.  Do 
you expect that people will want multiple ways of describing themselves 
(i.e. saying the same thing with different schemas), or are you trying to 
address the issue that different resource types will need different 
vocabularies to describe themselves?  

If it is the latter, this is, I think, handled transparently by the
VOResource schema framework using XML mechanisms.  For example, I envision
that a description of an SIA service will use the VOSIA schema, which
extends the VOResource schema.  This would be a standard schema in that it
is part of the SIA specification (that is registered!).
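
The effect of such an extension can be sketched in Python as a base set of
required metadata plus a schema-specific addition.  The field names below
are invented for illustration and are not the actual contents of the
VOResource or VOSIA schemas.

```python
# Top-level metadata every resource description must carry (assumed names).
BASE_FIELDS = {"identifier", "title", "reference_url"}

# Extra metadata each extension schema adds (assumed names).
EXTENSIONS = {
    "VOSIA": {"access_url", "image_formats"},
}


def required_fields(schema):
    """Base fields plus whatever the named extension schema adds."""
    return BASE_FIELDS | EXTENSIONS.get(schema, set())


def missing_fields(record, schema):
    """Sorted list of required fields absent from a description."""
    return sorted(required_fields(schema) - set(record))


description = {
    "identifier": "svc-a",
    "title": "Archive A images",
    "reference_url": "http://example.org/a.html",
    "access_url": "http://example.org/a/query",
}
```

Here `missing_fields(description, "VOSIA")` reports that `image_formats`
is absent, while the same record checked against the base schema alone is
complete.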

Perhaps you could give an example of the situation you want to address.  

> 8. The RSM mentions the idea of hierarchies of resources (eg MAST/HST). This
> needs discussion and resolution as to how it is handled. If the hierarchy is
> allowed, it will be possible to return from a registry query with both the
> MAST and HST resource within the answer. We need to either constrain the
> registry to only low-level resources or to indicate hierarchy in some way
> and set a rule (or user-settable switch) that the registry only returns the
> lowest level from any hierarchy.

The hierarchical idea is one that I feel needs strengthening in the 
VOResource metadata and probably, therefore, in the RSM.  

> If people agree with the above I think we need to start defining the SMFs. I
> suspect these might be hierarchical when it comes to resources which query
> data. The top level data content SMF would support generic information like
> coverage, while lower level SMFs would provide data-specific metadata, so a
> query resource would always support the top level SMF but will pick from
> lower level ones.

I feel like this is essentially the roadmap that WP03 is on.

cheers,
Ray
