summary of recent RI discussion

Ray Plante rplante at ncsa.uiuc.edu
Fri Apr 8 14:20:58 PDT 2005


Hi RWGers,

I thought I would try to summarize what I think is the state of the 
various discussion threads.  I wasn't sure until I started looking back at 
the discussion, but we actually have made some progress.  Here's my 
summary, segregated by topic.

1. ownedAuthority ==> registry of registries

We resurrected the discussion of an <ownedAuthority> tag as part of a 
mechanism for determining what registries to harvest from and aggregating 
the harvesting at the national/regional level.  The various solutions 
posed suffered from a certain amount of complexity.  Bob Hanisch suggested 
that the IVOA sponsor a "registry of registries": a small registry 
containing only Registry records.  It provides a central place for 
publishers to register their publishing registries and gives full 
registries a place to find out whom to harvest from.  

The idea of aggregating the harvesting was dropped given the simplicity of
this new mechanism: the number of registries to harvest from might be in
the low tens.  Thus, <ownedAuthority> is not needed.  This idea also
eliminates the need for a harvester interface, which currently includes a
harvest() function and possibly a getRegistries() function.

2. harvesting all vs. managed 

We all like the idea of being able to get all records from a registry or
just those that originated from that registry.  It was agreed to define an
OAI set called "ivo_managed" to harvest the latter subset.
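
As a rough illustration of what this means for a harvester, the sketch
below builds an OAI-PMH ListRecords request with and without the set.  
The verb, metadataPrefix and set arguments are standard OAI-PMH; the base 
URL and the "ivo_vor" metadata prefix are assumptions made for the 
example, not something settled in this thread.

```python
from urllib.parse import urlencode


def list_records_url(base_url, managed_only=True, metadata_prefix="ivo_vor"):
    """Build an OAI-PMH ListRecords URL for a registry.

    base_url and metadata_prefix are placeholders (assumptions);
    only the set name "ivo_managed" comes from the discussion.
    """
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if managed_only:
        # Restrict the harvest to records that originated in this registry.
        params["set"] = "ivo_managed"
    return base_url + "?" + urlencode(params)


# Hypothetical registry endpoint, used only for illustration.
BASE = "http://registry.example.org/oai"
managed_url = list_records_url(BASE)                    # managed records only
all_url = list_records_url(BASE, managed_only=False)    # everything it holds
```

Leaving out the set argument asks for every record the registry holds, 
which is the "harvest all" case.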

3. registry record curation/stamping

There was a call for moving metadata describing the registry record itself
out of VOResource and into a parallel block to clarify the distinction.  
Included in this block would be the verificationLevel and harvestFrom
tags, along with some indication of the resource's status/liveliness.  
Retrieval methods, then, would get back an XML record with a root element
that contained the VOResource block and the curation block.
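
To picture the proposed shape, here is a minimal sketch that wraps a 
resource description and a curation block under one root element.  The 
element names RegistryRecord, RecordCuration and status are hypothetical 
(no names were agreed), namespaces are omitted, and the Resource element 
is a simplified stand-in for a real VOResource record.

```python
import xml.etree.ElementTree as ET


def wrap_record(resource_xml, verification_level, harvest_from, status):
    """Return a root element holding the resource block and a curation block.

    All wrapper element names here are hypothetical placeholders.
    """
    root = ET.Element("RegistryRecord")
    # The resource description itself is carried through unchanged.
    root.append(ET.fromstring(resource_xml))
    # Metadata about the registry record lives in a parallel block.
    curation = ET.SubElement(root, "RecordCuration")
    ET.SubElement(curation, "verificationLevel").text = str(verification_level)
    ET.SubElement(curation, "harvestFrom").text = harvest_from
    ET.SubElement(curation, "status").text = status
    return root


# Simplified stand-in for a VOResource record (not schema-valid).
resource = "<Resource><identifier>ivo://example.org/res</identifier></Resource>"
record = wrap_record(resource, 2, "http://registry.example.org/oai", "active")
print(ET.tostring(record, encoding="unicode"))
```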

While there was generally warm support for this idea, there are the 
following misgivings:

 * There is a question about who is allowed to set/change the values in 
   the curation/stamping block.  In the NVO perspective (where the 
   verificationLevel idea was developed), curation/stamping information 
   could be edited by any registry according to its local standards.  
   That is, each registry can choose which records it thinks meet its 
   quality standards.  Others were concerned that a resource record would 
   no longer be identical regardless of which full registry it is 
   retrieved from. 
  
 * There is a concern about metadata creep.  Having a new metadata block 
   opens the door for defining more metadata than what originally 
   motivated the change (i.e. verificationLevel) and more than is 
   really vital.  

 * This is a fairly big change in the XML that registries have to support.  
   Is the pain worth it?  

As more minor issues, we would still need to agree on exactly what 
information to include and what the tag names should be.  

4. ADQL/XPath ==> "RQL"

Paul Harrison posed the notion of rethinking our registry query 
language.  His two main concerns (and correct me if I have this wrong) 
are:

 * the complexity of XML; he feels we would be better served by a more 
   human readable syntax.  

 * the fact that we are tying the spec--and only partially at that--to 
   these other standards, which can change and which keep us from 
   specializing to address our specific needs.  

One thing to come out of this discussion is a posting of requirements for 
the registry language.  There is no resolution at the moment, but perhaps 
there is a clearer understanding of the issues.   

----

Another outcome was a general agreement to change the created and
updated attributes in VOResource to support the full dateTime value type.
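
A small sketch of the practical difference: with a date-only value, two 
updates on the same day cannot be ordered, while full dateTime values 
can.  The timestamps below are made up for illustration, and the helper 
name parse_stamp is hypothetical.

```python
from datetime import datetime


def parse_stamp(value):
    """Parse a created/updated value given as a date or a full dateTime."""
    # fromisoformat accepts both "YYYY-MM-DD" and "YYYY-MM-DDTHH:MM:SS";
    # a date-only value comes back as midnight of that day.
    return datetime.fromisoformat(value)


# Date-only values: two edits on the same day look identical.
first_date_only = parse_stamp("2005-04-08")
second_date_only = parse_stamp("2005-04-08")

# Full dateTime values: the same two edits can be ordered.
first_full = parse_stamp("2005-04-08T14:20:58")
second_full = parse_stamp("2005-04-08T16:05:00")
```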

Through our flurry of email, we've more or less obliterated 3.5 of the 
discussion questions I posed on the twiki (that's pretty good), but we've 
raised at least two more: 
  1. How may resource records for the same resource differ from 
     registry to registry?  

  2. Should we proceed with our current query language plans (though 
     upgrade to the latest ADQL) or branch off from ADQL?  

Answering these will help point us to where we need to go on the remaining 
issues.  I'd like to suggest we re-pose these questions in digestible form 
to see if we can gain a consensus.

thanks everybody!
Ray



