new RI document

Ray Plante rplante at ncsa.uiuc.edu
Wed Jun 14 10:38:58 PDT 2006


Hi Noel,

If you don't mind, I'm going to chime in with my opinion :-)

On Wed, 14 Jun 2006, Noel Winstanley wrote:
> I've some questions with KeywordSearch (with my
> client-application-developer hat on).

(Thanks--we need that!)

> Q: In the specification of keyword search, does it define which  
> fields /sections of the registry records are to be searched?

Yes.  There are mandatory fields that must be searched; however, registries 
may also match against other fields.

> Q: is keyword search case sensitive? 

We should *recommend* case insensitive.  (Thanks.)
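
As a rough illustration of the two answers above, here is a minimal sketch
in Python of a matcher that always searches a mandatory set of fields, may
search extra ones, and ignores case.  The field names, the record layout
(a plain dict), and the function name are assumptions made up for this
example; they are not taken from the RI specification.

```python
# Hypothetical field names; the real mandatory list is whatever the spec defines.
MANDATORY_FIELDS = ("title", "description")


def matches(record, keyword, extra_fields=()):
    """Return True if keyword occurs in any searched field, ignoring case."""
    needle = keyword.casefold()
    # A registry always searches the mandatory fields and may add its own.
    for field in MANDATORY_FIELDS + tuple(extra_fields):
        value = record.get(field, "")
        if needle in value.casefold():
            return True
    return False


# Made-up record, for illustration only.
record = {
    "title": "Example Sky Survey",
    "description": "Optical imaging archive",
    "publisher": "Example Collaboration",
}

print(matches(record, "SURVEY"))
print(matches(record, "collaboration"))
print(matches(record, "collaboration", extra_fields=("publisher",)))
```

Two registries that differ only in extra_fields would agree on matches in
the mandatory fields but could return different result sets overall.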

> I think this might be desirable - or at least a recommendation on  
> which parts of the registry records should be searched. Otherwise, a  
> client may get different results for the same query when connecting  
> to different registries. 

> The reason I'm asking this is that, as an application developer, I'd  
> like uniform behaviour no matter what registry implementation my  
> application is connected to - otherwise it may well bamboozle users.

Our aim is not to provide complete clones around the world.  Rather, we 
want individual projects to specialize to the needs of their (regional) 
community as well as innovate with new ideas that may eventually 
propagate to other registries.  

So instead of expecting all registries to behave the same, clients
would gravitate toward particular registries because their behavior is 
better suited to their needs.  For example, the astrogrid workflow tools 
would prefer to use an astrogrid registry because it supports searching of 
certain detailed information not supported by other registries.  Some 
client apps may allow the user to choose which registry to go to, while 
some won't care at all.  

I personally feel that the VO grid will be too dynamic to attempt 
identical behavior world-wide, particularly in registries, which are about 
resource discovery (not analysis).  Resources will come and go.  It's more 
important to the science that individual resources--databases and image 
archives--provide a consistency that preserves scientific integrity.  

> Q: If you are going this route, maybe the order in which results are  
> returned should be defined too - or at least recommended. (e.g. I'd  
> expect matches in 'title' to occur before matches in some other field  
> - such as 'description').

This can be a complex question.  The "best" solution could be quite
difficult to implement for some registries.  I would rather
each registry deal with this in the way that is considered best for its
users (convolved with what is practical to implement).  This could change
easily over time as we better understand the problem and test solutions.

> If keyword search on the registry interface is so loosely specified  
> (as, IIRC, it is now) that the behaviour can't be predicted, then  
> clients may well  be better off constructing their own adql/xquery  
> expressions that implement 'keyword search' for their users, but  
> under their own, predictable, terms.

Yes, when exact and consistent results are important.  
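
For what it's worth, a client-side version with predictable terms might
look something like the sketch below.  It is written in Python over records
the client has already retrieved, rather than as an actual ADQL or XQuery
expression, and the field priorities (title before description, as in the
earlier question) and the function name are assumptions for illustration.

```python
# Assumed priority: a lower number sorts earlier in the results.
FIELD_RANK = {"title": 0, "description": 1}


def keyword_search(records, keyword):
    """Return matching records, title matches before description matches."""
    needle = keyword.casefold()
    ranked = []
    for position, record in enumerate(records):
        # Collect the rank of every field in which the keyword occurs.
        hits = [
            rank
            for field, rank in FIELD_RANK.items()
            if needle in record.get(field, "").casefold()
        ]
        if hits:
            # position preserves the input order among equally ranked records.
            ranked.append((min(hits), position, record))
    ranked.sort(key=lambda item: item[:2])
    return [record for _, _, record in ranked]
```

Because the matching and ordering rules live in the client, the results
depend only on which records a registry returns, not on how that registry
chooses to implement its own keyword search.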

> Actually, this sounds like the most sensible approach anyhow
 
> In which case, would the best route be to simplify the registry spec &  
> implementations by removing keyword search altogether? Or replacing  
> it with a 'full text literal match search'

Our current implementations, I think, have shown that the keyword search, 
even in its loosely-specified state, is useful.  

cheers,
Ray


