new RI document

Kevin Benson kmb at mssl.ucl.ac.uk
Wed Jun 14 13:50:59 PDT 2006


I will try to add a small note in the Keyword Search section, since this 
method can return different results from different registries, and mention 
using the ADQL search when consistent, exact results are important.

Noel also asks whether the Keyword Search matches whole words or word 
fragments.  What is the consensus on this?  (I had sort of expected word 
fragments.)  Either way, it would be good to state this in the RI.
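
To make the whole-word versus word-fragment distinction concrete, here is a 
minimal Python sketch of the two behaviours.  The function names and the 
sample title are made up for illustration; they are not taken from the RI or 
from any registry implementation.  Both variants assume case-insensitive 
matching.

```python
import re

def matches_fragment(keyword, text):
    """Fragment match: the keyword may appear anywhere, even inside a word."""
    return keyword.lower() in text.lower()

def matches_whole_word(keyword, text):
    """Whole-word match: the keyword must be delimited by word boundaries."""
    pattern = r"\b" + re.escape(keyword) + r"\b"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None

# Hypothetical resource title, used only to show the difference.
title = "Infrared Quasar Survey"

fragment_hit = matches_fragment("quas", title)      # "quas" is inside "Quasar"
whole_word_hit = matches_whole_word("quas", title)  # "quas" is not a full word
full_word_hit = matches_whole_word("quasar", title)
```

With these definitions a search for "quas" finds the record under fragment 
matching but not under whole-word matching, which is exactly the kind of 
difference a client would see between two registries that chose differently.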

**I have to do some travelling tomorrow, but will try and put up a 0.8.4 
RI tomorrow afternoon with many/all of the recent comments.

cheers,
Kevin

On Wed, 14 Jun 2006, Ray Plante wrote:

> Hi Noel,
>
> If you don't mind, I'm going to chime in with my opinion :-)
>
> On Wed, 14 Jun 2006, Noel Winstanley wrote:
>> I've some questions with KeywordSearch (with my client-application-
>> developer hat on).
>
> (Thanks--we need that!)
>
>> Q: In the specification of keyword search, does it define which
>> fields /sections of the registry records are to be searched?
>
> Yes.  There are mandatory fields that must be searched; however, registries
> may also match against other fields.
>
>> Q: is keyword search case sensitive?
>
> We should *recommend* case insensitive.  (Thanks.)
>
>> I think this might be desirable - or at least a recommendation on
>> which parts of the registry records should be searched. Otherwise, a
>> client may get different results for the same query when connecting
>> to different registries.
>
>> The reason I'm asking this is that, as an application developer, I'd
>> like uniform behaviour no matter what registry implementation my
>> application is connected to - otherwise it may well bamboozle users.
>
> Our aim is not to provide complete clones around the world.  Rather, we
> want individual projects to specialize to the needs of their (regional)
> community as well as innovate with new ideas that may eventually
> propagate to other registries.
>
> So instead of expecting all registries to behave the same, clients
> would gravitate toward particular registries because their behavior is
> better suited to their needs.  For example, the astrogrid workflow tools
> would prefer to use an astrogrid registry because it supports searching of
> certain detailed information not supported by other registries.  Some
> client apps may allow the user to choose which registry to go to, while
> some won't care at all.
>
> I personally feel that the VO grid will be too dynamic to attempt
> identical behavior world-wide, particularly in registries, which are about
> resource discovery (not analysis).  Resources will come and go.  It's more
> important to the science that individual resources--databases and image
> archives--provide a consistency that preserves scientific integrity.
>
>> Q: If you are going this route, maybe the order in which results are
>> returned should be defined too - or at least recommended. (e.g. I'd
>> expect matches in 'title' to occur before matches in some other field
>> - such as 'description').
>
> This can be a complex question.  The "best" solution could be quite
> complex to implement for some registry implementations.  I would rather
> each registry deal with this in the way that is considered best for its
> users (convolved with what is practical to implement).  This could change
> easily over time as we better understand the problem and test solutions.
>
>> If keyword search on the registry interface is so loosely specified
>> (as, IIRC, it is now) that the behaviour can't be predicted, then
>> clients may well be better off constructing their own adql/xquery
>> expressions that implement 'keyword search' for their users, but
>> under their own, predictable, terms.
>
> When exact and consistent results are important.
>
>> Actually, this sounds like the most sensible approach anyhow
>
>> In which case, would the best route be to simplify the registry spec &
>> implementations by removing keyword search altogether? Or replacing
>> it with a 'full text literal match search'?
>
> Our current implementations, I think, have shown that the keyword search,
> even in its loosely-specified state, is useful.
>
> cheers,
> Ray
>
>


