Rethink the Constraint-based search Query from Registry interface

Patricio F. Ortiz pfo at star.le.ac.uk
Thu Apr 7 15:56:37 PDT 2005


Hi Ray,

This looks like a very comprehensive description of what we could wish
for in a registry-oriented language. I like the idea of thinking not
about which technology is behind it (XMLDB, SQLDB or whatever else), but
about what we need in order to access the data.

Just a couple of additions to both the string type and the numeric type;
I've added the lines in the appropriate places to make them clear.

On Thu, 7 Apr 2005, Ray Plante wrote:
> Hey Paul,
> 
> On Thu, 7 Apr 2005, Paul Harrison wrote:
> > OK - I will come straight out with it - I think that "2.1 
> > Constraint-based Search Query", the ADQL/XPath based registry query 
> > interface is an ugly compromise that suits nobody.
> 
> Okay, you're a brave soul.  
> 
> Before we get too deep into what's wrong with ADQL and XPath, let's step 
> back and look at requirements and constraints which presumably led to this 
> choice.  
> 
> Here's what I've extracted from Paul's last paragraph
> 
> PH.1.  We should be able to form complex queries described by:
>          o  constraints on specific attributes of the resource record
>          o  boolean expressions for combining constraints
> PH.2.  It should be human readable
> PH.3.  It should be simple (as simple as possible) but with the same 
>          semantics as is currently outlined in the RI spec.
> 
> Here are some other requirements I think we need:
> 
> RP.1.  It should be straight-forward to support using commonly used 
>        database technologies 
> RP.1.1.  It should be straight-forward to support with both relational 
>          and XML databases
> RP.1.2.  It should be straight-forward to convert to local query languages
>          including XQuery and local variations of SQL.
> RP.1.3.  It should be easy to parse in multiple, commonly-used languages
> 
> RP.2.  It should be able to support the VOResource (+extensions) data 
>        model.
> RP.2.1.  The query language should not include a definition of the data 
>          model (i.e. the keywords that are used to form constraints).  
> RP.2.2.  The query language specification should not need updating if the 
>          data model is changed or updated.  
> RP.2.3.  The query language should not require the use of specific 
>          attribute names internal to the registry.  (i.e. allow the use of 
>          RDB and XDB).
> RP.2.4.  There should be a clear connection between attribute names that 
>          go into the input query and the values that are returned in the 
>          result (which is XML using VOResource).  
> 
> RP.3.  Constraints should support comparison operators appropriate for the 
>        type of data.  
> RP.3.1.  For string values, comparison operators should include at a 
>          minimum:
>            o  equals
>            o  contains
>            o  starts with
>            o  ends with
             o  does not contain

> RP.3.2.  Case-independent comparisons must be possible for string values.
> RP.3.3.  For numeric types, comparison operators should include at a 
>          minimum:
>            o  equals
>            o  less than
>            o  less than or equal
>            o  greater than
>            o  greater than or equal
             o  not equal

Something like "has_value" and "has_no_value" could also be useful for
both string and numeric values in the future, especially if one wants to
run verification tests on registry entries, e.g. "give me all entries
which satisfy has_no_value(Quality)".
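
To make the operator lists above concrete, here is a minimal sketch of
how they might be tabulated, including the two additions and the
has_value/has_no_value tests. Everything in it is an assumption made for
illustration: the operator keywords, the function names and the use of a
flat Python dict as a stand-in for a resource record are not taken from
the RI spec or from ADQL.

```python
import operator

# Hypothetical operator keywords; none of these names come from the RI spec.
STRING_OPS = {
    "equals":           lambda value, arg: value == arg,
    "contains":         lambda value, arg: arg in value,
    "starts_with":      lambda value, arg: value.startswith(arg),
    "ends_with":        lambda value, arg: value.endswith(arg),
    "does_not_contain": lambda value, arg: arg not in value,
}

NUMERIC_OPS = {
    "equals":                operator.eq,
    "less_than":             operator.lt,
    "less_than_or_equal":    operator.le,
    "greater_than":          operator.gt,
    "greater_than_or_equal": operator.ge,
    "not_equal":             operator.ne,
}


def has_value(record, key):
    """True if the attribute is present and neither None nor empty."""
    value = record.get(key)
    return value is not None and value != ""


def has_no_value(record, key):
    return not has_value(record, key)


def match_string(record, key, op, arg, ignore_case=False):
    """Apply a string operator; a missing value never matches (assumption)."""
    if not has_value(record, key):
        return False
    value = record[key]
    if ignore_case:
        value, arg = value.casefold(), arg.casefold()
    return STRING_OPS[op](value, arg)


def match_number(record, key, op, arg):
    """Apply a numeric operator; a missing value never matches (assumption)."""
    if not has_value(record, key):
        return False
    return NUMERIC_OPS[op](record[key], arg)


# Verification-style query: entries with no Quality value.
records = [
    {"Title": "Survey A", "Quality": 3},
    {"Title": "Survey B", "Quality": None},
]
missing_quality = [r["Title"] for r in records if has_no_value(r, "Quality")]
```

Whether a missing value should satisfy "does not contain" or "not equal"
is a design choice; the sketch treats a missing value as matching
nothing, so that has_no_value is the only way to select such entries.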


> RP.4.    Users should be able to form constraints based on coverage 
>          easily.  (Ex: return resources that cover this region of sky.)
> 
> (Some may complain about weasel words like "should", "easy", and 
> "straight-forward"; while it is true these are difficult to test, we can 
> evaluate at some level different choices based on which are easier.  If we 
> were doing this formally, we would recast these in more concrete terms.)
> 
> Now, just to highlight how we got to section 2.1 as it is now.  The 
> advantages of ADQL:
>   o  the XML format means that it is broadly parseable in many languages
>        with existing tools.  
>   o  it has been demonstrated to be convertible to both XQuery and SQL 
>        (with technologies like XSLT).
>   o  through its SQL roots, it provides all the capabilities in terms of 
>        operators and support for different value types.
>   o  it is intended to support region-based queries (using STC); we can 
>        leverage both the STC model and emerging software to support it.  

This part is going to be critical if I want to retrieve resources which
cover certain areas of the sky. Perhaps parallel services can take care
of this, whether or not we include this feature as part of the language.
What I have in mind is querying for coverage around several regions, not
just one.
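
As a rough illustration of a multi-region coverage query, the sketch
below tests one resource against several query regions. It assumes that
both the resource coverage and the query regions are simple circles
given as (RA, Dec, radius) in degrees, and that "covers" means
"overlaps"; real coverage descriptions (e.g. in STC) are richer than
this, and all function names here are hypothetical.

```python
import math


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees, using the haversine formula."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    a = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    # Clamp to 1.0 to guard against rounding just above the asin domain.
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(a))))


def circles_overlap(circle1, circle2):
    """Two circles overlap if their centres are closer than the radii sum."""
    ra1, dec1, r1 = circle1
    ra2, dec2, r2 = circle2
    return angular_separation_deg(ra1, dec1, ra2, dec2) <= r1 + r2


def overlaps_any(resource_circle, query_circles):
    """True if the resource coverage overlaps at least one query region."""
    return any(circles_overlap(resource_circle, q) for q in query_circles)


def overlaps_all(resource_circle, query_circles):
    """True if the resource coverage overlaps every query region."""
    return all(circles_overlap(resource_circle, q) for q in query_circles)


# One resource tested against two widely separated query regions.
resource = (10.0, 20.0, 1.0)
regions = [(200.0, -30.0, 2.0), (10.5, 20.0, 0.5)]
hit_any = overlaps_any(resource, regions)
hit_all = overlaps_all(resource, regions)
```

Whether a multi-region query should mean "any of these regions" or "all
of these regions" would have to be stated by the query language; the
sketch simply offers both.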

[rest untouched]

Cheers,

Patricio



>   o  it provides a potential point of interoperability with other services 
>        that use ADQL.
>   o  there is ADQL/s for human viewing.  
>   
> The use of restricted XPath was motivated by:
>   o  standard attribute names do not need to be defined specially for the 
>      query language; they come directly from the XML entities of documents 
>      being searched.  (Thus, there's a direct connection between what you 
>      ask for and what you get back.)
>   o  restricted XPaths are simply long keyword names; this means that a 
>      simple lookup can be used to map them to internal attribute names.  
>      No internal parsing is needed.  (Thus, they translate easily into 
>      both SQL and XQuery queries.)
>   o  they support "non-standard" VOResource extensions equally well as 
>      "standard" ones.
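
The "simple lookup" point above can be sketched roughly as follows: a
restricted XPath is treated as an opaque key, mapped to an internal
column name through a table, and emitted as a parameterised SQL
condition. The XPath strings, column names and operator keywords below
are all made up for illustration; they are not the VOResource schema or
any particular registry's database layout.

```python
# Hypothetical lookup table: restricted XPath -> internal SQL column.
XPATH_TO_COLUMN = {
    "title":              "resource.title",
    "content/subject":    "resource.subject",
    "curation/publisher": "resource.publisher",
}

# Hypothetical operator keywords mapped onto SQL comparison operators.
SQL_OPS = {
    "equals":       "=",
    "not_equal":    "<>",
    "less_than":    "<",
    "greater_than": ">",
}


def to_sql_constraint(xpath, op, value):
    """Translate one constraint into (SQL fragment, parameter list).

    The XPath is never parsed: it is only used as a dictionary key, so
    an unknown path raises KeyError instead of reaching the database.
    """
    column = XPATH_TO_COLUMN[xpath]
    if op == "contains":
        # Wildcards already present in `value` are not escaped here.
        return f"{column} LIKE ?", [f"%{value}%"]
    return f"{column} {SQL_OPS[op]} ?", [value]


def combine_and(constraints):
    """AND together several (xpath, op, value) constraints."""
    fragments, params = [], []
    for xpath, op, value in constraints:
        fragment, fragment_params = to_sql_constraint(xpath, op, value)
        fragments.append(fragment)
        params.extend(fragment_params)
    return " AND ".join(fragments), params


where, params = combine_and([
    ("content/subject", "contains", "quasar"),
    ("curation/publisher", "equals", "Leicester"),
])
```

The same table could in principle point at XQuery paths instead of
column names, which is what lets one set of keywords serve both kinds of
database.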
> 
> Now there may be other choices that satisfy the above requirements.  If
> anyone wishes to propose one, be sure that we address them and not rehash
> the same discussions that led us to ADQL.
> 
> cheers,
> Ray

---
Patricio F. Ortiz			pfo at star.le.ac.uk
Department of Physics & Astronomy	Phone: +44 (0)116 252 2015
University of Leicester			
Leicester, LE1 7RH, UK


