REGION

Iñaki Ortiz de Landaluce Inaki.Ortiz at sciops.esa.int
Wed May 9 02:23:40 PDT 2007


Hi All

I tend to agree with Alex that REGION should be defined as a function 
returning a boolean rather than by adding new comparison operators. I feel 
that the region definition and the column description should be tied 
together as they influence each other. The algorithm resulting from the 
ADQL translation depends on the region type, the coordinate system and 
the operator/verb. Therefore I think it's better to have everything 
encapsulated in one single entity.

Jeff sent some comments on this issue a few months ago. He was 
basically proposing either building a function-like region predicate or 
using the derived table approach. In the last draft of the specification 
I sketched a tentative region predicate definition consistent with 
Jeff's first approach, which I personally think is fine (see appendices 
A and B). As you can see it is not complete at all: it is missing the 
STC parts (and their usage, which is not trivial at all!). The relevant 
syntax rules were as follows:

<predicate> ::= <comparison_predicate> | <between_predicate>
    | <in_predicate> | <like_predicate> | <null_predicate>
    | <exists_predicate> | <region_predicate>

<region_predicate> ::= <table_reference> <period> REGION
    <left_paren>
    <region_specification>
    [ <comma> <column_reference> [ { <comma> <column_reference> }... ] ]
    <right_paren>

<region_specification> ::= !! Region definition from STC schema here

I believe all this fits with Alex's suggestion. We would just need to 
qualify the REGION keyword with a verb from a list of possible values. 
This list could be extended but I would propose, for the time being, to 
include INTERSECTS/OVERLAPS (and CONTAINS?) only. This could be 
described in BNF as follows:

<region_predicate> ::=
    REGION <period> <region_verbs>
    <left_paren>
    <region_specification>
    [ <comma> <column_reference> [ { <comma> <column_reference> }... ] ]
    <right_paren>

<region_verbs> ::= INTERSECTS | CONTAINS ...
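
To make the shape of this rule concrete, here is a rough sketch of a recognizer for it in Python. It is only an illustration under several assumptions: the region specification is taken to be a single-quoted string (the actual <region_specification> rule is still undefined), the verb list is limited to the two named above, and the function name parse_region_predicate is made up for this example.

```python
import re

# Assumed verb list: only the two verbs named in the rule above.
REGION_VERBS = ("INTERSECTS", "CONTAINS")

# Assumption: <region_specification> is a single-quoted string, followed by
# zero or more comma-separated column references such as c.ra or dec.
_PREDICATE = re.compile(
    r"^\s*REGION\.(?P<verb>[A-Z]+)\s*\(\s*'(?P<spec>[^']*)'\s*"
    r"(?P<cols>(?:,\s*[A-Za-z_][\w.]*\s*)*)\)\s*$"
)


def parse_region_predicate(text):
    """Split a REGION.<verb>(...) predicate into (verb, spec, columns)."""
    match = _PREDICATE.match(text)
    if match is None:
        raise ValueError("not a region predicate: %r" % text)
    verb = match.group("verb")
    if verb not in REGION_VERBS:
        raise ValueError("unknown region verb: %s" % verb)
    columns = [c.strip() for c in match.group("cols").split(",") if c.strip()]
    return verb, match.group("spec"), columns
```

A translator could then dispatch on the returned verb and on the shape named in the specification string, which is the point of keeping everything in one entity.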

Therefore, the missing bit would just be the <region_specification> 
syntax rule definition, for which we should ask for input from the STC 
experts, at least for circular and rectangular shapes. In any case, 
Alex's example of the circular region, which I guess comes from the old 
ADQL, looks good to me.
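
As an illustration of what a service might have to evaluate for the circular case, here is a minimal Python sketch. It assumes the string layout 'CIRCLE <frame> <ra> <dec> <radius>' with all values in degrees, assumes the point is given in the same frame as the circle, and ignores frame conversion entirely. The function names are hypothetical and none of this is taken from STC.

```python
import math


def parse_circle(spec):
    """Parse 'CIRCLE <frame> <ra> <dec> <radius>' (degrees, assumed layout)."""
    shape, frame, ra, dec, radius = spec.split()
    if shape != "CIRCLE":
        raise ValueError("only CIRCLE is handled in this sketch")
    return frame, float(ra), float(dec), float(radius)


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees, using the haversine formula."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (math.sin((dec2 - dec1) / 2.0) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2.0) ** 2)
    return math.degrees(2.0 * math.asin(min(1.0, math.sqrt(h))))


def region_contains(spec, ra, dec):
    """True if the point (ra, dec) lies inside the circle described by spec."""
    _frame, ra0, dec0, radius = parse_circle(spec)
    return angular_separation_deg(ra0, dec0, ra, dec) <= radius
```

For example, region_contains('CIRCLE ICRS 185 32.5 0.5', 185.0, 32.9) is true because the point is 0.4 degrees from the centre, while the same call with dec 33.1 is false.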

Inaki


Alex Szalay wrote:
> I have another concern about how we set out to do REGION. In the WHERE
> clause of SQL every item is a Boolean expression which evaluates to true
> or false. These have the form of value OPERATOR (value|set), the latter 
> for the case of IN. In any case, the left hand side of a where clause is
> always a scalar value evaluated from the next tuple. In our current 
> REGION syntax we have a list of columns on the left hand side, not a scalar,
> quite different from anything else in the SQL WHERE.
>
> After being negative, here is a suggestion, that came up today in a
> discussion with Benjamin. He did not quite like it, but I think it is worth
> mentioning it. This may have come up before I was involved in the VOQL, and
> discarded, but let me try it anyway. These are of course rough examples.
>
> Consider a boolean built-in REGION function, with possibly a variable 
> number of arguments. We can then write a spatial query like
>
> 	SELECT objid, ra,dec FROM Catalog c
> 	WHERE REGION.CONTAINS('CIRCLE ICRS 185 32.5 0.5','ICRS',c.ra, c.dec)
> or	
> 	SELECT objid, ra,dec FROM Catalog c
> 	WHERE REGION.CONTAINS('CIRCLE ICRS 185 32.5 0.5','ICRS',c.x,c.y,c.z)
> or 
> 	SELECT objid, gl,gb FROM Catalog c
> 	WHERE REGION.CONTAINS('CIRCLE ICRS 185 32.5 0.5','GII',c.gl,c.gb)
>
> One can also do the "get all regions which contain the point" query. Assume
> that we have a table of regions. Then
>
> 	SELECT regionid FROM regionTable r
> 	WHERE REGION.CONTAINS(r.regionDef,'ICRS',185,32.5)
>
> We can also have an intersect function, which can say if two regions
> intersect, like
>
> 	SELECT regionid FROM regionTable
> 	WHERE REGION.INTERSECTS(regionDef,'CIRCLE ICRS 185 32.5 0.5')
>
> These are all automatically valid SQL statements, provided that the 
> mapping of the actual parameters on to the precise definition of
> the function is correct. One can even think of implementing this
> directly... and extending it later is trivial (REGION.DISJOINT, etc).
>
> The . notation is also arising in the INFORMATION_SCHEMA.* cloud, so it
> is not unlike existing stuff. By the way, in SDSS we have 30,000 regions
> stored, and I believe that Gretchen has over 100K in her HST exposures
> database. So I think we will store regions in the database. 
>
> Let me know what you think. Sorry for the two page size.
>
> --Alex
>
>
> -----Original Message-----
> From: owner-voql-teg at eso.org [mailto:owner-voql-teg at eso.org] On Behalf Of
> Patrick Dowler
> Sent: Monday, May 07, 2007 2:06 PM
> To: 'VOQL-TEG'
> Subject: Re: REGION
>
>
> Many small comments interspersed...
>
> On Saturday 05 May 2007 00:40, Alex Szalay wrote:
>   
>> This is a good start, but I think that we need a much clearer focus.
>> Also, after reading this I still feel confused about what a REGION
>> datatype is. I will try to keep my comments short.
>>
>> In a typical spatial framework there are several different spatial
>> datatypes (POINTSET, LINESET, POLYGON). These datatypes are typically
>> not simple; even the description of a point can be quite complex (see
>> STC), not to mention a complex region. Of course these can be
>> serialized into a string. But I would not want to put the coordinates
>> into "ra dec" strings.
>
> REGION is the base class of all these things you mention. If that is not
> the correct word (from STC or otherwise) we can pick a different one, but
> the idea is that it is the base class, which makes it the most general and
> we can do quite a lot now and in future without adding more reserved words
> (eg without changing ADQL).
>
> Yes, they are complex (in STC) but I don't see any alternative to that. I
> have not looked at STC closely and considered it more or less pre-ordained
> that we should use whatever VO-standard coordinate system spec is
> available. I don't want to put a huge chunk of XML inside the
> REGION("...")...
>
>   
>> Of course here Pat and Benjamin also want to
>> extend this to even more abstract concepts like time and energy intervals,
>> that none of the GIS systems do, although for intervals I think the
>> BETWEEN clause (or several for a more complex interval set) might just do
>> the job.
>>
>> One can then define various RELATIONS and various OPERATIONS between them.
>> The relations can be (CONTAINS, TOUCHES, DISJOINT, INTERSECT,...)
>> understood as an enumerated return value from an operation between two
>> different spatial objects.
>>     
>
> Yes, all these operations are possible with geometry. However, I think 
> OVERLAPS comparison (aka INTERSECT) covers 95% of the uses without blocking 
> future enhancements (eg if a future version of ADQL introduces CONTAINS it 
> would keep OVERLAPS and the meaning would not change). I am happy enough to 
> re-use the INTERSECT reserved word if others are happy with that. 
>
>   
>> One can also have OPERATIONS among spatial objects, these are
>> (INTERSECTION, UNION, DIFFERENCE) which form a Boolean algebra, 
>> with some restrictions. These return another spatial object.
>>     
>
> IMO, too fancy for ADQL 1.0
>
>
>   
>> If we only restrict ourselves to POINTSETS (our catalogs) and POLYGONS
>> (say =REGIONS) there are still many different things we might want to do.
>> These are all questions that the SDSS users have been asking of the
>> database as part of their research
>>
>> (1) Give me all the POINTS within a REGION from a certain set of tables
>>     
>
> SELECT * FROM <table>
> WHERE <position column> OVERLAPS REGION("...")
>
> I don't see any problem treating points as regions (with 0 area). Some 
> geometry libraries do this, others do not.
>
>   
>> (2) Give me all the POINTS which are within 10 arcsec of a REGION
>> (errors)
>
> This is just a different region, the client software can (should) implement 
> this feature.
>
>   
>> (3) Tell me if this POINT is within this REGION
>>     
>
> I don't see how this differs from #1. Is it just that both args to OVERLAPS 
> are literals? That should be OK.
>
>   
>> (4) Which REGIONS in the database contain this POINT (is it in the photo
>> 	footprint but not in the spectro, for example)
>>     
>
> SELECT * FROM <table>
> WHERE <bounds column> OVERLAPS REGION("123 45 ICRS")
>
> (for example, I don't know if that is the correct string version of a
> point).
>
>   
>> (5) What is the distance of this point to the boundary
>>     
>
> IMO, too fancy for ADQL 1.0
>
>   
>> (6) What percent of this point's 30" neighborhood is inside the survey
>> footprint
>>     
>
> IMO, too fancy for ADQL 1.0
>
>   
>> One can also think of storing REGION (POLYGON) data in the database, and
>> perform operations on those plus the incoming user defined regions. This
>> is a very complex task and to do this efficiently, one typically needs a
>> binary representation inside the DB, i.e. an object oriented or an object
>> relational DB. I do not want to go there, since my one page is up.
>>     
>
> I agree technically but conceptually it is quite straightforward to define
> the basic comparison operator (overlaps) that is applicable to the base
> type (region). That is what we have done and it covers ~95% of the ADQL use
> cases and is "future proof" in the sense that further advanced features do
> not make it incorrect or out of place (at least, that is a concern and I
> think the proposal satisfies it).
>
> Also, despite there being a column of type REGION, I don't envision people 
> actually storing regions in the DB... it is intended to be a logical 
> declaration only.
>
>   

-------------- next part --------------
A non-text attachment was scrubbed...
Name: ADQL-20070419.pdf
Type: application/pdf
Size: 505852 bytes
Desc: not available
URL: <http://www.ivoa.net/pipermail/voql-teg/attachments/20070509/7a822705/attachment-0001.pdf>

