REGION

Patrick Dowler patrick.dowler at nrc-cnrc.gc.ca
Mon May 7 11:06:25 PDT 2007


Many small comments interspersed...

On Saturday 05 May 2007 00:40, Alex Szalay wrote:
> This is a good start, but I think that we need a much clearer focus.
> Also, after reading this I still feel confused what a REGION datatype is.
> I will try to keep my comments short.
>
> In a typical spatial framework there are several different spatial
> datatypes (POINTSET, LINESET, POLYGON). These datatypes are typically not
> simple, even the description of a point can be quite complex (see STC), not
> to mention a complex region. Of course these can be serialized into a
> string. But I would not want to put the coordinates into "ra dec" strings.

REGION is the base class of all these things you mention. If that is not the 
correct word (from STC or otherwise) we can pick a different one, but the 
idea is that it is the base class, which makes it the most general and we can 
do quite a lot now and in future without adding more reserved words (eg 
without changing ADQL).
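
To make the "base class" idea concrete, here is a minimal sketch in Python.
It is only an illustration under stated assumptions: the names Region,
Interval and overlaps() are hypothetical (none of them come from STC or
ADQL), and a 1-D interval stands in for the simplest possible subtype, such
as a time or energy range.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


class Region(ABC):
    """Hypothetical base type: the only operation it promises is overlaps()."""

    @abstractmethod
    def overlaps(self, other: "Region") -> bool:
        """Return True if the two regions share at least one value."""


@dataclass(frozen=True)
class Interval(Region):
    """Closed 1-D interval [lo, hi]; a single value is the case lo == hi."""

    lo: float
    hi: float

    def overlaps(self, other: Region) -> bool:
        # Assumption: this sketch only knows how to compare two intervals.
        if not isinstance(other, Interval):
            raise TypeError("this sketch only compares Interval with Interval")
        return self.lo <= other.hi and other.lo <= self.hi
```

New subtypes can be added later without touching the base type, which is
the sense in which no new reserved words would be needed.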

Yes, they are complex (in STC) but I don't see any alternative to that. I have 
not looked at STC closely and considered it more or less pre-ordained that we 
should use whatever VO-standard coordinate system spec is available. I don't 
want to put a huge chunk of XML inside the REGION("...")... 

> Of course here Pat and Benjamin also want to
> extend this to even more abstract concepts like time and energy intervals,
> that none of the GIS systems do, although for intervals I think the BETWEEN
> clause (or several for a more complex interval set) might just do the job.
>
> One can then define various RELATIONS and various OPERATIONS between them.
> The relations can be (CONTAINS, TOUCHES, DISJOINT, INTERSECT,...)
> understood as an enumerated return value from an operation between two
> different spatial objects.

Yes, all these operations are possible with geometry. However, I think an 
OVERLAPS comparison (aka INTERSECT) covers ~95% of the uses without blocking 
future enhancements (eg if a future version of ADQL introduces CONTAINS it 
would keep OVERLAPS and the meaning would not change). I am happy enough to 
re-use the INTERSECT reserved word if others are happy with that. 

> One can also have OPERATIONS among spatial objects, these are
> (INTERSECTION, UNION, DIFFERENCE) which form a Boolean algebra, 
> with some restrictions. These return another spatial object.

IMO, too fancy for ADQL 1.0


> If we only restrict ourselves to POINTSETS (our catalogs) and POLYGONS (say
> =REGIONS) there are still many different things we might want to do. These
> are all questions that the SDSS users have been asking of the database as
> part of their research
>
> (1) Give me all the POINTS within a REGION from a certain set of tables

SELECT * FROM <table>
WHERE <position column> OVERLAPS REGION("...")

I don't see any problem treating points as regions (with 0 area). Some 
geometry libraries do this, others do not.
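
A minimal sketch of what "a point is a region with 0 area" can mean, written
in Python. Everything here is an assumption made for illustration: regions
are restricted to circles on the sky, coordinates are in degrees, and the
names Circle, point(), separation_deg() and overlaps() are hypothetical. It
says nothing about how a real ADQL service would implement OVERLAPS.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Circle:
    ra: float      # right ascension of the centre, degrees
    dec: float     # declination of the centre, degrees
    radius: float  # angular radius, degrees (0 for a point)


def point(ra: float, dec: float) -> Circle:
    """A point is modelled as a circle of zero radius, i.e. zero area."""
    return Circle(ra, dec, 0.0)


def separation_deg(a: Circle, b: Circle) -> float:
    """Great-circle distance between the two centres (haversine formula)."""
    ra1, dec1, ra2, dec2 = map(math.radians, (a.ra, a.dec, b.ra, b.dec))
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h))))


def overlaps(a: Circle, b: Circle) -> bool:
    """Two circles overlap when their centres are no further apart than
    the sum of their radii; the same test works for zero-radius points."""
    return separation_deg(a, b) <= a.radius + b.radius
```

With this model the point-in-region test and the region-region test are the
same comparison, for example overlaps(point(123, 45), Circle(123, 45.5, 1)).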

> (2) Give me all the POINTS which are within 10 arcsec of a REGION (errors)

This is just a different region; the client software can (should) implement 
this feature.
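
As a sketch of how a client might do this, assume the simplest case where
the region is a circle with its radius in degrees; the client then just
enlarges the radius before building the query. The names Circle and grow()
are hypothetical, and growing a general polygon is considerably harder than
this.

```python
from dataclasses import dataclass, replace

ARCSEC_PER_DEG = 3600.0


@dataclass(frozen=True)
class Circle:
    ra: float      # centre right ascension, degrees
    dec: float     # centre declination, degrees
    radius: float  # angular radius, degrees


def grow(region: Circle, margin_arcsec: float) -> Circle:
    """Return a new circle whose radius is larger by the given margin.

    Points within margin_arcsec of the original circle are exactly the
    points inside the grown circle, so the server only ever sees a plain
    region.
    """
    return replace(region,
                   radius=region.radius + margin_arcsec / ARCSEC_PER_DEG)
```

The grown circle is then passed to the service like any other region, so
the query language itself needs no notion of a margin.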

> (3) Tell me if this POINT is within this REGION

I don't see how this differs from #1. Is it just that both args to OVERLAPS 
are literals? That should be OK.

> (4) Which REGIONS in the database contain this POINT (is it in the photo
> 	footprint but not in the spectro, for example)

SELECT * FROM <table>
WHERE <bounds column> OVERLAPS REGION("123 45 ICRS")

(for example, I don't know if that is the correct string version of a point).

> (5) What is the distance of this point to the boundary

IMO, too fancy for ADQL 1.0

> (6) What percent of this point's 30" neighborhood is inside the survey
> footprint

IMO, too fancy for ADQL 1.0

> One can also think of storing REGION (POLYGON) data in the database, and
> perform operations on those plus the incoming user defined regions. This is
> a very complex task and to do this efficiently, one typically needs a binary
> representation inside the DB, i.e. an object oriented or an object
> relational DB. I do not want to go there, since my one page is up.

I agree technically, but conceptually it is quite straightforward to define the 
basic comparison operator (overlaps) that is applicable to the base type 
(region). That is what we have done and it covers ~95% of the ADQL use cases 
and is "future proof" in the sense that further advanced features do not make 
it incorrect or out of place (at least, that is a concern and I think the 
proposal satisfies it).

Also, despite there being a column of type REGION, I don't envision people 
actually storing regions in the DB... it is intended to be a logical 
declaration only.

-- 

Patrick Dowler
Tel/Tél: (250) 363-6914                  | fax/télécopieur: (250) 363-0045
Canadian Astronomy Data Centre   | Centre canadien de donnees astronomiques
National Research Council Canada | Conseil national de recherches Canada
Government of Canada                  | Gouvernement du Canada
5071 West Saanich Road               | 5071, chemin West Saanich
Victoria, BC                                  | Victoria (C.-B.)



More information about the voql-teg mailing list