ADQL evolution: OFFSET?

gerard.lemson at gmail.com gerard.lemson at gmail.com
Mon Jan 11 15:43:46 CET 2016


Hi Markus et dal

> -----Original Message-----
> From: dal-bounces at ivoa.net [mailto:dal-bounces at ivoa.net] On Behalf Of 
> Markus Demleitner
> Sent: Monday, January 11, 2016 5:32 AM
> To: dal at ivoa.net
> Subject: ADQL evolution: OFFSET?
> 
> Dear DAL list,
> 
> One of the users of DaCHS' ADQL engine requested the addition of 
> OFFSET, and indeed I think there's at least three good reasons to do
> so:
> >...
> This would entail giving a new rule
> 
>   <offset_spec> ::= "OFFSET" <unsigned_integer>
> 
> and changing the query_specification production to
> 
>   <query_specification> ::=
>     "SELECT" [ <set_quantifier> ] [ <set_limit> ]
>       <select_list> <table_expression> [ <offset_spec> ]
> 
> -- which would correspond to the postgres syntax of offset.
> 
> Since it's not part of SQL 92, a bit of prose would be required in the 
> body of the spec, which I'd be happy to contribute if people don't protest.
> 
> But: How does everyone else feel about this?  In particular: Is this 
> an issue for non-postgres databases (and I'm mainly thinking of MySQL 
> and Oracle here; SQLlite has the feature and even syntax as proposed here)?
> 
Thanks for suggesting this, as it inspired me to check up on sqlserver where I found it has had OFFSET since version 2012.
Syntax is (see https://technet.microsoft.com/en-us/library/gg699618(v=sql.110).aspx):

[ORDER BY { order_by_expression [ ASC | DESC ] } [ ,...n][<offset_fetch>] ]

<offset_fetch> ::= {OFFSET { integer_constant | offset_row_count_expression } { ROW | ROWS }    [FETCH { FIRST | NEXT } {integer_constant | fetch_row_count_expression } { ROW | ROWS } ONLY]}

...

Compared to your proposal there is the ROW(S) clause (which an ADQL parser can easily add) and the FETCH (which allows pagination) which I suppose has LIMIT in postgres(?).

Most important difference is the requirement that it be part of an ORDER BY clause, which I think makes sense as without that you cannot in general rely on consistent ordering between consecutive requests. Would that be OK with you, all? Does it fit with other DB implementations? 
It seems postgres does not mandate that, but I suppose it is ok with it. Indeed http://www.postgresql.org/docs/8.0/static/queries-limit.html says " When using LIMIT, it is important to use an ORDER BY clause that constrains the result rows into a unique order. Otherwise you will get an unpredictable subset of the query's rows. You may be asking for the tenth through twentieth rows, but tenth through twentieth in what ordering? The ordering is unknown, unless you specified ORDER BY." So I guess ADQL mandating it would be good and not block SQLServer from implementing it.

C.c.: 
For SQLServer-based implementations of ADQL (which unarguably would be a good think to have, see SDSS, STScI, Millennium, others?) the main stumbling block is I believe (still?) support for the spatial extensions. Maybe this is a good forum as any to try to get some support via mailing list iso ad hoc questions during interop meeting to get a group of interested people together to work on this. If more are interested we can post a more formal email to the list starting a discussion.

Cheers

Gerard







> Cheers,
> 
>           Markus




More information about the dal mailing list