ADQL evolution: OFFSET?
Theresa Dower
dower at stsci.edu
Mon Jan 11 20:01:42 CET 2016
All,
Hello! As a note, while newer versions of MSSQL Server handle OFFSET (and also require the order by, as a possible performance issue), I have found no evidence of support for LIMIT, only very ugly kludges that will require ADQL query rewriting. Since the kludges do exist, and are less bad than the issues we've had with NATURAL JOINs, I'm not going to argue too hard against this, but I did want to make the issue known.
--Theresa
-----Original Message-----
From: dal-bounces at ivoa.net [mailto:dal-bounces at ivoa.net] On Behalf Of François-Xavier Pineau
Sent: Monday, January 11, 2016 10:46 AM
To: dal at ivoa.net
Subject: Re: ADQL evolution: OFFSET?
Dear Markus, dear all,
Here after our (Laurent Michel and I) feelings about this.
On 01/11/2016 11:31 AM, Markus Demleitner wrote:
> Dear DAL list,
>
> One of the users of DaCHS' ADQL engine requested the addition of
> OFFSET, and indeed I think there's at least three good reasons to do
> so:
>
> (1) It aids harvesting, which in particular with obscore is IMHO a
> very nice thing to have
We are not DBMS experts but we are slightly concerned by the possible consequences of the usage of OFFSET to harvest large tables, e.g a catalogue like 2MASS.
First point (not specific to large tables) underlined by Gerard: to obtain consistent results, one have to use OFFSET together with ORDER BY (at least in PostgresSQL), see PSQL doc here:
http://www.postgresql.org/docs/9.5/static/queries-limit.html.
Then (specific to large tables), like it is also mentioned in the PSQL doc:
"The rows skipped by an OFFSET clause still have to be computed inside the server; therefore a large OFFSET might be inefficient."
So using LIMIT, ORDER BY and OFFSET may not be the best way to harvest a large table (huge server load), except (maybe) if ORDER BY is made on an indexed column, without filtering rows (no 'WHERE'), and if the table is clustered according to the indexed column.
> (2) It enables pagination UIs bolted on top of an ADQL service
Except for large results (let's say > 100 000 rows for a web browser), the pagination can be made on the client side.
And above 100 000 rows, you will not look at all values but probably want to order then according to a given column or put additional or more restrictive constraints.
> (3) The OFFSET 0 hack that helps in several database systems to set a
> confused query planner right.
May OFFSET 0 confuse the query planner?
If not, I would vote for systematically add it on the server side (the user do not have to know about it, it is internal cuisine).
Having said this does not means that we are against its implementation in ADQL.
Cheers,
François-Xavier Pineau (CDS) and Laurent Michel (OAS)
>
> Given that, and given I've missed OFFSET even back in 2010 or so (when
> I wrote the example
> http://dc.g-vo.org/tap/examples#Crossmatchforaguidestar), I'm a bit
> surprised OFFSET hasn't been part of my TAP implementation notes.
>
> I don't remember why I dropped it -- perhaps because, equally
> surprisingly, OFFSET is not part of SQL92. Nevertheless, I'd like to
> propose the addition of (at first optional, mandatory starting 3.0)
> support for OFFSET in ADQL 2.1.
>
> This would entail giving a new rule
>
> <offset_spec> ::= "OFFSET" <unsigned_integer>
>
> and changing the query_specification production to
>
> <query_specification> ::=
> "SELECT" [ <set_quantifier> ] [ <set_limit> ]
> <select_list> <table_expression> [ <offset_spec> ]
>
> -- which would correspond to the postgres syntax of offset.
>
> Since it's not part of SQL 92, a bit of prose would be required in the
> body of the spec, which I'd be happy to contribute if people don't
> protest.
>
> But: How does everyone else feel about this? In particular: Is this
> an issue for non-postgres databases (and I'm mainly thinking of MySQL
> and Oracle here; SQLlite has the feature and even syntax as proposed
> here)?
>
> Cheers,
>
> Markus
--
François-Xavier Pineau
CDS, Observatoire Astronomique de Strasbourg 11, rue de l'Université F - 67000 Strasbourg
More information about the dal
mailing list