ADQL - adding LIMIT or FETCH

Grégory Mantelet gmantele at ari.uni-heidelberg.de
Mon Apr 11 17:44:20 CEST 2016


Dear Dave and DAL,

     I do not have any particular problem against OFFSET (though I think 
it is probably not the best solution for pagination over a long result 
if used for this purpose).

     However, I think that allowing both TOP and LIMIT (or FETCH) in an 
ADQL query is a bad idea. It just makes the syntax more disturbing for 
learners and particularly unclear when both are written in the same 
query. Even if, as you propose, there is a rule (whatever it is) for 
this last case, instinctively we will still have the doubt. Of course, 
usually nobody would write both in the same query, but that does not 
prevent ADQL learners/users to be confused when seeing in one query 
example TOP and in another one LIMIT (or FETCH).

     I also agree that removing TOP is not a good option either, or at 
least for a minor release of ADQL. But even for later version, in my 
opinion, TOP should still stay in ADQL for consistency and compatibility 
with former version. Although I quite agree it is a shame having it 
because most of the time it is interpreted as the "best" or first rows 
returned in a particular order ; LIMIT or FETCH would have probably been 
better, but I consider it too late to change the existing syntax.

     In brief, personally I am for keeping TOP alone and adding OFFSET. So:

         SELECT [TOP <m>]
         ...
         [OFFSET <n>]


     A last point: I have noticed that some of the syntaxes that you 
have mentioned impose to have TOP/LIMIT/FETCH in order to use OFFSET, 
other do not. Both make sense for me. But as a PostgreSQL user, I would 
choose the syntax with the most freedom: it should be possible to use 
OFFSET without a TOP.
     This choice might be however a little to tricky to solve for DBMS 
like MySQL which impose the usage of LIMIT in order to use OFFSET. This 
would not be a problem with Postgres which allows a special value for 
LIMIT: ALL. However it is not the case with MySQL. So I would propose as 
solution to set the LIMIT to either MAXREC or the maximum output limit 
of the TAP service...but it works only if the service has a such limit, 
which is mostly generally the case. Anyway, even knowing that, I think 
the syntax allowing the most freedom should be the chosen one: the 
possibility to use OFFSET without TOP. But I am not terribly against the 
strict usage of OFFSET only with TOP.

Grégory


On 04/11/2016 03:56 AM, Dave Morris wrote:
> Dear DAL,
>
> The consensus from the OFFSET discussion was that we should add 
> [OFFSET <n>] to the ADQL grammar.
>
> Should we also consider adding LIMIT or FETCH as well ?
>
> The reason for suggesting this is that it would bring ADQL in line 
> with the syntax used by the common database platforms and the majority 
> of the text books and reference manuals.
>
> I've looked at a number of different database platforms to see how 
> they handle row limits.
>
> Almost all of them use one of two forms, either LIMIT/OFFSET or 
> OFFSET/FETCH, both at the end of the query.
>
> Our current proposal, adding OFFSET to the existing ADQL grammar, 
> results in a composite form with TOP at the start of the query and 
> OFFSET at the end.
>
>     SELECT
>     [TOP <m>]
>     ....
>     [OFFSET <n>]
>
> Technically this works, but it does not match any of the text books or 
> reference manuals, making it just that little bit harder to learn.
>
> The suggestion is we adopt one or other of the common forms, either 
> LIMIT/OFFSET, or OFFSET/FETCH.
>
> Either one could help make learning ADQL that little bit easier for 
> our users.
>
> I am not proposing we remove TOP.
>
> TOP would stay part of the grammar, and the new term would be a synonym.
>
> If a query has both TOP and a limit, we could just use the smallest of 
> the two values and ignore the other, or we could flag it as an error 
> (SQLServer flags it as an error, I'm not sure what Sybase does).
>
> ----
>
> Of the platforms that I have looked at so far, five of them use a 
> variation of LIMIT and OFFSET :
>
>     PostgreSQL
>
>         [LIMIT <m>] [OFFSET <n>]
>
>     MySQL/MariaDB
>
>         [LIMIT <m> [OFFSET <n>]]
>
>     Sybase
>
>         [TOP <m>]
>         ....
>         [LIMIT <m> [OFFSET <n>]]
>
>     SQLite
>
>         [LIMIT <m> [OFFSET <n>]]
>
>     HyperSQL
>
>         [LIMIT <m>] [OFFSET <n>]
>
> Two of them follow the ISO SQL standard, and use a variation of OFFSET 
> and FETCH :
>
>     ISO SQL-2011 standard
>
>         [OFFSET <n> ROWS] [FETCH NEXT <m> ROWS]
>
>     SQLServer
>
>         TOP <m>
>         ....
>         [ORDER BY <s> [OFFSET <n> ROWS [FETCH NEXT <m> ROWS]]]
>
>     Derby
>
>         [OFFSET <n> ROWS] [FETCH FIRST <m> ROWS ONLY]
>
> Firebird has two forms, FIRST/SKIP and ROWS/TO, neither of which I 
> have seen anywhere else.
>
>         FIRST <m> SKIP <n>
>         ....
>         ROWS <n> TO <m+n>
>
> As far as I know, Oracle does not have terms for row limits and 
> offsets in the SQL language. However there are a number of 
> alternatives that give equivalent results using row number.
>
> ----
>
> The suggestion is we keep TOP at the start of a query, and add one or 
> other of the two forms at the end of the query.
>
> Either
>
>     SELECT
>     [TOP <m>]
>     ....
>     [LIMIT <m>] [OFFSET <n>]
>
> because it is the most common, or
>
>     SELECT
>     [TOP <m>]
>     ....
>     [OFFSET <n>] [FETCH <m>]
>
> because it is the SQL standard.
>
> Just enough so that it matches some of the reference manuals our users 
> are likely to find.
>
> https://www.google.co.uk/search?q=SQL+offset
>
> ----
>
> What do you think ?
>
> Is it worth adding a bit more complexity to our parsers to make the 
> language easier to learn, or will having two ways of expressing the 
> same thing cause more problems than it is worth ?
>
> Thanks,
> Dave
>
>
> --------
> Dave Morris
> Software Developer
> Wide Field Astronomy Unit
> Institute for Astronomy
> University of Edinburgh
> --------
>
>



More information about the dal mailing list