ADQL grammar validation

Tue Apr 18 20:08:02 CEST 2017

Hi Dave,

     As you, following the discussions at the ASTERICS meeting in 
Strasbourg, I have already worked quite intensively of this topic for a 
week. I was hoping to have some time this week to report all my 
explorations and ideas in a more clean way, but I will try to sumarize 
my thoughts now anyway. After a brief reading of your observations on 
lyonetia 
<https://github.com/ivoa/lyonetia/wiki/BNF-grammar-and-validation>, I 
agree with most of your ideas: my conclusion was also to start from 
scratch a new ADQL grammar based on what all current ADQL parsers in the 
VO are already doing and what is defined in the ADQL document. Below I 
give some of my conclusions and also a suggestion of grammar notation to 
use (for those who do not like suspens, I jump quickly to this 
suggestion: PEG).

/FYI, since this email is quite long, here is a preview of its content://

//    A. Few comments about the current ADQL grammar//
//    B. "Toy grammar//"
//    C. Brief comparison of these notations//
//    D. Testing PEG with the "toy grammar"//
/

----------------------------------------------------------------------------
  A. Few comments about the current ADQL grammar
----------------------------------------------------------------------------

About the current ADQL grammar (the attached file "adql-v0.bnf"), I have 
listed the following errors/comments:

* Do not define a rule with just ONE terminal element (e.g. `<ampersand>`,
    `<asterisk>`) ; a rule defines a syntax not a token/terminal

* Lack of required and optional space characters (or comment) between
    tokens and literals. Without this specification,
    "SELECTmycolumnFROMmytable" would be entirely possible though
    we perfectly know it is not. On the other hand, writing "2  * mycolumn"
    or "2*mycolumn" is completely correct.

* Missing a syntax to represent literal values (e.g. "SELECT", letters and
   digits, operators) ; because of that, there is a possible ambiguity 
between
   BNF syntax symbols and grammar literals

* l.165 - `<boolean_primary>` : two lines inverted ; a rule can not 
start with
   "|" ; line 166 and 167 must be inverted

* l.257 - `<default_function_prefix>` : empty definition ; the comment says
   the default prefix should be "udf_" but this rule set nothing ; 
should it be
   an empty string (i.e. "") or "udf_"?

* l.420 - `<newline>` : empty definition ; the literal should probably 
be "\n"

* l.436 - `<nondoublequote_character>` - empty definition : should it be a
   regular expression such as `[^"]`?

* l.439 - `<nonquote_character>` : empty definition ; should it be a regular
   expression such as `[^']`?

* l.642 - `<unqualified_schema name>` : a space character in the rule name ;
   should it be allowed?

* l.592 - `<space>` : empty definition ; should it be " " or should it also
   include "\t", "\n" and "\r"? (here comes the literal ambiguity mentioned
   above)

* Ambiguity with the `AS` keyword ; because it is optional, depending of
   the parser the following element may always be interpreted as an
   identifier for the alias, even though it is `FROM` or `(*)` (after a
   `COUNT` for example)

---------------------------
  B. "Toy grammar"
---------------------------

     During my explorations on the different language notations we could 
use, I have defined a very reduced subset of ADQL/SQL from scratch.
         - just SELECT, FROM (without JOINs), WHERE (limited to simple 
comparisons only) and ORDER BY are defined
         - mathematical expressions are just limited to numbers (integer 
and double) with no parenthesis and no functions
         - no geometrical operations
         - an ADQL query MUST end with a semi-colon /(see below why)/

      With this subset, it is not intended to match what we want exactly 
in the ADQL grammar (few syntax elements could be written in a nicer 
way) ; it is just a grammar for testing the same ADQL/SQL syntax with 
different notations in order to see the advantages and drawbacks of each 
of them.

     I have attached this same grammar in the following notations:
         * BNF (as originally defined): "adql_min.bnf"
         * EBNF: "adql_min.ebnf"
         * ABNF: "adql_min.abnf"
         * "IVOA-BNF" (the BNF derived from SQL-92 and used in the ADQL 
1.0 document): "adql_min.ivoa_bnf"
         * PEG: "adql_min.peg".

     I wrote them so that they can be compared side by side, with for 
instance a visual diff. tool or by putting them manually side by side on 
your screen. That way, you can see their difference.

--------------------------------------------------------
  C. Brief comparison of these notations
--------------------------------------------------------

     My preference goes for PEG and alternatively ABNF. Here is why.

/(warning: maybe boring technical comments incoming... :-) )/

     - According to me, the original BNF notation 
<https://web.archive.org/web/20060925132043/https://www.lrz-muenchen.de/%7Ebernhard/Algol-BNF.html> 
suffers too much of ambiguity: no delimitation for literals (e.g. how to 
know that | is part of the BNF syntax or the '|' character, how to 
specify space characters) and the lack of grouping and repetition 
syntaxes (which can be done differently ; see "adql_min.bnf" for 
examples). Both points are fixed in EBNF and ABNF and makes the grammar 
reading more confortable and compact.

     - The original SQL92 BNF syntax is actually a mix of EBNF, ABNF and 
the addition of the syntax "..." (which seems to be used for 
repetition...but which one: 0..n or 1..n? there is no clarification 
about that). It seems this syntax is defined nowhere (and trust me, I 
have searched a lot for this). I think it is definitely not a good idea 
to keep this very custom notation.

/Note: By the way, here is where the original SQL-92 seems to come from: 
//https://github.com/ronsavage/SQL//; there is also a lex-yacc + perl 
script to parse and validate //this special BNF (I did not succeed to 
make it work on our ADQL BNF, but I admit that I have not insisted too 
much, so feel free to play with it if you want and if you are more 
familiar with perl and yacc than me...which should not be difficult)./

     - On the contrary to the original BNF and this "SQL-92 BNF" we are 
using, there is a standard/normative specification for EBNF 
<https://www.cl.cam.ac.uk/%7Emgk25/iso-14977.pdf> and ABNF 
<http://tools.ietf.org/html/rfc5234>. I have a slight preference for 
ABNF because:
         1. the token separator is a space (instead of a comma in EBNF).
         2. it is possible to use character ranges like '%x41-5A' (i.e. 
'a-z' in hexadecimal).
Then, I have to admit that prefixing a group with * in ABNF instead of 
suffixing it with * in EBNF for a repetition is a bit disturbing, but at 
least, with ABNF, you can specify a range limit (e.g. '2*5(...)' to 
repeat the group at least 2 and at most 5 times) ; then, it depends of 
whether or not we need that.

     - Finally, as I said above, my strong preference goes to PEG: 
Parsing Expression Grammar (reference paper: 
http://bford.info/pub/lang/peg). It is a quite new notation which is not 
so far from ABNF. The evaluation of such expression is performed 
differently from a BNF or any other similar notation (lex+yacc, javacc, 
...). It has been designed to be a non-ambiguous notation. For more 
details, I invite you to look at the very short abstract and then the 
paper if you want to investigate more into details.

To come back quickly on the required and optional spaces issue, I also 
completely agree with you, Dave: it was actually the first issue I had 
when I tried to validate our ADQL BNF. In PEG, I solved this issue by 
declaring two rules:
         - '_' for optional spaces, like between a numeric expression 
and a mathematical operator (e.g. '2*3')
         - '__' for required spaces, like between 'SELECT' and its items
I think these rule names (pretty common in PEG) does not spoil the 
reading of the grammar while imposing or not the presence of a space 
characters (or comment).

On the contrary to the various BNF described above, PEG allows the usage 
of regular expression (but I would recommend to use them with 
moderation...they should not replace grammar rules).

As with common parser-parser grammars, it is possible to use 
LOOKAHEAD-like expressions (i.e. "try this set of rules and only if it 
fails/succeeds goes on"). This is not possible with BNFs. This makes 
look like a hack, but unfortunately, sometimes, when the grammar becomes 
complicated and full of possibilities, there is no other choice....even 
in my ADQL/SQL subset (for instance, see the rule "regular_identifier", 
l.54 in "adql_min.peg").

     Anyway, the main reason why I prefer PEG to the BNF notations, is 
because it is much more easy to find a parser in a lot of different 
languages (C/C++/C#, Java, Python, Javascript, ... ; example of parsers: 
packrat <http://bford.info/packrat/>, mouse 
<http://www.romanredz.se/Mouse/index.htm>, waxeye <https://waxeye.org/>, 
...).

---------------------------------------------------------
  D. Testing PEG with the "toy grammar"
---------------------------------------------------------

     The first parser I discovered was PEGjs (Javascript): 
https://pegjs.org/online. Nice thing about it: you can edit the grammar 
online and directly evaluate your expression. I invite you to try it 
with the attached grammar "adql_min.pegjs" and the attached test queries 
"test-queries". Copy-paste the content of the first one in the grammar 
part (left) and the second one in the input part (top right).

/Note: for the purpose of this exploration, I require to end ADQL 
queries with a semi-colon so that I can chain and test several queries 
in a row.

/The grammar "adql_min.peg" and "adql_min.pegjs" declare exactly the 
same syntactic rules. The only differences between them are just for 
PEGjs to know how to build the parser. Here are these differences:

     1. replacement of the affectation symbol "<-" by "="...this is 
required by PEGjs
/-> that can be easily done by a small script (e.g. the unix command 'sed')/

     2. addition of group labels and javascript snippets. These are just 
to build the final parser so that it does what we want. In this case, I 
wanted to list SELECT, FROM, WHERE and ORDER BY content /(which you can 
see in the bottom right part of the PEGjs online validator)/. Even 
without these labels and snippets, PEGjs is still able to check your 
input queries but the output would be just quite useless.
/-> //this step of the validation could be done in another way 
automatically but a bit more time and development would be 
required...not sure I will have time before the Interop. in Shanghai./

Nevertheless, a small drawback to know about PEG is that left recursion 
is not allowed. Small re-writing of the ADQL grammar (e.g. numerical and 
boolean operations) would be needed. But as far as I know, few parsers 
are allowing left recursion so most of the existing ADQL parsers may 
have already rewritten such special syntaxes, but please, tell me if I 
am wrong here.
/(for those not familiar with language parsing, just compare side by 
side "adql_min.bnf" and "adql_min.peg", l.62)
/

------------------------------------------------------

Finally, I would say that with PEG and especially with PEGjs, we could 
probably design quickly a parser for our ADQL grammar and from that we 
could also create an ADQL parser. Ideally, if that concept is confirmed 
and nice error messages are possible (it is still the tricky part of 
such parsers), I even think to eventually use a PEG parser for my ADQL 
Library so that sparing me the time to maintain my JavaCC grammar 
(written from scratch from the ADQL BNF) while having the guarantee that 
the parser is following the official ADQL grammar. But of course, with 
PEGjs, it would more particularly possible to now have easily a 
Javascript ADQL parser in VO web interfaces for TAP.

Sorry for this very loooong email (congratulations if you reached the 
end ;-)) and I hope all these inputs will be however helpful.

Cheers,
Grégory

On 04/18/2017 04:51 PM, Dave Morris wrote:
> Hi DAL,
>
> Progress on developing a method for validating the BNF grammar for 
> ADQL has been stalled for a while.
>
> Following on from discussions at the ASTERICS meeting in Strasbourg, I 
> have been reading up on language grammars and BNF. Based on what I 
> learned I've put together a proposal for how to go about creating a 
> machine readable language grammar for ADQL.
>
> Short form is :- rather than trying to patch fix the existing BNF 
> grammar we create a new grammar from scratch, adding each of the 
> language rules and validating them one at a time.
>
> More detail is available on the project wiki.
> https://github.com/ivoa/lyonetia/wiki/BNF-grammar-and-validation
>
> We have found a tool that can validate a BNF (or EBNF) grammar against 
> a set of examples:
> http://bnfparser2.sourceforge.net/
>
> And we have set up a prototype system in a Docker container for 
> developing and validating a grammar for ADQL:
> https://github.com/ivoa/lyonetia/tree/master/src/docker/bnfparser
>
> --------
>
> I'd be interested to hear what you think.
>
> Is this a realistic approach to creating a machine readable grammar 
> for ADQL ?
>
> Do you have an alternative method you would like to suggest ?
>
> At this stage we would welcome any and all suggestions on how best to 
> create a machine readable grammar and how to implement the validation 
> process.
>
> Thanks,
> Dave
>
> --------
> Dave Morris
> Research Software Engineer
> Wide Field Astronomy Unit
> Institute for Astronomy
> University of Edinburgh
> --------

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.ivoa.net/pipermail/dal/attachments/20170418/7f8cde4d/attachment-0001.html>
-------------- next part --------------

query = select from [order-by]

select = "SELECT" [select-quantifier] [select-limit] select-list

select-quantifier = "DISTINCT" / "ALL"

select-limit = "TOP" unsigned-integer

select-list = select-item *("," select-item)

select-item = select-column [["AS"] identifier]

select-column =   "*"
                / identifier
                / numeric-expression

from = "FROM" from-list

from-list = table-item *("," table-item)

table-item = identifier [["AS"] identifier]

order-by = "ORDER" "BY" order-by-list

order-by-list = order-by-item *("," order-by-item)

order-by-item =   identifier [order-by-sort]

                / unsigned-integer [order-by-sort]

order-by-sort = "ASC" / "DESC"

identifier =   delimited-identifier
             / regular-identifier

delimited-identifier = %x22 not-dbl-quote-string %x22

not-dbl-quote-string =   not-dbl-quote-char
                       / not-dbl-quote-string not-dbl-quote-char

regular-identifier =   letter
                     / regular-identifier letter
                     / regular-identifier digit
                     / regular-identifier _

comment = "--" *(ANY-CHAR)

numeric-expression = term *(("+" / "-") numeric-expression)

term = factor *(("+" / "-") term)

factor = ["-" / "+"] numeric-primary

numeric-primary =   numeric-value
                  | identifier

numeric-value =   scientific-number
                / decimal
                / unsigned-integer

scientific-number = (decimal / unsigned-integer)
                    "E" ["+" / "-"] unsigned-integer

decimal = unsigned-integer "." *(digit)

unsigned-integer = digit *(digit)

digit = %x30-39  ; i.e. 0-9

letter = %x41-5A / %x61-7A  ; i.e. a-z or A-Z

reserved-keywords =   "SELECT" / "FROM" / "ORDER BY"
                    / "DISTINCT" / "ALL" / "TOP"
                    / "AS" / "ASC" / "DESC"

not-dbl-quote-char =   letter / digit
                     / %x20 / %x09 / %x5C.22 / %x27 / "," / ";" / "." 

ANY-CHAR = letter / digit / %x20 / %x09 / %x22 / %x27 / "," / ";" / "."
-------------- next part --------------

<query> ::=   <select> <from> <order_by>
            | <select> <from>

<select> ::=   SELECT <select_quantifier> <select_limit> <select_list>
             | SELECT <select_quantifier> <select_list>
             | SELECT <select_limit> <select_list>
             | SELECT <select_list>

<select_quantifier> ::= DISTINCT | ALL

<select_limit> ::= TOP <unsigned_integer>

<select_list> ::=   <select_item>
                  | <select_list> , <select_item>

<select_item> ::=   <select_column> AS <identifier>
                  | <select_column> <identifier>
                  | <select_column>

<select_column> ::=   *
                    | <identifier>
                    | <numeric_expression>

<from> ::= FROM <from_list>

<from_list> ::=   <table_item>
                | <from_list> , <table_item>

<table_item> ::=   <identifier> AS <identifier>
                 | <identifier> <identifier>
				 | <identifier>

<order_by> ::= ORDER BY <order_by_list>

<order_by_list> ::=   <order_by_item>
					| <order_by_list> , <order_by_item>

<order_by_item> ::=   <identifier> <order_by_sort>
                    | <identifier>
                    | <unsigned_integer> <order_by_sort>
					| <unsigned_integer>

<order_by_sort> ::= ASC | DESC

<identifier> ::=   <delimited_identifier>
                 | <regular_identifier>

<delimited_identifier> ::= " <not_dbl_quote_string> "

<not_dbl_quote_string> ::=   <not_dbl_quote_char>
                           | <not_dbl_quote_string> <not_dbl_quote_char>

<regular_identifier> ::=   <letter>
                         | <regular_identifier> <letter>
                         | <regular_identifier> <digit>
                         | <regular_identifier> _

<comment> ::= -- <ANY_CHAR>
              | <comment> <ANY_CHAR>

<numeric_expression> ::=   <term>
                         | <numeric_expression> + <term>
						 | <numeric_expression> - <term>

<term> ::=   <factor>
           | <term> * <factor>
		   | <term> / <factor>

<factor> ::=   <numeric_primary>
             | - <numeric_primary>
			 | + <numeric_primary>

<numeric_primary> ::=   <numeric_value>
                      | <identifier>

<numeric_value> ::=   <scientific_number>
                    | <decimal>
					| <unsigned_integer>

<scientific_number> ::=   <decimal> E <unsigned_integer>
                        | <decimal> E- <unsigned_integer>
						| <decimal> E+ <unsigned_integer>
                        | <unsigned_integer> E <unsigned_integer>
						| <unsigned_integer> E- <unsigned_integer>
						| <unsigned_integer> E+ <unsigned_integer>

<decimal> ::=   <unsigned_integer> .
              | <decimal> <digit>

<unsigned_integer> ::= <digit> | <unsigned_integer> <digit>

<digit> ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

<letter> ::=   a | b | c | d | e | f | g | h | i | j | k | l | m | n | o | p | q
             | r | s | t | u | v | w | x | y | z
			 | A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q
			 | R | S | T | U | V | W | X | Y | Z

<reserved_keywords> ::=   SELECT | FROM | ORDER BY
                        | DISTINCT | ALL | TOP
						| AS | ASC | DESC

<not_dbl_quote_char> ::=   <letter> | <digit>
                         |   |      | \" | ' | , | ; | . 

<ANY_CHAR> ::= <letter> | <digit> |   | 	 | " | ' | , | ; | .
-------------- next part --------------

query = select , from , [order by];

select = "SELECT" , [select quantifier] , [select limit] , select list;

select quantifier = "DISTINCT" | "ALL";

select limit = "TOP" , unsigned integer;

select list = select item , {"," , select item};

select item = select column , [["AS"] , identifier];

select column =   "*"
                | identifier
                | numeric expression;

from = "FROM" , from list;

from list = table item , {"," , table item};

table item = identifier , [["AS"] , identifier];

order by = "ORDER" , "BY" , order by list;

order by list = order by item , {"," , order by item};

order by item =   identifier , [order by sort]

                | unsigned integer , [order by sort];

order by sort = "ASC" | "DESC";

identifier =   delimited identifier
             | regular identifier;

delimited identifier = '"' , not dbl quote string , '"';

not dbl quote string =   not dbl quote char
                       | not dbl quote string , not dbl quote char;

regular identifier =   letter
                     | regular identifier , letter
                     | regular identifier , digit
                     | regular identifier , "_";

comment = "--" , {ANY CHAR};

numeric expression = term , [("+" | "-") , numeric expression];

term = factor , [("*" | "/") , term];

factor = ["-" | "+"] , numeric primary;

numeric primary =   numeric value
                  | identifier;

numeric value =   scientific number
                | decimal
                | unsigned integer;

scientific number = (decimal | unsigned integer)
                    , "E" , ["+" | "-"] , unsigned integer;

decimal = unsigned integer , "." , {digit};

unsigned integer = digit , {digit};

digit = "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9";

letter =   "a" | "b" | "c" | "d" | "e" | "f" | "g" | "h" | "i" | "j" | "k"
         | "l" | "m" | "n" | "o" | "p" | "q" | "r" | "s" | "t" | "u" | "v"
         | "w" | "x" | "y" | "z"
		 | "A" | "B" | "C" | "D" | "E" | "F" | "G" | "H" | "I" | "J" | "K"
         | "L" | "M" | "N" | "O" | "P" | "Q" | "R" | "S" | "T" | "U" | "V"
         | "W" | "X" | "Y" | "Z";

reserved keywords =   "SELECT" | "FROM" | "ORDER BY"
                    | "DISTINCT" | "ALL" | "TOP"
                    | "AS" | "ASC" | "DESC";

not dbl quote char =   letter | digit
                     | " " | "\t" | '\\"' | "'" | "," | ";" | ".";

ANY CHAR = letter | digit | " " | "\t" | '"' | "'" | "," | ";" | ".";
-------------- next part --------------

<query> ::=   <select> <from> [<order_by>]

<select> ::=   SELECT [<select_quantifier>] [<select_limit>] <select_list>

<select_quantifier> ::= DISTINCT | ALL

<select_limit> ::= TOP <unsigned_integer>

<select_list> ::=   <select_item> [{ , <select_item> }...]

<select_item> ::=   <select_column> [[AS] <identifier>]

<select_column> ::=   *
                    | <identifier>
                    | <numeric_expression>

<from> ::= FROM <from_list>

<from_list> ::=   <table_item> [{ , <table_item> }...]

<table_item> ::=   <identifier> [[AS] <identifier>]

<order_by> ::= ORDER BY <order_by_list>

<order_by_list> ::=   <order_by_item> [{ , <order_by_item> }...]

<order_by_item> ::=   <identifier> [<order_by_sort>]

                    | <unsigned_integer> [<order_by_sort>]

<order_by_sort> ::= ASC | DESC

<identifier> ::=   <delimited_identifier>
                 | <regular_identifier>

<delimited_identifier> ::= " <not_dbl_quote_char>... "

<regular_identifier> ::=   <letter> [{<letter> | <digit> | _}...]

<comment> ::= -- [<ANY_CHAR>...]

<numeric_expression> ::=   <term>
                         | <numeric_expression> + <term>
                         | <numeric_expression> - <term>

<term> ::=   <factor>
           | <term> * <factor>
           | <term> / <factor>

<factor> ::=   <numeric_primary>
             | - <numeric_primary>
             | + <numeric_primary>

<numeric_primary> ::=   <numeric_value>
                      | <identifier>

<numeric_value> ::=   <scientific_number>
                    | <decimal>
					| <unsigned_integer>

<scientific_number> ::= { <decimal> | <unsigned_integer> }
                        E [+ | -] <unsigned_integer>

<decimal> ::=   <unsigned_integer> . [<digit>...]

<unsigned_integer> ::= <digit> [<digit>...]

<digit> ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

<letter> ::=   a | b | c | d | e | f | g | h | i | j | k | l | m | n | o | p | q
             | r | s | t | u | v | w | x | y | z
			 | A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q
			 | R | S | T | U | V | W | X | Y | Z

<reserved_keywords> ::=   SELECT | FROM | ORDER BY
                        | DISTINCT | ALL | TOP
						| AS | ASC | DESC

<not_dbl_quote_char> ::=   <letter> | <digit>
                         |   |      | \" | ' | , | ; | .

<ANY_CHAR> ::= <letter> | <digit> |   | 	 | , | ; | .
-------------- next part --------------

query <- _ select __ from (__ order_by)? _

select <- "SELECT" (__ select_quantifier)? (__ select_limit)? __ select_list

select_quantifier <- "DISTINCT" / "ALL"

select_limit <- "TOP" __ unsigned_integer

select_list <- select_item (_ "," _ select_item)*

select_item <- select_column (__ ("AS" __)? identifier)?

select_column <-   "*"
                 / identifier
                 / numeric_expression

from <- "FROM" __ from_list

from_list <- table_item (_ "," _ table_item)*

table_item <- identifier (__ ("AS" __)? identifier)?

order_by <- "ORDER" __ "BY" __ order_by_list

order_by_list <- order_by_item (_ "," _ order_by_item)*

order_by_item <-   identifier (__ order_by_sort)?

                 / unsigned_integer (__ order_by_sort)?

order_by_sort <- "ASC" / "DESC"

identifier <-   delimited_identifier
              / regular_identifier

delimited_identifier <- '"' ('\\"' / [^"])* '"'

regular_identifier <- !(reserved_keywords) letter (letter / digit / "_")*

comment <- "--" [^\n\r]*

numeric_expression <- term (_ ("+" / "-") _ numeric_expression)?

term <- factor (_ ("*" / "/") _ term)?

factor <- ("+" / "-")? numeric_primary

numeric_primary <- numeric_value / identifier

numeric_value <-   scientific_number
                 / decimal
                 / unsigned_integer

scientific_number <- (decimal / unsigned_integer)
                     "E" ("+" / "-")? unsigned_integer

decimal <-   unsigned_integer "." digit*

unsigned_integer <- digit (digit)*

digit <- [0-9]

letter <- [a-zA-Z]

reserved_keywords <-   "SELECT" / "FROM" / "ORDER" __ "BY"
                     / "DISTINCT" / "ALL" / "TOP"
                     / "AS" / "ASC" / "DESC"

ANY_CHAR <- letter / digit / " " / "\t" / "," / ";" / "."

_ <- (Space / comment)*
__ <- (Space / comment)+
Space <- ' ' / '\t' / EndOfLine
EndOfLine <- '\r\n' / '\n' / '\r'
-------------- next part --------------

queries = LIST:(ITEM:query _ ";" {return ITEM;})*
          {return LIST.join('\n####################\n')};

query = _ S:select __ F:from ORDER:(__ O:order_by {return O;})? _
        {return S+"\n\n"+F+(ORDER ? "\n\n"+ORDER : "");};

select = "SELECT"i Q:(__ TMP:select_quantifier {return TMP;})? L:(__ TMP:select_limit {return TMP;})? __ LIST:select_list
         {return "SELECT" + (Q ? " ("+Q+")" : "") + (L ? " ("+L+")" : "") + ":\n" + LIST;};

select_quantifier =   "DISTINCT"i {return "Distinct rows";}
                    / "ALL" {return "All rows";};

select_limit = "TOP"i __ INT:unsigned_integer
               {return "Limit: "+INT;};

select_list = FIRST:select_item NEXT:(_ "," _ ITEM:select_item {return ITEM;})*
              {return "    * "+FIRST+(NEXT.length > 0 ? "\n    * "+NEXT.join('\n    * ') : "");};

select_item = C:select_column ALIAS:(__ ("AS"i __)? A:identifier {return A;})?
              {return C+(ALIAS ? " (alias: "+ALIAS+")" : "");};

select_column =   "*"
                 / numeric_expression;

from = "FROM"i __ LIST:from_list
       {return "FROM:\n"+LIST;};

from_list = FIRST:table_item NEXT:(_ "," _ ITEM:table_item {return ITEM;})*
            {return "    * "+FIRST+(NEXT.length > 0 ? "\n    * "+NEXT.join('\n    * ') : "");};

table_item = T:identifier ALIAS:(__ ("AS"i __)? A:identifier {return A;})?
             {return T+(ALIAS != null ? " (alias: "+ALIAS+")" : "");};

order_by = "ORDER"i __ "BY"i __ LIST:order_by_list
           {return "ORDER BY:\n"+LIST;};

order_by_list = FIRST:order_by_item NEXT:(_ "," _ ITEM:order_by_item {return ITEM;})*
                {return "    * "+FIRST+(NEXT.length > 0 ? "\n    * "+NEXT.join('\n    * ') : "");};

order_by_item =    C:identifier SORT:(__ S:order_by_sort {return S;})?
                   {return (SORT == 'DESC' ? "Descending " : "Ascending ")+"COLUMN{"+C+"}";}

                 / INDEX:unsigned_integer SORT:(__ S:order_by_sort {return S;})?
                   {return (SORT == 'DESC' ? "Descending " : "Ascending ")+"COL-INDEX{"+INDEX+"}";};

order_by_sort = S:("ASC"i / "DESC"i)
                {return S;};

identifier = ID:(  delimited_identifier
                 / regular_identifier)
             {return "Identifier{"+ID+"}";};

delimited_identifier = '"' LETTERS:('\\"' {return '\\\\"';} / [^"])* '"'
                       {return '"'+LETTERS.join('')+'"';};

regular_identifier = !(reserved_keywords) FIRST:letter NEXT:(letter / digit / "_")*
                     {return FIRST+NEXT.join('');};

comment = "--" [^\n\r]* {return null;};

numeric_expression = FIRST:term NEXT:(_ OP:("+" / "-") _ EXP:numeric_expression {return [OP,EXP];})?
                     {return (NEXT ? "Numeric{"+NEXT[0]+"("+FIRST+","+NEXT[1]+")}" : FIRST);};

term = FIRST:factor NEXT:(_ OP:("*" / "/") _ TERM:term {return [OP, TERM];})?
       {return (NEXT ? NEXT[0]+"("+FIRST+","+NEXT[1]+")" : FIRST);};

factor = SIGN:("+" / "-")? NUM:numeric_primary
         {return (SIGN ? SIGN+"("+NUM+")" : NUM);};

numeric_primary = numeric_value / identifier;

numeric_value =   scientific_number
                 / decimal
                 / unsigned_integer;

scientific_number = NUMBER:(decimal / unsigned_integer)
                    "E" SIGN:("+" {return 1;}/ "-" {return -1;})? EXP:unsigned_integer
                    {return NUMBER*Math.pow(10, (SIGN ? SIGN : 1)*EXP);};

decimal = INT:unsigned_integer "." DEC:digit*
          {return INT+parseFloat("0."+DEC.join(''));};

unsigned_integer = FIRST:digit NEXT:(digit)* {return parseInt(FIRST+NEXT.join(''));};

digit = [0-9];

letter = [a-zA-Z];

reserved_keywords =   "SELECT"i / "FROM"i / "ORDER"i __ "BY"i
                     / "DISTINCT"i / "ALL"i / "TOP"i
                     / "AS"i / "ASC"i / "DESC"i;

ANY_CHAR = letter / digit / " " / "\t" / "," / ";" / ".";

_ = (Space / comment)* { return null; };
__ = (Space / comment)+ { return null; };
Space = ' ' / '\t' / EndOfLine;
EndOfLine = '\r\n' / '\n' / '\r';
-------------- next part --------------
-- the simplest query ever
SELECT * FROM myTable;

-- comments and line returns must be supported
SELECT id, "ra", dec -- inner comment
FROM myTable;

-- multiple tables with and without aliases
SELECT *
FROM table1 AS t1, table2 t2, table3;

-- with column names and order by
SELECT id, type, date
FROM myTable
ORDER BY date DESC, 2 ASC;

-- with DISTINCT
SELECT DISTINCT type
FROM myTable;

-- with TOP
SELECT TOP 5 *
FROM myTable;

-- with numeric and (basic) operations
SELECT id "anID", val1+5/89 - 3 AS anOperation
FROM myTable;
-------------- next part --------------
<ADQL_language_character> ::=
    <simple_Latin_letter>
  | <digit>
  | <SQL_special_character>

<ADQL_reserved_word> ::=
    ABS 
  | ACOS 
  | AREA    
  | ASIN 
  | ATAN 
  | ATAN2 
  | BOX
  | CEILING 
  | CENTROID
  | CIRCLE
  | CONTAINS
  | COORD1
  | COORD2
  | COORDSYS
  | COS 
  | DEGREES 
  | DISTANCE
  | EXP 
  | FLOOR 
  | INTERSECTS
  | LOG 
  | LOG10 
  | MOD 
  | PI 
  | POINT
  | POLYGON
  | POWER 
  | RADIANS 
  | REGION 
  | RAND 
  | ROUND 
  | SIN 
  | SQRT 
  | TOP
  | TAN 
  | TRUNCATE

<SQL_embedded_language_character> ::=
    <left_bracket>
  | <right_bracket>

<SQL_reserved_word> ::=
   ABSOLUTE | ACTION | ADD | ALL
 | ALLOCATE | ALTER | AND
 | ANY | ARE
 | AS | ASC
 | ASSERTION | AT
 | AUTHORIZATION | AVG
 | BEGIN | BETWEEN | BIT | BIT_LENGTH
 | BOTH | BY
 | CASCADE | CASCADED | CASE | CAST
 | CATALOG
 | CHAR | CHARACTER | CHAR_LENGTH
 | CHARACTER_LENGTH | CHECK | CLOSE | COALESCE
 | COLLATE | COLLATION
 | COLUMN | COMMIT
 | CONNECT
 | CONNECTION | CONSTRAINT
 | CONSTRAINTS | CONTINUE
 | CONVERT | CORRESPONDING | COUNT | CREATE | CROSS
 | CURRENT
 | CURRENT_DATE | CURRENT_TIME
 | CURRENT_TIMESTAMP | CURRENT_USER | CURSOR
 | DATE | DAY | DEALLOCATE 
 | DECIMAL | DECLARE | DEFAULT | DEFERRABLE
 | DEFERRED | DELETE | DESC | DESCRIBE | DESCRIPTOR
 | DIAGNOSTICS
 | DISCONNECT | DISTINCT | DOMAIN | DOUBLE | DROP
 | ELSE | END | END-EXEC | ESCAPE
 | EXCEPT | EXCEPTION
 | EXEC | EXECUTE | EXISTS
 | EXTERNAL | EXTRACT
 | FALSE | FETCH | FIRST | FLOAT | FOR
 | FOREIGN | FOUND | FROM | FULL
 | GET | GLOBAL | GO | GOTO
 | GRANT | GROUP
 | HAVING | HOUR
 | IDENTITY | IMMEDIATE | IN | INDICATOR
 | INITIALLY | INNER | INPUT
 | INSENSITIVE | INSERT | INT | INTEGER | INTERSECT
 | INTERVAL | INTO | IS
 | ISOLATION
 | JOIN
 | KEY
 | LANGUAGE | LAST | LEADING | LEFT
 | LEVEL | LIKE | LOCAL | LOWER
 | MATCH | MAX | MIN | MINUTE | MODULE
 | MONTH
 | NAMES | NATIONAL | NATURAL | NCHAR | NEXT | NO
 | NOT | NULL
 | NULLIF | NUMERIC
 | OCTET_LENGTH | OF
 | ON | ONLY | OPEN | OPTION | OR
 | ORDER | OUTER
 | OUTPUT | OVERLAPS
 | PAD | PARTIAL | POSITION | PRECISION | PREPARE
 | PRESERVE | PRIMARY
 | PRIOR | PRIVILEGES | PROCEDURE | PUBLIC
 | READ | REAL | REFERENCES | RELATIVE | RESTRICT
 | REVOKE | RIGHT
 | ROLLBACK | ROWS
 | SCHEMA | SCROLL | SECOND | SECTION
 | SELECT
 | SESSION | SESSION_USER | SET
 | SIZE | SMALLINT | SOME | SPACE | SQL | SQLCODE
 | SQLERROR | SQLSTATE
 | SUBSTRING | SUM | SYSTEM_USER
 | TABLE | TEMPORARY
 | THEN | TIME | TIMESTAMP
 | TIMEZONE_HOUR | TIMEZONE_MINUTE
 | TO | TRAILING | TRANSACTION
 | TRANSLATE | TRANSLATION | TRIM | TRUE
 | UNION | UNIQUE | UNKNOWN | UPDATE | UPPER | USAGE
 | USER | USING
 | VALUE | VALUES | VARCHAR | VARYING | VIEW
 | WHEN | WHENEVER | WHERE | WITH | WORK | WRITE
 | YEAR
 | ZONE

<SQL_special_character> ::=
   <space>
 | <double_quote>
 | <percent>
 | <ampersand>
 | <quote>
 | <left_paren>
 | <right_paren>
 | <asterisk>
 | <plus_sign>
 | <comma>
 | <minus_sign>
 | <period>
 | <solidus>
 | <colon>
 | <semicolon>
 | <less_than_operator>
 | <equals_operator>
 | <greater_than_operator>
 | <question_mark>
 | <underscore>
 | <vertical_bar>

<ampersand> ::= &

<approximate_numeric_literal> ::= <mantissa>E<exponent>

<area> ::= AREA <left_paren> <geometry_value_expression> <right_paren>

<as_clause> ::= [ AS ] <column_name>

<asterisk> ::= *

<between_predicate> ::=
   <value_expression> [ NOT ] BETWEEN
   <value_expression> AND <value_expression>

<boolean_factor> ::= [ NOT ] <boolean_primary>

<boolean_primary> ::=
   | <left_paren> <search_condition> <right_paren> 
     <predicate> 

<boolean_term> ::=
    <boolean_factor>
  | <boolean_term> AND <boolean_factor>

<box> ::= 
    BOX <left_paren> <coord_sys>
                   <comma> <coordinates> 
                   <comma> <numeric_value_expression> 
                   <comma> <numeric_value_expression> 
              <right_paren>

<catalog_name> ::= <identifier>

<centroid> ::= CENTROID <left_paren> <geometry_value_expression> <right_paren>

<character_factor> ::= <character_primary>
!! For information only, SQL92 supports <collate_clause>

<character_primary> ::= 
    <value_expression_primary>
  | <string_value_function>

<character_representation> ::= <nonquote_character> | <quote_symbol>
!! A way of allowing a single quote to be escaped by a single quote.

<character_string_literal> ::=
   <quote> [ <character_representation>... ] <quote>
   [ { <separator>... <quote> [ <character_representation>... ] <quote> }... ]

<character_value_expression> ::= <concatenation> | <character_factor>

<circle> ::= 
    CIRCLE <left_paren> <coord_sys> 
                <comma> <coordinates> 
                <comma> <radius> 
           <right_paren>

<colon> ::= :

<column_name> ::= <identifier>

<column_name_list> ::= <column_name> [ { <comma> <column_name> }... ]

<column_reference> ::= [ <qualifier> <period> ] <column_name>

<comma> ::= ,

<comment> ::= <comment_introducer> [ <comment_character>... ] <newline>

<comment_character> ::= <nonquote_character> | <quote>

<comment_introducer> ::= <minus_sign><minus_sign> [<minus_sign>...]

<comp_op> ::=
    <equals_operator>
  | <not_equals_operator>
  | <less_than_operator>
  | <greater_than_operator>
  | <less_than_or_equals_operator>
  | <greater_than_or_equals_operator>

<comparison_predicate> ::=
    <value_expression> <comp_op> <value_expression>

<concatenation> ::= <character_value_expression> <concatenation_operator> <character_factor>

<concatenation_operator> ::= ||

<contains> ::= CONTAINS <left_paren> <geometry_value_expression> <comma> <geometry_value_expression> <right_paren>  

<coord1> ::= COORD1 <left_paren> <coord_value> <right_paren>

<coord2> ::= COORD2 <left_paren> <coord_value> <right_paren>

<coord_sys> ::= <string_value_expression>

<coord_value> ::= <point> | <column_reference>

<coordinate1> ::= <numeric_value_expression>

<coordinate2> ::= <numeric_value_expression>

<coordinates> ::=  <coordinate1>  <comma>  <coordinate2> 

<correlation_name> ::= <identifier>

<correlation_specification> ::= [ AS ] <correlation_name>

<default_function_prefix> ::=
 !! The prefix is set by default to "udf_".
 !! It should be possible to change the default prefix to accommodate local preferences.

<delimited_identifier> ::= <double_quote> <delimited_identifier_body> <double_quote>
!! These diagrams are somewhat difficult to follow through and understand immediately. 
!! The idea is to allow any identifier to be quoted by enclosing within double quotes.
!! This can allow the use of reserved words as identifiers, eg: "DATE" as a column name.  
!! The identifier can also contain double quotes as long as they are escaped by double quotes. 
!! For example: "Table""X" will be identifier Table"X

<delimited_identifier_body> ::= <delimited_identifier_part>...

<delimited_identifier_part> ::= <nondoublequote_character> | <double_quote_symbol>

<delimiter_token> ::=
      <character_string_literal>
  |   <delimited_identifier>
  |   <SQL_special_character>
  |   <not_equals_operator>
  |   <greater_than_or_equals_operator>
  |   <less_than_or_equals_operator>
  |   <concatenation_operator>
  |   <double_period>
  |   <left_bracket>
  |   <right_bracket>

<derived_column> ::= <value_expression> [ <as_clause> ]

<derived_table> ::= <table_subquery>

<digit> ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

<distance> ::=     
   DISTANCE <left_paren> <coord_value> <comma> <coord_value> <right_paren>

<double_period> ::= ..

<double_quote> ::= "

<double_quote_symbol> ::= <double_quote><double_quote>

<equals_operator> ::= =

<exact_numeric_literal> ::=
    <unsigned_integer> [ <period> [ <unsigned_integer> ] ]
  | <period> <unsigned_integer>

<exists_predicate> ::= EXISTS <table_subquery>

<exponent> ::= <signed_integer> 

<extract_coordsys> ::= COORDSYS <left_paren> <geometry_value_expression> <right_paren>

<factor> ::= [ <sign> ] <numeric_primary> 

<from_clause> ::= FROM <table_reference>
   [ { <comma> <table_reference> }... ]

<general_literal> ::= <character_string_literal>
!! ADQL Note:
!! For information, SQL92 has <general_literal>, and also supports
!! <national_character_string_literal>, <bit_string_literal>, 
!! <hex_string_literal>, <datetime_literal> and <interval_literal>
!! Good to retain this as a place-holder, bearing in mind
!! that we might need <bit_string_literal> and <hex_string_literal>
!! at some point in the future?

<general_set_function> ::=
   <set_function_type> <left_paren> [ <set_quantifier> ] <value_expression> <right_paren>

<geometry_value_expression> ::= <value_expression_primary> | <geometry_value_function>

<geometry_value_function> ::=
     <box> 
   | <centroid> 
   | <circle> 
   | <point> 
   | <polygon> 
   | <region> 

<greater_than_operator> ::= >

<greater_than_or_equals_operator> ::= >=

<group_by_clause> ::= GROUP BY <grouping_column_reference_list>

<grouping_column_reference> ::= <column_reference> 
!! For information, SQL92 supports a <collate_clause>

<grouping_column_reference_list> ::=
    <grouping_column_reference> [ { <comma> <grouping_column_reference> }... ]

<having_clause> ::= HAVING <search_condition>	

<identifier> ::= <regular_identifier> | <delimited_identifier>

<in_predicate> ::=
    <value_expression> [ NOT ] IN <in_predicate_value>

<in_predicate_value> ::=
   <table_subquery> | <left_paren> <in_value_list> <right_paren>

<in_value_list> ::=
    <value_expression> { <comma> <value_expression> } ...

<intersects> ::= INTERSECTS <left_paren> <geometry_value_expression> <comma> <geometry_value_expression> <right_paren>

<join_column_list> ::= <column_name_list>

<join_condition> ::= ON <search_condition>

<join_specification> ::= <join_condition> | <named_columns_join>	

<join_type> ::=
   INNER
 | <outer_join_type> [ OUTER ]
!! For information, SQL92 also supports UNION

<joined_table> ::=
    <qualified_join>
  | <left_paren> <joined_table> <right_paren>

<keyword> ::= <SQL_reserved_word> | <ADQL_reserved_word>

<left_bracket> ::= [

<left_paren> ::= (

<less_than_operator> ::= <

<less_than_or_equals_operator> ::= <=

<like_predicate> ::=
    <match_value> [ NOT ] LIKE <pattern>

<mantissa> ::= <exact_numeric_literal>

<match_value> ::= <character_value_expression>

<math_function> ::=
    ABS <left_paren> <numeric_value_expression> <right_paren> 
	  | CEILING <left_paren> <numeric_value_expression> <right_paren> 
	  | DEGREES <left_paren> <numeric_value_expression> <right_paren> 
	  | EXP <left_paren> <numeric_value_expression> <right_paren> 
	  | FLOOR <left_paren> <numeric_value_expression> <right_paren> 
	  | LOG <left_paren> <numeric_value_expression> <right_paren> 
	  | LOG10 <left_paren> <numeric_value_expression> <right_paren> 	  
	  | MOD <left_paren> <numeric_value_expression> <comma> <numeric_value_expression> <right_paren> 	  
	  | PI <left_paren><right_paren> 
	  | POWER <left_paren> <numeric_value_expression> <comma> <numeric_value_expression> <right_paren> 
	  | RADIANS <left_paren> <numeric_value_expression> <right_paren> 
	  | RAND <left_paren> [ <numeric_value_expression> ] <right_paren> 
	  | ROUND <left_paren> <numeric_value_expression> [<comma> <signed_integer>] <right_paren> 
	  | SQRT <left_paren> <numeric_value_expression> <right_paren> 
	  | TRUNCATE <left_paren> <numeric_value_expression> [<comma> <signed_integer>] <right_paren> 

<minor_radius> ::= <numeric_value_expression>

<minus_sign> ::= -

<named_columns_join> ::= USING <left_paren> <join_column_list> <right_paren> 

<newline> ::= 
!! Implementation defined end of line indicator

<non_predicate_geometry_function> ::= 
   <area>
 | <coord1>
 | <coord2> 
 | <distance>

<nondelimiter_token> ::= 
    <regular_identifier>
  | <keyword>
  | <unsigned_numeric_literal>
 !! For information, SQL92 has some others. 
 !! (Jeff: Not sure I understand how this is used.)

<nondoublequote_character> ::= 
 !! See SQL92 syntax rules. But basically what it says.

<nonquote_character> ::= 
!! One ASCII character but not a single quote 

<not_equals_operator> ::= <not_equals_operator1> | <not_equals_operator2>

<not_equals_operator1> ::= <>

<not_equals_operator2> ::= !=

<null_predicate> ::= <column_reference> IS [ NOT ] NULL

<numeric_geometry_function> ::=
    <predicate_geometry_function> | <non_predicate_geometry_function>

<numeric_primary> ::=
    <value_expression_primary>
  | <numeric_value_function>

<numeric_value_expression> ::=
    <term>
  | <numeric_value_expression> <plus_sign> <term>
  | <numeric_value_expression> <minus_sign> <term>

<numeric_value_function> ::= 
   <trig_function> 
 | <math_function>	
 | <numeric_geometry_function>
 | <user_defined_function>
!! For information, SQ92 also supports <position_expression>,
!! <extract_expression> and <length_expression>

<order_by_clause> ::= ORDER BY <sort_specification_list>

<ordering_specification> ::= ASC | DESC

<outer_join_type> ::= LEFT | RIGHT | FULL

<pattern> ::= <character_value_expression>

<percent> ::= %

<period> ::= .

<plus_sign> ::= +

<point> ::= POINT <left_paren> <coord_sys> <comma> <coordinates> <right_paren>

<polygon> ::=
    POLYGON <left_paren> <coord_sys> 
                 <comma> <coordinates> 
                 <comma> <coordinates> 
               { <comma> <coordinates> } ?
            <right_paren>

<predicate> ::=
   <comparison_predicate>
 | <between_predicate>
 | <in_predicate>
 | <like_predicate>
 | <null_predicate>
 | <exists_predicate>
!! For information, SQL92 also supports the following predicates: 
!! <quantified_comparison_predicate>, <unique_predicate>, 
!! <match_predicate> and <overlaps_predicate>

<predicate_geometry_function> ::= <contains> | <intersects>

<qualified_join> ::=
    <table_reference> [ NATURAL ] [ <join_type> ] JOIN
    <table_reference> [ <join_specification> ]

<qualifier> ::= <table_name> | <correlation_name> 

<query_expression> ::=
     <query_specification>
   | <joined_table>
 !! ADQL Note: SQL92 uses <non_join_query_expression> rather than <query_specification>

<query_specification> ::=
    SELECT [ <set_quantifier> ] [ <set_limit> ] <select_list> <table_expression>

<question_mark> ::= ?

<quote> ::= '

<quote_symbol> ::= <quote> <quote>

<radius> ::= <numeric_value_expression>

<region> ::= REGION <left_paren> <string_value_expression> <right_paren>

<regular_identifier> ::=
    <simple_Latin_letter>... 
    [ { <digit> | <simple_Latin_letter> | <underscore> }... ]

<right_bracket> ::= ]

<right_paren> ::= )

<schema_name> ::= [ <catalog_name> <period> ] <unqualified_schema name>

<search_condition> ::=
    <boolean_term>
  | <search_condition> OR <boolean_term>

<select_list> ::=
    <asterisk>
  | <select_sublist> [ { <comma> <select_sublist> }... ]

<select_sublist> ::= <derived_column> | <qualifier> <period> <asterisk>

<semicolon> ::= ;

<separator> ::= { <comment> | <space> | <newline> }...

<set_function_specification> ::=
    COUNT <left_paren> <asterisk> <right_paren>
  | <general_set_function>

<set_function_type> ::= AVG | MAX | MIN | SUM | COUNT

<set_limit> ::= TOP <unsigned_integer>

<set_quantifier> ::= DISTINCT | ALL

<sign> ::= <plus_sign> | <minus_sign>

<signed_integer> ::= [ <sign> ] <unsigned_integer>

<simple_Latin_letter> ::=
    <simple_Latin_upper_case_letter>
  | <simple_Latin_lower_case_letter>

<simple_Latin_lower_case_letter> ::=
    a | b | c | d | e | f | g | h | i | j | k | l | m | n | o
  | p | q | r | s | t | u | v | w | x | y | z

<simple_Latin_upper_case_letter> ::=
    A | B | C | D | E | F | G | H | I | J | K | L | M | N | O
  | P | Q | R | S | T | U | V | W | X | Y | Z

<solidus> ::= /

<sort_key> ::= <column_name> | <unsigned_integer> 
 !! For information, the use of <unsigned_integer> in <sort_key>is a deprecated feature of SQL92

<sort_specification> ::=
    <sort_key> [ <ordering_specification> ]
!! For information, SQL92 also supports <collate_clause>

<sort_specification_list> ::=
    <sort_specification> [ { <comma> <sort_specification> }... ]

<space> ::=
!! Imagine there is a space character here

<string_geometry_function> ::= <extract_coordsys>

<string_value_expression> ::=
    <character_value_expression>
!! For information, SQL92 also supports <bit_value_expression>

<string_value_function> ::= <string_geometry_function> | <user_defined_function>

<subquery> ::= <left_paren> <query_expression> <right_paren>

<table_expression> ::=
    <from_clause>
    [ <where_clause> ]
    [ <group_by_clause> ]
    [ <having_clause> ]
    [ <order_by_clause> ]

<table_name> ::= [ <schema_name> <period> ] <identifier>

<table_reference> ::=
   <table_name> [ <correlation_specification> ]
 | <derived_table> <correlation_specification>
 | <joined_table>

<table_subquery> ::= <subquery>

<term> ::=
    <factor>
  | <term> <asterisk> <factor>
  | <term> <solidus> <factor>

<token> ::= 
    <nondelimiter_token>
  | <delimiter_token>

<trig_function> ::=
    ACOS <left_paren> <numeric_value_expression> <right_paren>
  | ASIN <left_paren> <numeric_value_expression> <right_paren>
  | ATAN <left_paren> <numeric_value_expression> <right_paren>
  | ATAN2 <left_paren> <numeric_value_expression> <comma> <numeric_value_expression> <right_paren>
  | COS <left_paren> <numeric_value_expression> <right_paren>
  | COT <left_paren> <numeric_value_expression> <right_paren>
  | SIN <left_paren> <numeric_value_expression> <right_paren>
  | TAN <left_paren> <numeric_value_expression> <right_paren>

<underscore> ::= _

<unqualified_schema name> ::= <identifier>

<unsigned_integer> ::= <digit>...

<unsigned_literal> ::= <unsigned_numeric_literal> | <general_literal>

<unsigned_numeric_literal> ::=
   <exact_numeric_literal>
 | <approximate_numeric_literal>

<unsigned_value_specification> ::= <unsigned_literal>

<user_defined_function> ::=
    <user_defined_function_name> 
        <left_paren> 
            [ <user_defined_function_param> [ { <comma> <user_defined_function_param> }... ] ]
        <right_paren> 
!! ADQL Note:
!! If metadata on a user defined function is available, this should be used.
!! For example, function names and cardinality of arguments should be checked.

<user_defined_function_name> ::=
     [ <default_function_prefix> ] <regular_identifier>
 !! Function names should be checked against metadata where available. 

<user_defined_function_param> ::= <value_expression>

<value_expression> ::=
    <numeric_value_expression>
  | <string_value_expression>
  | <geometry_value_expression>

<value_expression_primary> ::=
    <unsigned_value_specification>
  | <column_reference>
  | <set_function_specification>
  | <left_paren> <value_expression> <right_paren>
!! For information, SQL92 supports <scalar_subquery>,
!! <case_expression> and <cast_specification>

<vertical_bar> ::= |

<where_clause> ::= WHERE <search_condition>