ADQL-2.1 internal draft

Markus Demleitner msdemlei at ari.uni-heidelberg.de
Thu Jul 9 15:44:59 CEST 2015


Hi Marco,

On Mon, Jun 22, 2015 at 04:36:30PM +0200, Marco Molinaro wrote:
> Hi Mark, hi all,
> 
> I fixed what's not answered here and tried to clarify a bit the references
> (but there's still some work to do for the bibrefs).
> 
> 2015-06-22 13:41 GMT+02:00 Mark Taylor <m.b.taylor at bristol.ac.uk>:
> > Sec 2.1.2:
> >
> >    The syntax quoted here:
> >
> >       "<Latin_letter> [{ <underscore> | {<Latin_letter> | <digit>} }]"
> >
> >    seems to be missing at least one ellipsis, compared to that in
> >    the BNF in Appendix A.  By my reading the above syntax limits
> >    identifiers to two characters.  This text hasn't changed since
> >    ADQL v2.0 though, so if this is an error it's not new in this version.
> 
> I changed it accordingly to the BNF in appendix, I agree with you the
> above read as two characters only.

Uhhh.  I suddenly notice that SQL92's EBNF uses curly braces for
grouping.  That's bad, but it explains why there are the ellipses and
the square brackets.

Well, most versions of EBNF I've seen say "{x}" means "zero-or-more of
x".  SQL92, 3.2 says


         { }   Braces group elements in a formula. The portion of the for-
               mula within the braces shall be explicitly specified.

-- and then goes on to define the ellipsis, in effect, as Kleene
star.

That's... ummm... surprising, and the only reason it hasn't bitten me
is that in the ADQL grammar, {} is almost always used together with
an ellipsis.

In the only place where it's not, whoever wrote the rule probably had
my concept of {} in mind:

<polygon> ::=
  POLYGON <left_paren>
  <coord_sys>
  <comma> <coordinates>
  <comma> <coordinates>
  { <comma> <coordinates> } ?
  <right_paren>

I strongly suggest that rule should really be (and I suspect everyone
implemented it like that):

<polygon> ::=
  POLYGON <left_paren>
  <coord_sys>
  <comma> <coordinates>
  <comma> <coordinates>
  <comma> <coordinates>
  { <comma> <coordinates> }...
  <right_paren>

(I assume here we want at least three vertices, i.e., three
<coordinates> pairs.)  That's a spec bug that must be fixed.
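
To illustrate what the corrected rule accepts, here is a minimal Python
sketch.  It is an illustration only: the patterns standing in for
<coord_sys> (a quoted string) and <coordinates> (two plain decimal
numbers) are simplifying assumptions, not the real ADQL productions, and
the name is_polygon is made up for this example.

```python
import re

# Simplified stand-ins (assumptions), not the real ADQL productions.
NUM = r"[-+]?\d+(?:\.\d+)?"
COORDS = rf"\s*{NUM}\s*,\s*{NUM}\s*"   # one <coordinates> pair
COORD_SYS = r"\s*'[^']*'\s*"           # <coord_sys> as a quoted string

# Three mandatory "<comma> <coordinates>" groups followed by
# "{ <comma> <coordinates> }..." collapse into "three or more".
POLYGON = re.compile(
    rf"POLYGON\s*\({COORD_SYS}(?:,{COORDS}){{3,}}\)",
    re.IGNORECASE,
)


def is_polygon(text):
    """Return True if text matches the simplified polygon rule."""
    return POLYGON.fullmatch(text.strip()) is not None
```

With this sketch, a polygon with only two vertices is rejected, while
three or more vertices are accepted.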


Given that we copy the funky usage of {}, I'd suggest adding a strong

  *Note*: The usage of {} in the EBNF in this specification follows the
  usage in SQL92, i.e., it is used for grouping.  It is *not* used as
  a zero-or-more operator as frequently done in other kinds of EBNF.
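
To make the difference concrete, the identifier rule from Sec 2.1.2 can
be translated into regular expressions under the SQL92 reading of {}.
This is only a sketch; the variable names are made up for the example,
and plain ASCII letters are assumed for <Latin_letter>.

```python
import re

# Rule as quoted in Sec 2.1.2, read the SQL92 way (braces only group):
#   <Latin_letter> [ { <underscore> | { <Latin_letter> | <digit> } } ]
# The optional part can match at most once.
WITHOUT_ELLIPSIS = re.compile(r"[A-Za-z][_A-Za-z0-9]?")

# Same rule with the ellipsis after the braces, which SQL92 defines as
# repetition of the preceding element:
#   <Latin_letter> [ { <underscore> | <Latin_letter> | <digit> }... ]
WITH_ELLIPSIS = re.compile(r"[A-Za-z][_A-Za-z0-9]*")


def accepts(pattern, name):
    """Return True if the whole of name matches pattern."""
    return pattern.fullmatch(name) is not None
```

Under the SQL92 reading, the rule without the ellipsis accepts "ra" but
not "ra_2000"; only the version with the ellipsis accepts both.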


> > Sec 2.3:
> >
> >    Like Walter, I don't really understand rand(x) - as far as I can see
> >    since it takes a seed it would be deterministic, which isn't
> >    very useful.  Am I missing something?
> 
> I wonder if this is just another errata...i.e. it could be considered
> implicit the idea of having a null seed.
> In my experience, however, I found implementations of rand that actually
> required a seed (letting, e.g., be the milliseconds of the computer internal
> timer to rule the randomness) ...probably not so clever, it turns out to be a
> pseudo-random generator.
> 
> My guess is that we can change the description explicitly letting x be null
> or even letting x be omitted, if we go the "overload" way it seems we are
> already facing for geometric functions.
> What do DAL people think of this?

The prose needs to be fixed; the rule is ok:

| RAND <left_paren> [ <unsigned_integer> ] <right_paren>

-- as you can see, the argument is optional.
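
For illustration, the intended semantics of the optional argument can be
sketched in Python.  The function name adql_rand is made up, and
the behaviour (result in [0, 1), same seed gives the same value) is my
assumption about what the prose is meant to say, not text from the spec.

```python
import random


def adql_rand(seed=None):
    """Hypothetical stand-in for RAND with an optional seed.

    Without a seed, draw from the module-level generator; with a seed,
    use a fresh generator so the result is reproducible.
    """
    if seed is None:
        return random.random()
    return random.Random(seed).random()
```

Calling adql_rand(42) twice gives the same number both times, which is
the deterministic behaviour Mark describes; adql_rand() without an
argument is not tied to a caller-supplied seed.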

I'm not terribly convinced reproducible sequences of random numbers,
as important as they are in simulations and such, have such a big use
case in ADQL, and this thing required me to write horrible code like

		if len(node.args)==1:
			return "setseed(%s)-setseed(%s)+random()"%(flatten(node.args[0]),
				flatten(node.args[0]))
		else:
			return "random()"

in my ADQL-to-Postgres morpher.

If you wanted to chuck out the argument, I'd therefore not complain.
If you keep things as they are, the docs on p. 14/chapter 2.3 need to
be brought in sync with the grammar.


Cheers,

        Markus


