ADQL Erratum 2

Markus Demleitner msdemlei at ari.uni-heidelberg.de
Tue May 30 11:31:01 CEST 2017


Dear DAL,

On Tue, May 30, 2017 at 08:46:09AM +0200, Marco Molinaro wrote:
> this part of the erratum 2 was simply meant to align what BNF says
> to what the description in the table says.

Yes.  I think it is uncontentious that this needs aligning.  I don't
think the issue is grave enough to warrant a change in the BNF, so
Walter's (understandable) wish to entirely remove the argument won't
work for me.

However,

> Now, I understand that different backends act differently, and that
> the usage of the seed value itself is quite confused/confusing,
> but if we go too far, we may slip out of the context for an erratum.

goes against ADQL's purpose of having an interoperable syntax and
semantics for an SQL dialect.  Worse, at least some backends (Gregory
tried it with H2; my former implementation did about the same) have
utterly useless and dangerous behaviour: they seed the RNG in each
row, which means that all calls to random() yield the same value(s)
in all rows.
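
To make the failure mode concrete, here is a minimal Python sketch.  It
is not code from H2 or any other backend; the function names
reseed_per_row and seed_once are made up for this illustration.  It
contrasts re-seeding the generator for every row with seeding it once
per query:

```python
import random


def reseed_per_row(seed, n_rows):
    """Mimic a backend that re-seeds the RNG for every row."""
    values = []
    for _ in range(n_rows):
        rng = random.Random(seed)  # fresh generator, same seed, every row
        values.append(rng.random())
    return values


def seed_once(seed, n_rows):
    """Seed the RNG once, then draw one value per row."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(n_rows)]


per_row = reseed_per_row(42, 5)
once = seed_once(42, 5)

print(len(set(per_row)))  # 1: every row got the same value
print(len(set(once)))     # 5: each row got its own value
```

With per-row seeding, every row sees the first value of the same
sequence, so the column is constant and useless for sampling.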

Given that, I think we have to warn people that using the argument
will very likely not do what at least I would expect: for each
occurrence of random(n) generate an independent, reproducible
sequence

  row-index -> pseudo random number

I'm not even sure if there is any DB engine implementing this at all.
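
For illustration only, one way such a mapping could be obtained is to
hash the seed together with the row index rather than keeping generator
state.  The sketch below is an assumption about how this might look, not
a description of any existing engine; the name row_random and the
"occurrence" parameter (standing for the position of the call within
the query) are hypothetical:

```python
import hashlib
import struct


def row_random(seed, occurrence, row_index):
    """Map (seed, occurrence, row_index) to a float in [0.0, 1.0)."""
    key = f"{seed}:{occurrence}:{row_index}".encode("ascii")
    digest = hashlib.sha256(key).digest()
    # Read the first 8 bytes as an unsigned 64-bit integer and keep
    # the top 53 bits, which is what a double can represent exactly.
    (word,) = struct.unpack(">Q", digest[:8])
    return (word >> 11) / float(1 << 53)


first = [row_random(42, 0, i) for i in range(5)]
again = [row_random(42, 0, i) for i in range(5)]
other = [row_random(42, 1, i) for i in range(5)]

print(first == again)  # True: same seed and occurrence reproduce the sequence
print(first != other)  # a second occurrence gets its own sequence
```

Because the value depends only on the seed, the occurrence and the row
index, it does not matter in which order the engine visits the rows.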

Explaining this in this many words is, I'd say, a bit too much for
this table.  So, here's my second attempt to provide a text for the
table:

  rand([x]) -- Returns a random value between 0.0 and 1.0.  The
  optional argument, originally intended to provide a random seed,
  should not be used.  Behaviour for the function with an argument is
  undefined.  Query writers should not use it.

Opinions?

      -- Markus

