ADQL Erratum 2
Grégory Mantelet
gmantele at ari.uni-heidelberg.de
Mon May 29 11:23:28 CEST 2017
Dear Markus, DAL,
As in DaCHS, the ADQL-Library treats the parameter of rand(x) as
optional.
I think that I originally overlooked the prose description, in which the
parameter is not written as optional.
Instead, I focused on the BNF description, where, on the contrary, the
parameter is optional:
RAND <left_paren> [ <unsigned_integer> ] <right_paren>
It really seems there was an inconsistency in the ADQL 2.0 document on
that point.
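To make the BNF rule concrete, here is a minimal Python sketch that
recognizes it with a regular expression. The name parse_rand is
hypothetical, and I assume here that <unsigned_integer> is simply one or
more decimal digits; this is an illustration only, not the parser of the
ADQL-Library.

```python
import re

# Hypothetical recognizer for the rule:
#   RAND <left_paren> [ <unsigned_integer> ] <right_paren>
# Assumption: <unsigned_integer> is one or more decimal digits.
_RAND_RE = re.compile(r"\s*RAND\s*\(\s*(\d+)?\s*\)\s*", re.IGNORECASE)


def parse_rand(expr):
    """Return (True, seed_or_None) if expr matches the rule, else (False, None)."""
    match = _RAND_RE.fullmatch(expr)
    if match is None:
        return (False, None)
    seed = match.group(1)
    # The seed group is absent when the optional argument is omitted.
    return (True, int(seed) if seed is not None else None)
```

Under this reading, both rand() and rand(42) are accepted, while a
signed or non-integer argument is rejected.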
So, to answer your questions:
> (1) Does anyone actually implement random(int) (rather than just
> falling back to random())? And if so, what do you do?
For PostgreSQL, I also translate the ADQL rand(x) into the SQL random().
So I completely ignore the seed parameter if one is given.
However, it seems that SQL Server provides the function rand([x]) as
described in ADQL 2.0,
so that is how the SQL Server translator of my library translates it:
exactly as in ADQL.
Similarly, MySQL and H2 (but not SQLite) also have the same optional
parameter for rand: rand([x]).
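The per-DBMS behaviour described above could be sketched roughly as
follows in Python. This is not the actual code of my library: the
function translate_rand and the dialect names are hypothetical, chosen
only to illustrate the mapping.

```python
# Dialects that, as described in this message, accept rand([x]) natively.
NATIVE_RAND_DIALECTS = {"sqlserver", "mysql", "h2"}


def translate_rand(dialect, seed=None):
    """Translate ADQL rand([seed]) into SQL text for a (hypothetical) dialect name."""
    dialect = dialect.lower()
    if dialect == "postgresql":
        # The seed is ignored: random() takes no argument.
        return "random()"
    if dialect in NATIVE_RAND_DIALECTS:
        # Pass the optional seed through unchanged.
        return "rand()" if seed is None else "rand(%d)" % int(seed)
    # SQLite and anything else are left out of this sketch.
    raise ValueError("no rand() translation sketched for dialect: " + dialect)
```

For example, translate_rand("postgresql", 42) drops the seed, whereas
translate_rand("mysql", 42) keeps it.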
> (2) Wouldn't it be preferable if we said, in the erratum, as the new
> text:
>
> rand([x]) -- Returns a random value between 0.0 and 1.0. The
> argument was initially intended to provide a random seed, if given.
> It turned out, however, that in concept and implementation, it is
> hard to attach stable semantics to this notion. Hence, while an
> argument is accepted for backward compatibility, clients should
> expect that the 1-argument function behaves exactly like the
> 0-argument one.
>
> Or something like this?
I completely agree with making the seed parameter optional.
But since this seed parameter is optionally accepted by some DBMSs used
in some existing TAP implementations, I am not entirely convinced that we
should remove the possibility of using a seed parameter. Then again, I
don't have a strong opinion about random number generation or the need
to "regenerate" the random numbers from a seed in a database context.
So, why not say that this optional parameter may be ignored by some
ADQL implementations?
That way, it works for everybody, and I don't actually think that a
client/user will really notice the difference or complain... but I
may be wrong here.
Cheers,
Grégory
On 05/29/2017 10:10 AM, Markus Demleitner wrote:
> Dear DAL,
>
> There's currently ADQL Erratum 2,
> http://wiki.ivoa.net/twiki/bin/view/IVOA/ADQL-2_0-Err-2 under
> discussion.
>
> While I believe the main content is essentially uncontentious (I
> personally would prefer square brackets around optional arguments),
> I'm not so sure about:
>
> rand(x)
> Returns a random value between 0.0 and 1.0, *where x is an
> optional seed value*.
>
> Frankly, I believe we need to say a bit more about this if we expect
> it to work.
>
> I *think* what the authors intended here was:
>
> "If an argument to RAND is given, a single call to a setseed-like
> function should be performed in the transaction that will later be
> used to execute the query itself."
>
> It certainly makes no sense to set the seed as part of the query
> itself (you'd then get *very* unrandom numbers indeed).
>
> Full disclosure: DaCHS currently does neither: random(n) is
> translated to the same query as random(). My rationale is that I
> doubt you'll get any sort of reproducibility (which setseed is about)
> either way, given that it's not clear if the PRNG is per-transaction
> or might be pushed along by queries executed in parallel (Postgres
> docs aren't clear here) and that, with set calculus and the query
> planner in the background, you can't really expect a particular
> sequence of rows and hence a particular sequence of random numbers by
> rows.
>
> Of course, it's also a pain to implement that extra query one would
> need for halfway reasonable behaviour.
>
> So:
>
> (1) Does anyone actually implement random(int) (rather than just
> falling back to random())? And if so, what do you do?
>
> (2) Wouldn't it be preferable if we said, in the erratum, as the new
> text:
>
> rand([x]) -- Returns a random value between 0.0 and 1.0. The
> argument was initially intended to provide a random seed, if given.
> It turned out, however, that in concept and implementation, it is
> hard to attach stable semantics to this notion. Hence, while an
> argument is accepted for backward compatibility, clients should
> expect that the 1-argument function behaves exactly like the
> 0-argument one.
>
> Or something like this?
>
> -- Markus