Inputs for UCDs [this part: IEEE in ADQL]

Wed Oct 16 05:19:11 CEST 2019

Hm, this sounds to me like going back many years when people complained 
about SQL Server not supporting proper IEEE 754 numbers:

https://www.red-gate.com/simple-talk/blogs/from-nan-to-infinity-and-beyond/

NaN, INF and NULL are quite distinct semantically and there are good reasons
to be able to tell them apart rather than throwing them together and say NULL
instead in particular when talking about scientific applications.

NULL: value has not been provided
NaN: a calculation resulted in an invalid floating point result (e.g. 0/0)
INF: a calculation resulted in an overflow (e.g. x/0)

Now IEEE 754 has quite a number of oddities all by itself, but that’s what we are mostly
stuck with right now. IEEE 754 does not have the equivalent of NULL, but it does have
MANY NaN values, which can all be distinguished and in some cases this is horribly
mis-used, also to use a whole set of them to represent the equivalent of NULL. Thus
I for sure don’t like not being able to tell NaN and NULL apart. Its a bit like some DBs
don’t even support NULL and you have to misuse a value for that…

BTW: Not having NULL in IEEE 754 is causing a lot of issues as well, since this is 
essentially the root cause for us having to initialise variables with something that does
not harm the initial conditions of an algorithm. 

Cheers,
Andreas
-----------------------------------------------------------
Professor of Data Intensive Research
International Centre for Radio Astronomy Research
The University of Western Australia
M468, 35 Stirling Highway
CRAWLEY WA 6009
Australia

Tel.: (+61) (0) 8 6488 7847

> On 13 Oct 2019, at 15:15, Mark Taylor <m.b.taylor at bristol.ac.uk> wrote:
> 
> On Sun, 13 Oct 2019, Markus Demleitner wrote:
> 
>> (2) ADQL needs at least an isnan operator?  Me, I think that the
>> semantic difference between SQL NULL and IEEE NaN is so slim that we
>> probably shouldn't map it in ADQL land.  At least, it's not
>> maintained in TABLEDATA and BINARY VOTable serialisations, which
>> would make any possible benefit on the ADQL side questionable.
>> Hence, I'd say we should discourage having NaNs in float-valued
>> columns and tell people to have proper NULLs there instead.  Matters
>> become murkier when ADQL grows actual support for arrays, though.
>> But let's talk about that when we properly specify arrays.
> 
> The idiom "x!=x" can be used to test for NaN-ness in some languages,
> that might be usable as an ADQL NaN test in absence of a specific ISNAN
> operator.
> 
>> (3) Just clarify that you may get back IEEE magic values from TAP
>> queries?  I guess that's more or less implied, except that, as I
>> said, NaN is not reliably representable in VOTables.  If that's what
>> you'd like to see, where would you make that clarification?
> 
> I take issue with "NaN is not reliably representable in VOTables".
> It is, but certain processing combinations may make it hard to
> distinguish between something that was a NaN and something that
> was a NULL a DB table that it came from.
> 
> Mark
> 
> --
> Mark Taylor   Astronomical Programmer   Physics, Bristol University, UK
> m.b.taylor at bris.ac.uk +44-117-9288776  http://www.star.bris.ac.uk/~mbt/