Nulls in VOTables in TAP
Francois Ochsenbein (ext.52429)
francois at cdsarc.u-strasbg.fr
Mon Jul 4 05:38:23 PDT 2011
Hi Tom,
I basically agree with all of Mark Taylor's answers:
* yes, VOTable was designed on the basis of FITS, not as
a DBMS subset -- NaN and a database 'null' are considered
the same thing, as they are in a FITS binary table; and
in the case of an array of floats/doubles in the <TABLEDATA>
serialization, a simple space can't work, hence the "NaN"
alternative to the empty <TD/>...
* yes, there is some confusion for booleans: the FITS
document indicates only the possibilities T, F and hex 00
(but hex 00 can't be used for an array in the <TABLEDATA>
serialization, a problem similar to that of NaN for doubles)
* for integers, no bit pattern is reserved for an undefined value.
Section 4.7 merely "suggests" using the value
-32768 for short integers.
In fact, the lowest integer values are frequently used as the
bit pattern for "null" integers (in two's complement, each of
them is its own negation); these values are:
-32768 (0x8000) for short int,
-2147483648 (0x80000000) for 32-bit integers,
-9223372036854775808 (0x8000000000000000) for longs
These values are those assigned by the GNU C compiler
(and Fortran, as far as I know) in instructions like
i = x
if x is a double with NaN value and i is an integer.
Unfortunately, it seems that Java does not use the same
convention: Double.shortValue(), intValue() and longValue()
all return zero as the integer corresponding to a
NaN double...
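
To illustrate the two points above, here is a minimal Java sketch.
The class name NullSentinels and the helper toIntOrNull are made up
for this example and are not part of any VOTable library. It shows
that a NaN double converts to 0 for all three integer types, that
the lowest values are their own negation, and one possible way of
mapping NaN onto the lowest-value convention by hand:

```java
// Illustrative only: names below are hypothetical, not a VOTable API.
class NullSentinels {
    // Lowest two's-complement values, as listed above.
    static final short NULL_SHORT = Short.MIN_VALUE;  // -32768 (0x8000)
    static final int NULL_INT = Integer.MIN_VALUE;    // -2147483648 (0x80000000)
    static final long NULL_LONG = Long.MIN_VALUE;     // -9223372036854775808

    // Hypothetical helper: send NaN to the sentinel instead of
    // Java's default result of 0; other values are truncated as usual.
    static int toIntOrNull(double x) {
        return Double.isNaN(x) ? NULL_INT : (int) x;
    }

    public static void main(String[] args) {
        Double nan = Double.NaN;

        // Java's narrowing conversion maps NaN to zero.
        System.out.println(nan.shortValue());  // 0
        System.out.println(nan.intValue());    // 0
        System.out.println(nan.longValue());   // 0

        // Each lowest value is its own negation.
        System.out.println((short) (-NULL_SHORT) == NULL_SHORT);  // true
        System.out.println(-NULL_INT == NULL_INT);                // true
        System.out.println(-NULL_LONG == NULL_LONG);              // true

        // Explicit mapping of NaN to the sentinel.
        System.out.println(toIntOrNull(Double.NaN));  // -2147483648
        System.out.println(toIntOrNull(3.7));         // 3
    }
}
```

A writer using such a mapping would still have to declare the chosen
value as the null for the field, since an actual data value equal to
the sentinel would otherwise be indistinguishable from a null.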
Cheers, francois
>
>My recent security issues have caused me to relook at some of the
>formatting options for VOTables and in doing so I've become a bit
>confused about how database nulls should be handled properly. It
>doesn't look like any VOTable representation can do a proper job of
>handling nulls as they appear in databases consistently with the
>recommendations of the VOTable standard.
>
>The TABLEDATA representation could do pretty well. It could in
>principle represent nulls for most types by having empty text in the
>appropriate TD element. This could work for all types except that it
>cannot distinguish between 0 length arrays and null arrays. Most
>databases allow for 0 length strings distinct from null strings so
>that's a bit of an issue but we can probably live with it. However
>the VOTable standard seems to suggest that using empty string values
>is not supported for anything other than boolean and float/complex
>data types. [The text is actually a bit confused here. E.g., at one
>point (4.7) it suggests that booleans will require a value attribute
>to specify a null, but later on (6) it describes how nulls should be
>represented for that type and makes the empty cell the default way.]
>
>E.g., if I have an 'int' field and represent the value of this field
>in some row with just <TD/> the interpretation of that value seems to
>be undefined by the standard.
>
>The VOTable standard also suggests conflating the ideas of null and
>NaN for floating point values. If I have a 'double' field, then the
>standard suggests that <TD/> should be interpreted as identical to
><TD>NaN</TD>. These are very distinct in the database world but it
>looks like this distinction may be lost when we return results using TAP.
>
>In the BINARY and FITS serializations there is no natural way to
>represent null values for any types. The only avenue is to use the
>value/null attribute. The conflation of null and NaN numbers is
>explicitly mandated.
>
>For all representations there is a significant penalty for the short
>integer types (bytes, shorts and ints), where collisions between null
>values and actual occurrences of any reserved value are likely.
>
>One solution for TAP services might be to promote integer types.
>E.g., if I have a short in the underlying database I could represent
>it as an int in TAP so that I can be assured of not having collisions
>in the VOTable response.
>
>However it's all pretty inelegant for me at least. Am I
>misunderstanding something here? As far as I can tell neither the
>ADQL nor TAP standards actually talk about null values (except that
>TAP notes in some cases that certain metadata values are null) so the
>VOTable standard is where the action is.
>
> Regards,
> Tom
=======================================================================
Francois Ochsenbein ------ Observatoire Astronomique de Strasbourg
11, rue de l'Universite 67000 STRASBOURG Phone: +33-(0)368 85 24 29
Email: francois at astro.u-strasbg.fr (France) Fax: +33-(0)368 85 24 17
=======================================================================