VOTable for simulations

Gerard gerard.lemson at mpe.mpg.de
Tue Aug 29 10:01:58 PDT 2006


Hi Claudio,
Sorry for the late reply to this email. I'm Cc-ing the theory group as well.

I gather you are thinking of grid simulation data here, so this mail does
not apply to N-body data. For N-body I think we can use the VOTable spec as
it stands, in particular section 5.3 dealing with binary serialisation (see
http://www.ivoa.net/Documents/REC/VOTable/VOTable-20040811.pdf ).

In the case you address, would it make sense to try to mimic FITS in the
naming of keywords, so use NAXIS for rank, NAXIS1 for size0, NAXIS2 for
size1, etc. for the dimensions? If I am not mistaken VOTable itself is based
on the FITS binary table spec, so your proposal might be seen as a
translation of a FITS datacube (IMAGE). Did we not actually think about
using FITS as is for (uniform) grid simulations? In that case your proposal
could also be used, I guess, where instead of STREAM we'd have FITS as in
standard VOTable usage (though I don't know whether VOTable presumes that
the FITS file contains a table).
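
A minimal Python sketch of the keyword renaming suggested above, mapping
rank/size0/size1/... onto FITS-style NAXIS/NAXIS1/NAXIS2/... names. The
function name fits_axis_keywords is hypothetical. One assumption to check:
in FITS, NAXIS1 is the most rapidly varying axis in the data array, and the
original proposal does not say whether size0 plays that role.

```python
def fits_axis_keywords(sizes):
    """Map grid dimensions (size0, size1, ...) to FITS-style NAXIS keywords.

    Assumes size0 corresponds to NAXIS1; whether that is the fastest-varying
    axis in the binary stream is not stated in the proposal.
    """
    keywords = {"NAXIS": len(sizes)}  # NAXIS holds the rank
    for i, n in enumerate(sizes):
        keywords[f"NAXIS{i + 1}"] = int(n)  # FITS axis numbering starts at 1
    return keywords


print(fits_axis_keywords([41, 41, 41]))
# {'NAXIS': 3, 'NAXIS1': 41, 'NAXIS2': 41, 'NAXIS3': 41}
```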

I am not sure whether FITS images/datacubes allow multiple values per cell
(i.e. have an array size), but I don't think so. Otherwise we could probably
generalise in that direction.
Do you propose to follow the VOTable/FITS conventions on little- vs big-endian?
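
To illustrate why the byte order matters for a raw stream: to my knowledge
both FITS and the VOTable BINARY serialisation store numbers big-endian. The
Python sketch below uses the standard struct module; the helper names
pack_floats and unpack_floats are hypothetical, and 4-byte IEEE floats are
assumed (matching datatype="float" in the proposal).

```python
import struct


def pack_floats(values, big_endian=True):
    """Serialise values as 4-byte IEEE floats in the requested byte order."""
    prefix = ">" if big_endian else "<"
    return struct.pack(f"{prefix}{len(values)}f", *values)


def unpack_floats(raw, big_endian=True):
    """Read back as many whole 4-byte floats as the buffer contains."""
    prefix = ">" if big_endian else "<"
    count = len(raw) // 4
    return list(struct.unpack(f"{prefix}{count}f", raw[: count * 4]))


raw = pack_floats([1.0, 2.5])                 # written big-endian
print(unpack_floats(raw, big_endian=True))    # [1.0, 2.5]
print(unpack_floats(raw, big_endian=False))   # wrong values: byte order mismatch
```

A reader that does not know the byte order of the stream gets silently wrong
numbers rather than an error, which is an argument for fixing the convention
in the descriptor or in the spec.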

Cheers

Gerard



> -----Original Message-----
> From: Claudio Gheller [mailto:c.gheller at cineca.it]
> Sent: Thursday, July 20, 2006 12:37 PM
> To: Gerard Lemson; Ugo Becciani; Alessandro Costa; Marco Comparato; R.
> Smareglia
> Subject: VOTable for simulations
> 
> Dear friends,
> 
> I have tried to figure out the structure of a VOTable for simulated
> data. The result follows below.
> I made the following assumptions:
> 
> 1. data are binary
> 2. the binary file is a raw stream of bytes, with no structure (no FITS,
> no HDF...). It is external to the VOTable (at the moment I've not
> considered base64 conversion for performance reasons)
> 3. Each file has an XML descriptor associated. The descriptor at
> present gives only the information necessary to deal with the file.
> 4. Each file contains ONE variable. This is suggested for the following
> reasons
> - data rank and size can change from variable to variable.
> - complex description
> - The direct association XML header file - bin file - variable is
> easier to handle.
> - smaller files
> - files easier to handle by external applications (also not VO-compliant)
> - drawback: proliferation in the number of files
> However we can consider the support to more complex files or even
> formats, like FITS or HDF5. But let's start with something simple.
> 
> At this point I made the Snap program create binary files (at present
> still HDF5, but just for backward compatibility) and associated XMLs.
> For example:
> test.h5 ----> snapped data
> test.h5.xml ----> associated VOTable:
> 
> <?xml version="1.0"?>
> <VOTABLE xmlns:xsd="http://www.w3.org/2001/XMLSchema"
> xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
> xmlns="http://vizier.u-strasbg.fr/xml/VOTable-1.1.xsd">
>         <RESOURCE name="myTestResource">
>                 <TABLE name="BmTemperature" ID="MyTestTable" >
>                         <FIELD name="BmTemperature" ID="myTestObject"
> ucd="" datatype="float" arraysize="41x41x41" unit="Kelvin" />
>                         <PARAM name="rank" datatype="int" value="3"/>
>                         <PARAM name="size0" datatype="long" value="41"/>
>                         <PARAM name="size1" datatype="long" value="41"/>
>                         <PARAM name="size2" datatype="long" value="41"/>
>                         <DATA><BINARY>
>                         <STREAM href="file:///scratch/myhome/test.h5"/>
>                         </BINARY></DATA>
>                 </TABLE>
>         </RESOURCE>
> </VOTABLE>
> 
> Notice that the rank and size of the dataset are expressed in the
> arraysize attribute of FIELD. They are also written in the 4 PARAM elements.
> This is just to avoid parsing the string to get the basic info of
> rank and size and to have them directly as numbers (with their precise
> type). At present there are no UCD and no reference to the SNAP
> protocol, since both are not yet defined. I'm working on the latter...
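
Since the shape is stored twice (in arraysize and in the PARAMs), a reader
may want to check that the two agree. A minimal Python sketch using the
standard xml.etree.ElementTree module follows. The DESCRIPTOR string is a
simplified copy of the example above with the namespace declarations dropped
(an assumption made to keep the element lookups short), and grid_shape is a
hypothetical helper. It only handles fixed sizes such as "41x41x41", not
variable-length arraysize values containing "*".

```python
import xml.etree.ElementTree as ET

# Simplified copy of the descriptor above, without namespace declarations.
DESCRIPTOR = """<VOTABLE>
  <RESOURCE name="myTestResource">
    <TABLE name="BmTemperature" ID="MyTestTable">
      <FIELD name="BmTemperature" ID="myTestObject" ucd=""
             datatype="float" arraysize="41x41x41" unit="Kelvin"/>
      <PARAM name="rank" datatype="int" value="3"/>
      <PARAM name="size0" datatype="long" value="41"/>
      <PARAM name="size1" datatype="long" value="41"/>
      <PARAM name="size2" datatype="long" value="41"/>
    </TABLE>
  </RESOURCE>
</VOTABLE>"""


def grid_shape(table):
    """Return the grid shape, checking arraysize against the rank/sizeN PARAMs."""
    field = table.find("FIELD")
    from_arraysize = [int(n) for n in field.get("arraysize").split("x")]
    params = {p.get("name"): int(p.get("value")) for p in table.findall("PARAM")}
    from_params = [params[f"size{i}"] for i in range(params["rank"])]
    if from_arraysize != from_params:
        raise ValueError(f"arraysize {from_arraysize} != PARAMs {from_params}")
    return from_arraysize


table = ET.fromstring(DESCRIPTOR).find("RESOURCE/TABLE")
print(grid_shape(table))  # [41, 41, 41]
```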
> 
> This is the very first attempt!!! Let me know all your comments.
> Claudio
> 
> --
> ------------------------------------
> Dr. Claudio Gheller, Ph.D.
> High Performance System Division
> CINECA - Bologna - Italy
> Tel. +39-051-6171560
> Fax. +39-051-6137273
> ------------------------------------
