Handling data cubes in VO

Doug Tody dtody at nrao.edu
Wed Dec 14 14:06:39 PST 2005


A small group of us met to discuss how to handle data cubes in VO, prompted
by a query from Arecibo on how to publish data from an upcoming HI
survey to the VO.  The conclusions from the meeting are summarized below.

In short, since a data cube is a type of regularly-gridded pixel array, it
is probably best handled as an image, by extending SIA to handle 3D images.
Whole image access is generally impractical since the cubes are so large; 2D
slices are occasionally useful but are generally not adequate for analysis.
Hence the most common type of access is likely to be a 3D subset of the
cube data, produced either as a cutout or by resampling.  Since cubes
can be very large they may actually be stored as multiple data files
in an archive, with the cutout generated from pieces of multiple files.
In the case of fully processed cubes from a radio survey the Z-axis of
the cube is most likely to be some form of velocity, hence the ability to
query by velocity (relative to some specified reference frame) is important.
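
To make the velocity query concrete, here is a minimal sketch of how a
cutout service might turn a requested velocity range into a range of
channels along the Z-axis.  It assumes a linear velocity axis described
by the usual FITS keywords CRVAL3, CDELT3 and CRPIX3; the function name
and its arguments are hypothetical and not part of SIA, and conversion
between reference frames is not handled.

```python
import math


def velocity_range_to_channels(vmin, vmax, crval3, cdelt3, crpix3, nz):
    """Return (z0, z1, new_crpix3) for a velocity cutout.

    Assumes a linear axis: v = crval3 + (p - crpix3) * cdelt3, where p is
    the 1-based FITS pixel coordinate.  z0 and z1 are 0-based, inclusive
    channel indices; new_crpix3 is the reference pixel of the cutout.
    """
    if cdelt3 == 0:
        raise ValueError("cdelt3 must be non-zero")

    def to_index(v):
        # Fractional 0-based channel index of velocity v.
        return (v - crval3) / cdelt3 + crpix3 - 1.0

    # cdelt3 may be negative, so order the two indices before rounding.
    lo, hi = sorted((to_index(vmin), to_index(vmax)))

    # Keep only channels whose centers lie inside the requested range,
    # clipped to the extent of the cube.
    z0 = max(math.ceil(lo), 0)
    z1 = min(math.floor(hi), nz - 1)
    if z0 > z1:
        raise ValueError("velocity range does not overlap the cube")

    # Dropping z0 leading channels shifts the reference pixel by z0.
    return z0, z1, crpix3 - z0
```

With the cube held as an array indexed [z, y, x], the 3D cutout would
then be something like cube[z0:z1 + 1, y0:y1, x0:x1], with CRPIX3 in the
output header replaced by the returned value.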

A typical use-case would be for the user to use a tool such as the Karma
kpvslice, running interactively on the user's workstation, to visualize
data coming from a 3D cutout or resampling service running remotely on the
data server.  VO-Client tools could be used to locate and retrieve the data.

Comments on this analysis are welcome.  One conclusion is that it is a
priority to address 3D data in the next version of SIA.   - Doug


---------- Forwarded message ----------
Date: Wed, 14 Dec 2005 12:56:18 -0700 (MST)
From: Doug Tody <dtody at nrao.edu>
To: Roy Williams <roy at cacr.caltech.edu>
Cc: Steven Gibson <gibson at naic.edu>, John Benson <jbenson at nrao.edu>,
     Arnold Rots <arots at head.cfa.harvard.edu>
Subject: Re: VO for exposing Arecibo data

For the record, some notes from our meeting:

     o	SGPS (ATCA/Parkes) and CGPS are some good current examples of
 	radio spectral cube data of the sort we need to deal with.

 	Interestingly, at NRAO we don't have much in the way of data cubes
 	to publish to the VO.  It is more common to have "multi-band"
 	data with 3-4 samples (e.g., Stokes I, Q, U) in the Z image axis.
 	These are represented as 3D FITS images but are really more multi-band
 	data than a true 3D observation (in SIA we would probably represent
 	them as 3-4 2D images forming a logical group).  Spectral line
 	data from VLA/VLBA, or OTF scans from GBT can produce cubes,
 	but at present generally only the PI sees this data.

 	Most radio cube data we are likely to need to deal with has XY
 	as the spatial axes and Z as the spectral axis.  Most commonly
 	the observable is velocity in some defined standard of rest.
 	Frequency or wavelength is also seen but mainly for observational
 	data.  (Hence being able to query by velocity is quite important
 	for this data.)

     o	In general true 3D cubes from modern instruments are impractical to
     	retrieve over the network.

     o	By far the most important form of access appears to be some form of
 	cutout.  We can either cut out a smaller 3D cube, or dimensionally
 	reduce the data to produce 2D slices aligned to the image axes.
 	The ability to resample or reproject the data is also important.
 	Both of these cases represent 3D generalizations of what is
 	already done in SIA.

     o	2D visualization of cubes is not generally very useful.  The most
 	common use case is to pull out a smaller 3D cube and visualize
 	or analyze it locally using 3D tools such as those Karma provides.

     o	The ability to handle 3D data should be a priority for the next
     	version of SIA.  Cube data is most naturally dealt with as a type
 	of "image" data.

     o	There are use cases where cubes with time on the Z axis are also
     	important (the spectral axis and the time axis can both have
 	arbitrarily many samples, as can the spatial axes).
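
The distinction drawn above between "multi-band" 3D images and true
cubes could be sketched roughly as follows.  This is only an
illustration: the ImageRecord type, the describe_dataset function and
the threshold of 4 samples are assumptions made for the example, not
anything defined by SIA.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Assumption for this sketch: a Z axis with at most this many samples is
# treated as multi-band data rather than as a true cube.
MAX_BAND_SAMPLES = 4


@dataclass
class ImageRecord:
    """Hypothetical description of one image a service could offer."""
    naxes: int                # 2 for a plane, 3 for a cube
    shape: Tuple[int, ...]    # (nx, ny) or (nx, ny, nz)
    label: str                # e.g. a Stokes parameter, or "cube"
    group: str                # logical group the image belongs to


def describe_dataset(name: str, nx: int, ny: int, nz: int,
                     z_labels: Optional[List[str]] = None) -> List[ImageRecord]:
    """Describe a 3D image as a group of 2D images or as a single cube."""
    if nz <= MAX_BAND_SAMPLES:
        # Multi-band data: one 2D image per Z sample, all in one group.
        labels = z_labels or [f"plane{i}" for i in range(nz)]
        if len(labels) != nz:
            raise ValueError("need exactly one label per Z sample")
        return [ImageRecord(2, (nx, ny), label, name) for label in labels]
    # True cube: a single 3D image, to be accessed mainly via cutouts.
    return [ImageRecord(3, (nx, ny, nz), "cube", name)]
```

For example, describe_dataset("field1", 512, 512, 3, ["I", "Q", "U"])
yields three 2D records in the group "field1", while a 256-channel
spectral cube of the same size yields a single 3D record.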


   - Doug


