Handling data cubes in VO

Tue Dec 27 09:54:29 PST 2005

What is being described below is a WCS.  A WCS defines a mapping from some
physical or measurement coordinate system, such as a pixel array, to some
arbitrary, possibly nonlinear, world/user coordinate system.  The mapping
can be one-to-many, i.e., there can be multiple world systems defined
for the same physical system.  The physical system does not have to be a
pixel array, but normally it is some linear cartesian system.  For example,
one can associate a WCS with event coordinates in some detector coordinate
system, instead of with sampled pixels.  While a pixel array is regularly
sampled, this is merely a very important special case of a more general
formalism.  One can also define a WCS for irregularly sampled data.

A WCS is an object which one can associate with a data array, an event
stream, a graphics window, whatever.  The important thing is that the WCS
object can exist and be used independently of any associated data array.
An implementation provides methods for forward and reverse transforms
and the like.  (What Dave Berry described is an example of this, but there
are many others.)
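
As a rough illustration of "a WCS is an object", here is a minimal sketch
of a 1-D linear WCS in Python.  The class name and its methods are
hypothetical, not any existing library's API; the field names crpix, crval
and cdelt are borrowed from the FITS keywords only for familiarity.  It
shows forward and reverse transforms, and two world systems defined for
the same physical axis, with no data array involved.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinearAxisWCS:
    """Hypothetical 1-D linear WCS: world = crval + cdelt * (pixel - crpix)."""

    crpix: float  # reference position in the physical (pixel) system
    crval: float  # world coordinate at the reference position
    cdelt: float  # world units per pixel; assumed nonzero

    def forward(self, pixel: float) -> float:
        """Physical (pixel) coordinate -> world coordinate."""
        return self.crval + self.cdelt * (pixel - self.crpix)

    def reverse(self, world: float) -> float:
        """World coordinate -> physical (pixel) coordinate."""
        return self.crpix + (world - self.crval) / self.cdelt


# Two world systems for the same physical axis (the one-to-many case).
# The numbers are made up for illustration.
wavelength = LinearAxisWCS(crpix=1.0, crval=4000.0, cdelt=2.0)
velocity = LinearAxisWCS(crpix=1.0, crval=-500.0, cdelt=10.0)

print(wavelength.forward(11.0))   # world coordinate in the first system
print(velocity.forward(11.0))     # same pixel, second system
print(wavelength.reverse(4020.0)) # back to the physical coordinate
```

A real WCS would of course allow nonlinear and multidimensional mappings;
the point is only that the object is usable on its own.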

"Cutout" operations do not resample the data and are most naturally
expressed in physical coordinates, although world coordinates will work
as well if the WCS transformation is well defined.  When we talk about
"making the subscripts real" this is a reprojection, often defined in
world coordinates.  In general this involves resampling the data and there
are many ways in which this can be done.  The details of the interpolation
performed are more the domain of the application.
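
To make the distinction concrete, here is a small 1-D sketch in Python.
The function names are hypothetical, and linear interpolation is just one
possible scheme, chosen because it matches Roy's W(3.2) example quoted
below.  The cutout returns existing samples untouched; the "real
subscript" lookup computes a new value.

```python
def cutout(data, start, stop):
    """Return samples start..stop-1 unchanged; nothing is resampled."""
    return data[start:stop]


def sample_linear(data, x):
    """Value at fractional subscript x, by linear interpolation.

    Assumes len(data) >= 2 and 0 <= x <= len(data) - 1.
    """
    if not 0 <= x <= len(data) - 1:
        raise IndexError("subscript outside the array")
    i = min(int(x), len(data) - 2)  # left neighbour, clamped to last interval
    t = x - i                       # fractional distance from data[i]
    return (1.0 - t) * data[i] + t * data[i + 1]


W = [0.0, 10.0, 20.0, 30.0, 40.0, 50.0]

print(cutout(W, 2, 5))        # original samples 2, 3 and 4
print(sample_linear(W, 3.2))  # 0.8*W[3] + 0.2*W[4], up to rounding
```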

For a concrete example consider the case of a 2D slice through a 3D cube
at an arbitrary position and angle.  This is a 2D reprojection expressed in
3D coordinates.  The scale can be changed at the same time.  This operation
can be specified by defining the parameters of the WCS of the data the
client wants to get back, plus possibly some image generation parameters
such as the "naxis" array, e.g. to ensure that the image returned is 2D.

In terms of implementation, what the service does for this use-case is
define a transformation from the WCS of the client-specified output image
(the 2D slice) to the pixel array of the cube.  This transformation is
a new WCS, computed on the fly by the service to perform the resampling,
and then discarded.  For each pixel in the output 2D image (the slice) the
new WCS transformation defines the fractional pixel coordinates within the
3D cube.  The 3D interpolation scheme of choice is then used to compute
the resampled pixel.  These are complex operations, but software already
exists to do this sort of thing.  It is not specific to VO.
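
The steps above can be sketched in a few lines of Python.  This is only
an illustration under simplifying assumptions, not how any particular
service does it: the cube is a nested list indexed cube[z][y][x], the
on-the-fly transformation is reduced to a plane given directly in cube
pixel coordinates (origin plus two step vectors), and trilinear
interpolation stands in for "the 3D interpolation scheme of choice".
All function and parameter names are hypothetical.

```python
def trilinear(cube, x, y, z):
    """Interpolate cube[z][y][x] at fractional pixel coordinates.

    Assumes every axis has at least 2 samples.  Returns None outside the cube.
    """
    nz, ny, nx = len(cube), len(cube[0]), len(cube[0][0])
    if not (0 <= x <= nx - 1 and 0 <= y <= ny - 1 and 0 <= z <= nz - 1):
        return None
    # Lower corner of the enclosing cell, clamped so the upper corner exists.
    x0 = min(int(x), nx - 2)
    y0 = min(int(y), ny - 2)
    z0 = min(int(z), nz - 2)
    tx, ty, tz = x - x0, y - y0, z - z0
    value = 0.0
    for dz, wz in ((0, 1 - tz), (1, tz)):
        for dy, wy in ((0, 1 - ty), (1, ty)):
            for dx, wx in ((0, 1 - tx), (1, tx)):
                value += wz * wy * wx * cube[z0 + dz][y0 + dy][x0 + dx]
    return value


def slice_cube(cube, origin, u_step, v_step, naxis):
    """Resample a 2-D plane out of a 3-D cube.

    Output pixel (i, j) maps to the cube pixel coordinates
    origin + i * u_step + j * v_step, each an (x, y, z) triple.
    Step lengths other than 1 change the scale.  naxis = (n_i, n_j)
    is the size of the output image.
    """
    n_i, n_j = naxis
    image = []
    for j in range(n_j):
        row = []
        for i in range(n_i):
            x, y, z = (origin[k] + i * u_step[k] + j * v_step[k]
                       for k in range(3))
            row.append(trilinear(cube, x, y, z))
        image.append(row)
    return image


# Test cube whose value is a linear function of position, so the
# interpolated result is easy to check by hand.
cube = [[[x + 10.0 * y + 100.0 * z for x in range(4)]
         for y in range(4)] for z in range(4)]

# A plane tilted 45 degrees in x-y, sampled every 0.75 pixel in z.
image = slice_cube(cube, origin=(0.5, 0.0, 0.0),
                   u_step=(0.5, 0.5, 0.0), v_step=(0.0, 0.0, 0.75),
                   naxis=(3, 3))
for row in image:
    print(row)
```

In a real service the plane would come from composing the inverse of the
cube's WCS with the WCS the client asked for, rather than being given in
pixel coordinates.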

To summarize, the WCS concept is central to dealing with things like cubes.
We also need the "cutout" and "reprojection" operations.  A regularly
sampled data cube is a 3D "image", where an "image" is just a
multidimensional array with
associated metadata.  Given these concepts we can slice and dice cubes
(or 2D sky projection images) with as much or as little control over the
operation as desired.

Do not confuse the concrete cases of the FITS image and FITS WCS with these
more powerful underlying concepts.  Although the FITS implementations may
appear crude, and tend to conflate the data model and the serialization,
the concepts and underlying data models are the right ones.    - Doug

------
On Mon, 26 Dec 2005, Roy Williams wrote:
> On Dec 26, 2005, at 1:11 PM, Arnold Rots wrote:
>> Part of this, especially item 5, is really a DM discussion.
>> David Berry proposed that the relation between physical coordinate
>> axes and pixel coordinates is really a transformation between two
>> separate coordinate systems (and provided an implementation).
>
> There is as always a decision about where to stop on the road to complexity. 
> The first stage is to make the subscripts real with linear interpolation, so 
> that W(3.2) is defined to be 0.8*W(3)+0.2*W(4). Next we could offer a choice 
> of interpolation schemes. Next we can identify a transformation to a physical 
> coordinate (linear and log axes etc). We can have subscripts for both 
> independent and dependent quantities -- irregularly spaced data.
>
> I guess I would want to keep things as simple as possible, get a solid 
> implementation and users, and only add sophistication when a real genuine use 
> case -- or two or three -- scream their heads off and demand more complexity. 
> That's just how I would do it.
>
> Roy
>
> California Institute of Technology
> 626 395 3670
>