Use case analysis for handling data cubes in VO

Doug Tody dtody at nrao.edu
Mon Feb 27 14:52:56 PST 2006


Hi Anita -

Thanks for the time series use-cases; the maser movie in particular is
interesting.

By "uniformly sampled" I did not mean to imply that the data has to be
regularly sampled, with equally-spaced sample bins.  At least, not in what
is typically the third axis of the cube.  As you and others have pointed
out, spectral channels/bands or time samples are often irregularly spaced.

By uniform all I meant is that the data is sampled in a consistent fashion
and that it fits into the basic "image" data model, which is an N-D numeric
array of a single datatype, with an associated WCS.  The samples can be
irregularly spaced so long as this can be described by the WCS.  It is
straightforward to describe a 1D axis which is irregularly sampled; in
the worst case one merely gives the world coordinate (WC) value of each
sample.  It is much harder if data is irregularly sampled within two
coupled axes (a 2D plane); however, we can probably ignore this case.
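
To make the worst case concrete, here is a minimal sketch of a 1D axis
described by a per-sample lookup table.  This is only an illustration, not
part of any VO or FITS specification: the function names, the 0-based pixel
convention, the linear interpolation between samples, and the epoch values
are all assumptions.

```python
from bisect import bisect_right


def pixel_to_world(table, pixel):
    """Map a (possibly fractional) 0-based pixel coordinate to a world value.

    `table` holds one world-coordinate value per sample (at least two);
    values between samples are linearly interpolated (an assumption).
    """
    if not 0 <= pixel <= len(table) - 1:
        raise ValueError("pixel outside the tabulated axis")
    i = min(int(pixel), len(table) - 2)
    frac = pixel - i
    return table[i] + frac * (table[i + 1] - table[i])


def world_to_pixel(table, world):
    """Inverse mapping; assumes the table is strictly increasing."""
    if not table[0] <= world <= table[-1]:
        raise ValueError("world value outside the tabulated axis")
    i = min(bisect_right(table, world) - 1, len(table) - 2)
    return i + (world - table[i]) / (table[i + 1] - table[i])


# Hypothetical observation times (days) for an irregularly sampled time axis.
epochs = [0.0, 1.0, 4.0, 4.5, 10.0]

print(pixel_to_world(epochs, 1.5))   # halfway between the 2nd and 3rd epochs
print(world_to_pixel(epochs, 2.5))   # fractional pixel for day 2.5
```

The array itself stays a plain regular grid; only the axis description
carries the irregular spacing.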


On Thu, 23 Feb 2006, Anita Richards wrote:
> TopCat (www.starlink.ac.uk/topcat) now offers a number of 3D visualisation 
> options for tabular data which recognises celestial coordinates etc. My 
> feeling is that as we go to more or less than 2D data the distinction between 
> tabular data and 'image' data is more and more blurred. On the one hand, many 
> 1D spectra and time series are simply tables of freq (or time etc.) v. flux 
> or other observable, possibly plus uncertainties. On the other hand, 3+D 
> maser data or other observations of collections of compact sources are often 
> presented as tables with headings something like
> TIME RA DEC Bmaj BMin BPA VEL Iflux Qflux UFlux Vflux (errors.. ...)

I agree, especially when it comes to visualization and analysis - data
could be represented in either array or tabular form and still visualized
in the same fashion.  The distinction is still important, however, at the
level of data representation, particularly if we are trying to represent
and process bulk data efficiently.
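
A rough sketch of why the representation matters for bulk data: the same
samples held as a dense array (coordinates implied by per-axis descriptions)
and as a table (coordinates repeated in every row).  The cube shape, axis
values, and the helper name cube_to_rows are hypothetical, and counting
stored numbers is only a crude proxy for storage cost.

```python
def cube_to_rows(values, shape, axes):
    """Expand a dense row-major 3-D cube into one table row per sample.

    values: flat list of samples, last axis varying fastest
    shape:  (n0, n1, n2)
    axes:   three lists giving the world coordinate of each index
    """
    n0, n1, n2 = shape
    rows = []
    for i in range(n0):
        for j in range(n1):
            for k in range(n2):
                flux = values[(i * n1 + j) * n2 + k]
                rows.append((axes[0][i], axes[1][j], axes[2][k], flux))
    return rows


# Hypothetical 2 x 3 x 4 cube (time, velocity, position), stored row-major.
shape = (2, 3, 4)
values = [float(v) for v in range(24)]
axes = ([0.0, 7.5], [-1.0, 0.0, 1.0], [10.0, 20.0, 30.0, 40.0])

rows = cube_to_rows(values, shape, axes)

# Numbers stored by each form: samples plus axis descriptions for the array,
# versus every coordinate repeated in every row for the table.
array_numbers = len(values) + sum(len(a) for a in axes)
table_numbers = sum(len(r) for r in rows)
print(array_numbers, table_numbers)
```

The table form wins when the data is sparse (a few compact sources in a
large volume); the array form wins when the volume is fully sampled.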

> CHARACTERISATION
> These data have at least 5 types of axes, some of which are themselves 
> multiple (Space, Time, Frequency, Velocity, Polarization).

Yes, I agree that in the most general case we have to deal with all of
this, although velocity is a derived quantity and not a true physical
measure.

> Access modes
> Spectrum extraction - add - or an analogous extraction parallel to any other 
> axis, e.g. to produce a light curve or variability with time.

I agree we should mention this as well.  SSA will already allow extraction
of a time series as well as a 1D spectrum or SED.
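
Extraction parallel to an arbitrary axis can be sketched as below.  This is
not how SSA is specified; it only illustrates the operation on a dense
row-major array, and the function names and the tiny test cube are
hypothetical.

```python
def strides_for(shape):
    """Row-major element strides for an N-D array of the given shape."""
    strides = [1] * len(shape)
    for d in range(len(shape) - 2, -1, -1):
        strides[d] = strides[d + 1] * shape[d + 1]
    return strides


def extract_along(values, shape, axis, fixed):
    """Return the 1-D run of samples along `axis`.

    Every other axis is pinned to the index given in `fixed`; the entry
    of `fixed` at position `axis` is ignored.
    """
    strides = strides_for(shape)
    base = sum(idx * strides[d] for d, idx in enumerate(fixed) if d != axis)
    return [values[base + i * strides[axis]] for i in range(shape[axis])]


# Hypothetical cube with axes (time, frequency, position), stored row-major.
shape = (2, 3, 4)
values = list(range(24))

light_curve = extract_along(values, shape, axis=0, fixed=(0, 1, 2))
spectrum = extract_along(values, shape, axis=1, fixed=(1, 0, 3))
print(light_curve, spectrum)
```

A spectrum and a light curve are then the same operation with a different
choice of axis.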

> We also need to consider extracting  3 dimensions from >3D data sets, 
> including collapsing higher number axes - e.g. the collapse of each epoch's 
> frequency axis in the maser movie example above.

This is the general dimensional reduction / projection problem.  We need
to be able to view N-D data in subsets of fewer dimensions.
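
One simple form of dimensional reduction is collapsing a single axis, as
in the maser movie case where each epoch's frequency axis is summed.  A
minimal sketch, assuming a dense row-major array; the name collapse_axis
and the choice of sum as the default combining function are illustrative
assumptions only.

```python
from itertools import product


def collapse_axis(values, shape, axis, combine=sum):
    """Collapse one axis of a row-major N-D array.

    Returns (new_values, new_shape) for the resulting (N-1)-D array, where
    each output sample is `combine` applied to the samples along `axis`.
    """
    strides = [1] * len(shape)
    for d in range(len(shape) - 2, -1, -1):
        strides[d] = strides[d + 1] * shape[d + 1]

    kept = [d for d in range(len(shape)) if d != axis]
    out = []
    # product() varies the last kept axis fastest, so output is row-major.
    for index in product(*(range(shape[d]) for d in kept)):
        base = sum(i * strides[d] for i, d in zip(index, kept))
        out.append(
            combine(values[base + k * strides[axis]] for k in range(shape[axis]))
        )
    return out, [shape[d] for d in kept]


# Hypothetical cube with axes (epoch, frequency, position).
shape = [2, 3, 4]
values = list(range(24))

# Sum over frequency: one 1-D position profile per epoch.
summed, summed_shape = collapse_axis(values, shape, axis=1)
print(summed_shape, summed)
```

Repeating the call reduces the dimensionality further, and passing max or
another function as `combine` gives other projections.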

> The other common requirement (especially for ALMA!) is to convert velocity 
> conventions and to generate a velocity axis in response to a rest frequency 
> and velocity convention.

Yes.  How best to handle this requires more thought.  It is tempting
to convert velocity to frequency and work only with frequency at the
lowest level, to avoid having to deal with multiple spectral line rest
frequencies and velocity conventions at all levels.
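
If frequency is kept at the lowest level, a velocity axis can be generated
on demand from a rest frequency and a convention.  The sketch below uses
the standard radio, optical, and relativistic Doppler definitions; the
function names and the convention strings are assumptions, and reference
frame corrections (LSR, barycentric, etc.) are deliberately left out.

```python
import math

C_KM_S = 299792.458  # speed of light in km/s (exact by definition)


def freq_to_velocity(freq, rest_freq, convention="radio"):
    """Velocity in km/s for an observed frequency; positive means receding.

    freq and rest_freq must be in the same units.
    """
    if convention == "radio":
        return C_KM_S * (rest_freq - freq) / rest_freq
    if convention == "optical":
        return C_KM_S * (rest_freq - freq) / freq
    if convention == "relativistic":
        f2, r2 = freq ** 2, rest_freq ** 2
        return C_KM_S * (r2 - f2) / (r2 + f2)
    raise ValueError("unknown velocity convention: %r" % convention)


def velocity_to_freq(velocity, rest_freq, convention="radio"):
    """Inverse of freq_to_velocity for the same convention."""
    beta = velocity / C_KM_S
    if convention == "radio":
        return rest_freq * (1.0 - beta)
    if convention == "optical":
        return rest_freq / (1.0 + beta)
    if convention == "relativistic":
        return rest_freq * math.sqrt((1.0 - beta) / (1.0 + beta))
    raise ValueError("unknown velocity convention: %r" % convention)


# Illustrative rest frequency: the 22.23508 GHz water maser line.
REST_GHZ = 22.23508

# Convert a radio-convention velocity to optical by going through frequency.
f = velocity_to_freq(100.0, REST_GHZ, "radio")
print(freq_to_velocity(f, REST_GHZ, "optical"))
```

Converting between conventions is then just a round trip through
frequency, which is the appeal of storing only frequency.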

> Generic data set discovery
> I will try and compare the proposals against some specific use cases and come 
> up with words for this section and elsewhere in the light of comments on what 
> of the above is relevant/suitable at this stage.

That would be great, thanks.

 	- Doug


