Spectral DM document update

Doug Tody dtody at nrao.edu
Mon Oct 9 09:30:12 PDT 2006


Hi All -

I agree with Jonathan that it would be good to hear from some other folks,
in particular anyone involved in writing spectral analysis applications.
Which is more natural for analysis, an array of points, or separate
wavelength (spectral coordinate), flux, and error vectors?


On Mon, 9 Oct 2006, Paul Harrison wrote:
> Still not convinced this is an argument for making the XML serialization more
> vector-like; it all depends on what your idea of an array is - you could
> still have all the abscissae, ordinates, errors etc. for a single spectral point
> localized in memory, and then a whole spectrum
>
> * in a modern OO language would be an array of "point objects"
> * in a C like language it could be an array of structs
> * even in FORTRAN it could be one big array, just with an index increment
>   greater than 1 to obtain the next element of the same physical meaning, e.g. Flux
> (assuming that all the values were stored as the same basic data type)

Of course we can do this, but I think the first thing most analysis
applications would do with an array of point structures
is extract the data vectors.  There are many reasons for this, e.g.,
plotting packages will want vectors, IDL and Python and IRAF (etc.) want
vectors, vector-based code is generic and independent of representation,
whereas a custom point structure will require specialized code, etc.
Is there some reason that "modern" languages don't like data vectors?
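
To make the extraction step concrete, here is a minimal Python sketch of
what an application handed an array of point objects would likely do first.
The SpectralPoint class, its field names, and the to_vectors helper are
hypothetical, invented only for illustration; nothing here is taken from
the Spectral DM or SSA documents.

```python
from dataclasses import dataclass


@dataclass
class SpectralPoint:
    # Hypothetical point structure; the field names are assumptions.
    wavelength: float
    flux: float
    error: float


def to_vectors(points):
    """Split a list of SpectralPoint objects into three parallel lists."""
    wavelength = [p.wavelength for p in points]
    flux = [p.flux for p in points]
    error = [p.error for p in points]
    return wavelength, flux, error


# Made-up sample values, purely for illustration.
points = [
    SpectralPoint(4000.0, 1.2, 0.1),
    SpectralPoint(4001.0, 1.5, 0.1),
    SpectralPoint(4002.0, 1.1, 0.2),
]

wavelength, flux, error = to_vectors(points)

# From here on the code is generic: it works on plain vectors and
# needs no knowledge of the SpectralPoint structure.
peak_flux = max(flux)
peak_wavelength = wavelength[flux.index(peak_flux)]
```

Only to_vectors depends on the point structure; everything after it would
work unchanged on vectors read directly from table columns.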


On Mon, 9 Oct 2006, Roy Williams wrote:
> On Oct 9, 2006, at 7:53 AM, Doug Tody wrote:
>> we are talking about bulk data arrays which can be thousands
>> of points in length.  Ignoring for the moment efficiency concerns, it
>> is just a whole lot easier to deal with them as a single vector-valued
>> data element.
>
> I seem to recall the same discussion with VOTable about 3 years ago -- why 
> not transpose the table, read it column by column so that it is more 
> efficient for the computers. Does anyone else remember the discussion, the 
> reasons why this was not adopted?
>
> If this is to be revisited, then I suggest doing so in the VOTable WG, since 
> the idea of transposing a table is much more general than representation of 
> spectra.

Both representations are possible; both are vector-based if we consider
a table column to be a vector (which we can, since table packages usually
provide a way to extract a table column as a vector).  In fact,
both representations are currently used in SSA.  The VOTable representation
stores one spectrum per table, with the data vectors in table columns
(or PARAMS if they are scalar).  The FITS bintable representation uses
one table row per spectrum, mainly for compatibility with existing 
FITS-based spectral formats and packages, which also take this approach.
The one-spectrum-per-row approach also permits multiple spectra per table,
which is common for storage of instrumental data.
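
As a rough illustration of the two layouts, the following Python sketch
models each one with plain dictionaries and lists.  It is not real VOTable
or FITS I/O: the column names, the "target" field, and the helper functions
column and row_vector are all assumptions made for the example.

```python
# Layout A (VOTable-style): one spectrum per table, with each data
# vector stored as a table column.
spectrum_table = {
    "wavelength": [4000.0, 4001.0, 4002.0],
    "flux": [1.2, 1.5, 1.1],
    "error": [0.1, 0.1, 0.2],
}

# Layout B (FITS bintable-style): one table row per spectrum, with each
# cell holding a whole vector, so one table can carry several spectra.
multi_spectrum_table = [
    {
        "target": "obj1",  # hypothetical per-spectrum scalar
        "wavelength": [4000.0, 4001.0, 4002.0],
        "flux": [1.2, 1.5, 1.1],
        "error": [0.1, 0.1, 0.2],
    },
    {
        "target": "obj2",
        "wavelength": [4000.0, 4001.0, 4002.0],
        "flux": [0.7, 0.9, 0.8],
        "error": [0.1, 0.1, 0.1],
    },
]


def column(table, name):
    """Layout A: extract a table column as a vector."""
    return list(table[name])


def row_vector(table, row, name):
    """Layout B: extract the vector stored in one cell of one row."""
    return list(table[row][name])


flux_a = column(spectrum_table, "flux")
flux_b = row_vector(multi_spectrum_table, 0, "flux")
```

Either way the application ends up with the same flux vector; the layouts
differ only in how many spectra one table can hold.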

 	- Doug


