[SPECTRA] Some thoughts on the spectral model
Jonathan McDowell
jcm at head-cfa.cfa.harvard.edu
Wed Sep 17 10:40:22 PDT 2003
As those of you who were at the recent NVO meeting know, I have been
taking a look at the datasets referred to in the spectral use survey led
by Doug Tody. Here I note a grab-bag of things that came out of this
study that we should keep in mind for the spectral data model. In general,
the model for spectra presented earlier covers most of these cases, but
doesn't have a good place to put things like exposure and response
information.
1) The importance of describing clearly the observable (pixel value)
was reemphasized by the study. More cases of observables
to add to the list:
- Antenna temperature (e.g. SWAS)
- Ratio of two objects (e.g. Arcturus over telluric)
(Does this need extra metadata?)
2) We also need, separately I believe, to describe corrections
made to the observable that do not change its units or overall
interpretation:
- absorption (atmosphere, galactic, ...)
- fit, model, ...
- continuum-subtracted
- lines removed
3) I note that spectra versus wavenumber are in the archives
(e.g. Arcturus) so we do need to make sure we support this
in the bandpass/frequency object.
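
To make the wavenumber case concrete, here is a minimal sketch of the conversions a bandpass/frequency object would need to support. It assumes wavenumber is given in cm^-1 and refers to vacuum wavelength; the function names are hypothetical and not part of any proposed model.

```python
# Speed of light in cm/s (exact by definition of the metre).
C_CM_PER_S = 2.99792458e10


def wavenumber_to_wavelength_angstrom(k_per_cm):
    """Convert wavenumber (cm^-1) to vacuum wavelength in Angstrom."""
    if k_per_cm <= 0:
        raise ValueError("wavenumber must be positive")
    # lambda [cm] = 1 / k, and 1 cm = 1e8 Angstrom
    return 1.0e8 / k_per_cm


def wavenumber_to_frequency_hz(k_per_cm):
    """Convert wavenumber (cm^-1) to frequency in Hz."""
    if k_per_cm <= 0:
        raise ValueError("wavenumber must be positive")
    # nu = c / lambda = c * k
    return C_CM_PER_S * k_per_cm


# Example: 10000 cm^-1 corresponds to a wavelength of 1 micron.
wavelength = wavenumber_to_wavelength_angstrom(10000.0)
frequency = wavenumber_to_frequency_hz(10000.0)
```

Note that a spectrum evenly sampled in wavenumber is not evenly sampled in wavelength, so the model should store the native coordinate rather than a converted one.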
4) NOAO Arc Lamp spectra: what metadata should these have to
characterize them? In general, what metadata should calibration
data have? Should this also handle real observation data that
are used as calibration? For example, a spectrum of a standard star
might have calibration metadata saying 'this is a
template for a K2IV star' as well as the usual this-is-just-data
metadata, and an arc lamp should have metadata saying 'this is a KPNO
HeNeAr lamp covering the following range'. How structured should
this metadata be?
5) A lot of archives include spectral line identification tables,
with catalogs of lines each with parameters like EW. I propose
- as I think we agreed at Cambridge - that such data does not
fall under the purview of 'spectrum'; it is a separate object
- possibly a special case of, or spectral analog to, 'source list',
possibly with a standard method to convert it to a spectrum object.
The counter position would be to say that it is just a funny
way to store a spectrum, with no continuum, but I think the
extra metadata associated with specific atomic lines argues
against this.
6) Some data are stored with several different spectra versus
the same wavelength axis, e.g. a table with 4 columns,
lambda, spec1, spec2, spec3
In some cases the spectra refer to different objects,
in others to different corrections (data, error, bitmask),
and often to different observables (data1, data2, ratio).
Should the spectral model treat these as a spectral array
(a vector-valued spectrum with a single wavelength axis)
or as an array of spectra (n-1 different spectrum objects,
replicating the wavelength info for each one)?
- In the case of error and bitmask, these are tightly related
to the actual data column and will have explicit places to
live in the model.
- In the case of two objects and their ratio, I believe
interoperability will be better served by making data providers
expose these to the VO as three different spectra. The
cost is that applications will have to do work to realize that
the spectra have compatible wavelength axes. But particularly
in the case of the ratio, where the units are different
(dimensionless instead of flux) handling them as a vector could be
messy.
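
As a rough illustration of the work an application would have to do to realize that separately exposed spectra have compatible wavelength axes, here is a minimal sketch. The function name and the tolerance are assumptions chosen for illustration; a real tool would also have to check units and the spectral coordinate type.

```python
import math


def axes_compatible(wave_a, wave_b, rel_tol=1e-9):
    """Return True if two wavelength axes have the same length and
    agree point by point within a relative tolerance."""
    if len(wave_a) != len(wave_b):
        return False
    return all(
        math.isclose(a, b, rel_tol=rel_tol)
        for a, b in zip(wave_a, wave_b)
    )


# Hypothetical example: two objects and their ratio, exposed as three
# spectra that each carry a copy of the same wavelength axis.
lam = [4000.0, 4001.0, 4002.0]
spec1_axis = list(lam)
spec2_axis = list(lam)
ratio_axis = list(lam)
shifted_axis = [w + 0.5 for w in lam]

same = axes_compatible(spec1_axis, spec2_axis) and axes_compatible(spec1_axis, ratio_axis)
different = axes_compatible(spec1_axis, shifted_axis)
```

The check is cheap compared with the cost of forcing a flux and a dimensionless ratio into one vector-valued object.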
7) Although not different from the data model's point of view,
I'm particularly concerned by the SDSS spectral FITS files.
These have an n x 4 FITS image, where n is the number of
wavelength points and the 4 layers are different observables
(data, continuum-subtracted data, error, mask). This breaks
the FITS paradigm (e.g. BUNIT is meaningless since the 4th
plane has different units from the other 3) while using four
n x 1 images would have been perfectly legal FITS allowing
use of meaningful metadata. SDSS is by no means the only offender
in this respect. Data providers will have to take particular
care in describing such datasets to the VO, and I suspect the
data will have to be reformatted before being sent over the
wire if the generic VO tools that will be developed to operate on
spectra are to have a chance of swallowing these data.
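
The kind of reformatting I have in mind can be sketched as follows: split the stacked image into separate one-dimensional arrays, each carrying its own name and unit. This is only a sketch. It assumes the image has already been read into memory as four rows of n values, and the plane names and unit strings below are placeholders, not the actual SDSS header values.

```python
def split_planes(image, plane_info):
    """Split a stacked image (a list of equal-length rows) into one
    record per plane, attaching a name and unit to each.

    plane_info is a list of (name, unit) pairs, one per row.
    """
    if len(image) != len(plane_info):
        raise ValueError("need one (name, unit) pair per plane")
    n = len(image[0]) if image else 0
    records = []
    for row, (name, unit) in zip(image, plane_info):
        if len(row) != n:
            raise ValueError("all planes must have the same length")
        records.append({"name": name, "unit": unit, "values": list(row)})
    return records


# Hypothetical 3-point example with placeholder units.
image = [
    [10.0, 12.0, 11.0],   # data
    [0.5, 2.5, 1.5],      # continuum-subtracted data
    [0.3, 0.3, 0.4],      # error
    [0, 0, 1],            # mask (different units from the other three)
]
plane_info = [
    ("data", "flux_unit"),
    ("continuum_subtracted", "flux_unit"),
    ("error", "flux_unit"),
    ("mask", ""),
]
planes = split_planes(image, plane_info)
```

Each record can then be described to the VO with meaningful per-array metadata, which a single BUNIT for the whole image cannot provide.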
- Jonathan