spectral data representations for the NVO

Frank Valdes valdes at noao.edu
Mon Apr 28 14:58:34 PDT 2003


    Incorporating Spectra in the Next Phase of the Virtual Observatory
		    Francisco Valdes, NOAO
			April 28, 2003


Abstract

For the most part astronomical images and spectra are both projections of a
four dimensional observational parameter space.  The four parameters are
two celestial coordinates, photon energy, and time.  The suggestion
presented here is that accessing these common forms of astronomical
observational data should be based on this four dimensional parameter
space.  The prototype for this would be a fairly direct extension of the
current "Simple Image Access Prototype" (SIAP) specification from the
current two dimensional parameter model.


Introduction

The two most common types of astronomical observations are images and
spectra.  Because spectroscopic instrumentation is generally different from
imaging, though many spectrometers/spectrographs include imaging modes, an
artificial distinction is made between images and spectra.  The issue is
further confused because spectral information is sometimes obtained by
multiplexing photon energies into spatial positions on a detector.  These
factors often lead to separate treatment for the two.

Conceptually, a majority of astronomical observational data consist of
measurements of photons arriving from a particular direction on the sky,
with a particular energy, and at a particular time.  Sometimes this
information is recorded directly in so-called event lists.  Other times
the events are binned to produce an array or raster which is implicitly
or explicitly four dimensional.

This picture of astronomical observations leads us to consider this class
of data as defined by four parameters: two celestial coordinates, energy,
and time.  Note we refer to the spectral information in terms of photon
energy though wavelength or frequency could also be used as appropriate.
The discussion which follows does not expand on the time aspect of the
parameter space.  So the approach described here could also be defined to
consider a three dimensional space without the time element.  Time was
included, however, because it is a clearly identifiable aspect of the
observation model.

My vision for the VO access layer to observational astronomical data is
that the instrumental signatures and characteristics are removed by the
provider apart from the resolution or binning.  This is an important
requirement for dealing with spectra since the raw instrumental data can be
in quite complex formats with spatial and spectral information multiplexed
onto a detector.

The question addressed here is whether spectral data can be easily
incorporated in the current VO developmental framework; in particular,
whether the "Simple Image Access Prototype Specification" (SIAP) might be
extended or whether another prototype is needed for spectra.  In order to
consider a modest extension of SIAP, which is about raster data, the
observational data are also required to be binned in the four dimensional
parameter space.  Event data can be accessed through such a model by
requiring that the data provider bin the data for VO access through such a
raster protocol.
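
As a minimal sketch of the kind of binning a provider might do, the
following Python accumulates an event list into a sparse four dimensional
raster.  The event layout (ra, dec, energy, time), the bin sizes, and the
function name are assumptions made for illustration and are not part of
SIAP.  The sketch also ignores spherical geometry and simply bins the
coordinates as given.

```python
from collections import Counter
from math import floor


def bin_events(events, origin, bin_size):
    """Count events per 4D cell.

    events   : iterable of (ra_deg, dec_deg, energy_kev, time_s) tuples
    origin   : 4-tuple giving the lower corner of cell (0, 0, 0, 0)
    bin_size : 4-tuple of cell widths, in the same units as the events
    Returns a Counter mapping logical index (i, j, k, l) -> counts.
    """
    raster = Counter()
    for event in events:
        index = tuple(
            floor((value - start) / width)
            for value, start, width in zip(event, origin, bin_size)
        )
        raster[index] += 1
    return raster


# Hypothetical events: two fall in the same cell, one in another.
events = [
    (10.01, -5.02, 1.2, 100.0),
    (10.02, -5.01, 1.3, 150.0),
    (10.26, -5.02, 4.8, 900.0),
]
raster = bin_events(events, origin=(10.0, -6.0, 0.0, 0.0),
                    bin_size=(0.25, 0.25, 1.0, 1000.0))
```

Only occupied cells are stored, so the logical index is explicit rather
than implied by position in an array.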

While we talk about a raster this does not mean the sampling is uniform in
any physical units.  What is meant by a raster is that a set of photon
values (such as flux or counts) is provided with a logical index.
Conversion from the logical index to a physical four dimensional parameter
space coordinate is the province of the world coordinate system (WCS) and
of the discovery metadata.

Considerable thought has gone into expressing the relationship between
logical indices and world coordinates in the FITS WCS methods.
An important recent development is the proposal to allow lookup tables
as part of the relationship.  This is significant for spectra, particularly
1D projections, because the energy coordinates are sometimes provided
in a lookup table.  In other words, a common form of spectral data held
by data providers is a table of photon fluxes with associated energy.
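
To illustrate the lookup-table idea, here is a minimal sketch in which the
energy of each logical index is stored in a table and fractional indices
are interpolated linearly.  The function name, the 0-based indexing, and
the sample values are hypothetical; the FITS WCS proposal defines its own
keywords and conventions, which are not reproduced here.

```python
def index_to_energy(index, table):
    """Map a (possibly fractional) 0-based logical index to an energy
    by linear interpolation in a lookup table of per-sample energies.
    The table must hold at least two entries."""
    if not 0 <= index <= len(table) - 1:
        raise ValueError("index outside the lookup table")
    # Clamp so the last table entry is reached from the final interval.
    low = min(int(index), len(table) - 2)
    frac = index - low
    return table[low] + frac * (table[low + 1] - table[low])


# Non-uniformly sampled energies (eV) paired with fluxes, in the
# tabular form a data provider might hold.
energy_ev = [1.0, 2.0, 4.0, 8.0]
flux = [0.10, 0.35, 0.20, 0.05]
```

With this table, index 1.5 falls halfway between 2.0 and 4.0 eV, which no
single linear scale and offset could describe for the whole spectrum.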

While the lookup table was conceived for spectra, the concept can be
applied more broadly to the four parameter raster.  What this allows is
sparse sampling from the raster.  This might be relevant to some types
of spectral data where the spatial sampling is sparse and somewhat
random.  An example of this is multiobject spectroscopy where sources
are targeted with fibers.  Whether the instrumentally extracted spectra
should be considered separate rasters for the purposes of VO access
is an interesting point of discussion.


Data Access, Data Models, and Data Formats

A key distinction that needs to be reiterated in discussing data within
the virtual observatory context is between data access/requests, data
models, and data formats.  We raise this here with regard to spectra
because it is easy to end up mixing all three.  The discussion here is focused
on data query and access for spectra.  Discovering and requesting data is
largely independent of the data format which is ultimately provided as the
result of a request.

The SIAP specification mandates certain types of data formats for
retrieval.  The primary science type is FITS, so considering an extension
of SIAP for spectra implies a FITS data format.  There are a variety of
ways spectra can be included in FITS.  This is the subject of the
discussion by Busko, and I can provide a similar proposal for general
spectral formats.  A discussion of the best few
formats for spectra might be diverse at first but I believe it would not be
hard to converge on a few that are FITS based and general enough; keeping
in mind that the vision is access to instrument independent science spectra
and not complex multiplexed data acquisition formats.

A key factor for the science formats is that they include a WCS.  The FITS
WCS, including lookup tables, has been developed to the point that it
provides fairly complete descriptions for images and spectra.  Note that
the non-linear distortion feature is still a proposal by Valdes and
others.


Projections of 4D Parameter Space -- Images and Spectra

This section identifies the obvious projections that constitute
observational images and spectra.  The first assertion is that for such
observational data there is one time value corresponding to a
representative instant in the observation (the start or midpoint).  The
second assertion is that an image is a spectrum with one energy point.
Both the time and energy points have metadata to define the point such as
exposure time and filter bandpass.

Spectra come in several flavors.  First, by definition, these have more
than one sample in energy.  The most complete spectral type is the
so-called data cube.  Data cubes include multiple raster elements in both
celestial coordinates and in energy.  These are generated by radio
spectrometers as well as Fabry-Perot and Integral Field Units at higher
energies.

Slit spectroscopy has been the mainstay of optical astronomy.  Slit spectra
are 2D rasters with one celestial and one energy dimension.  The celestial
dimension requires a higher dimensional WCS in the metadata to convert the
spatial logical index to a curve in two dimensional celestial space.  There
are FITS WCS proposals for how this can be done in a general fashion.

Finally, fiber or spatially integrated spectra have just one point in
the spatial parameters.


Extensions of the SIAP Specification

The conclusion of the discussion presented here is that (raster indexed)
spectral data should be incorporated into the developing VO infrastructure
by avoiding any artificial distinction between images and spectra.
Therefore, one should extend the SIAP specification rather than invent a
new mechanism for spectra.  The concern about having the word "image"
in SIAP is recognized but not discussed here.

This discussion is not intended as a proposal but simply to explore how
spectra might be incorporated within the SIAP specification.  Many details
would have to be worked out.

In a broad review of the SIAP specification it appears that the main
changes required are to restate the purpose to include a more general
concept of "image" as potentially one to four dimensional data formats and
to extend the query and metadata fields appropriately.  In particular, the
discussion of queries would be expanded to a search for data in a given
region of the sky, over a region of photon energies, and over a period of
time.  Then the region of interest (ROI) specified by the POS and SIZE
fields would be extended to include four values rather than two.  As a
matter of detail, special values or definitions might be needed for the
interpretation of missing fields.
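
A sketch of what such a query might look like from the client side is
given below.  Everything beyond the existing POS and SIZE parameter names
is an assumption: the ordering (ra, dec, energy, time), the use of an
empty field to mean "no constraint", the helper name, and the example.org
service URL are all hypothetical.

```python
from urllib.parse import urlencode


def build_query(base_url, pos, size):
    """Build a query URL with four-valued POS and SIZE.

    pos, size : (ra, dec, energy, time) centers and full widths.
    A value of None is sent as an empty field, standing in for
    "no constraint on this axis" (an assumed convention).
    """
    if len(pos) != 4 or len(size) != 4:
        raise ValueError("pos and size must each have four values")

    def encode(values):
        return ",".join("" if v is None else repr(float(v)) for v in values)

    # Keep commas literal so the value lists stay readable in the URL.
    query = urlencode({"POS": encode(pos), "SIZE": encode(size)}, safe=",")
    return f"{base_url}?{query}"


# Half-degree field, 1-unit-wide energy window, any time.
url = build_query("http://example.org/siap",
                  pos=(180.0, -30.0, 2.5, None),
                  size=(0.5, 0.5, 1.0, None))
```

Whether a missing value should mean "any" or "a single default sample" is
exactly the kind of detail that would need to be settled.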

The main question to be resolved is whether and how the query syntax can
select only images or only spectra in the usually understood sense.  One
way might be specifying the number of elements along the energy axis; i.e.
a value of 1 is an image.  But probably a better way to make this common
distinction would be a parameter similar to INTERSECT.  A parameter such as
TYPE with values "IMAGE", "SPECTRUM", or "ANY" would place certain
requirements on the requested data content.  IMAGE would have more than one
element along each spatial dimension and only one element of energy and
time.  SPECTRUM would be data with multiple samples in energy.  Other
choices might restrict the request to 1D, 2D, or 3D spectral data.
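
The TYPE rules just described can be sketched as a simple test on the
number of raster elements along each axis.  The function name and the
(n_ra, n_dec, n_energy, n_time) shape tuple are assumptions for
illustration; only the three TYPE values come from the discussion above.

```python
def matches_type(shape, requested="ANY"):
    """Check a dataset against a requested TYPE.

    shape : (n_ra, n_dec, n_energy, n_time), the number of raster
            elements along each axis of the 4D parameter space.
    """
    n_ra, n_dec, n_energy, n_time = shape
    if requested == "ANY":
        return True
    if requested == "IMAGE":
        # More than one element on each spatial axis, one in energy and time.
        return n_ra > 1 and n_dec > 1 and n_energy == 1 and n_time == 1
    if requested == "SPECTRUM":
        # Any data with multiple samples in energy.
        return n_energy > 1
    raise ValueError(f"unknown TYPE: {requested}")


image = (512, 512, 1, 1)      # direct image
fiber = (1, 1, 2048, 1)       # spatially integrated spectrum
cube = (64, 64, 128, 1)       # data cube
```

Under these rules a data cube satisfies SPECTRUM but not IMAGE, which is
why further choices for 1D, 2D, or 3D spectral data might be wanted.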


Relationship with the SAO/CfA SIAP Proposal

The ideas presented here are similar in many respects to "SIAP Extension
RFC and Draft Specification, Part 1: Quantities and Coordinates" by Steve
Lowe.  The main difference is that the Lowe paper attempts to be more
general.  The approach suggested in this 4D discussion is intermediate.
Images and spectra are treated as projections of an extended
parameter space which can be handled by an extension of the SIAP
methodology.  However, the 4D proposal is not to make the extension too
general and hence complex.  Instead, simply add two parameters to cover
the vast majority of observational parameter space of interest to
astronomers.  Furthermore, adopt something like the degree limitation on
units for celestial coordinates to restrict the energy and time
specifications.  Transformation between units would be done by the
client interface and by the data provider.
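
As an illustration of the client-side transformation, the sketch below
converts wavelengths and frequencies to a single energy unit using
E = h*nu = h*c/lambda.  The choice of eV as the one permitted unit, the
unit labels, and the function names are assumptions; nothing here is
specified by SIAP.

```python
PLANCK_J_S = 6.62607015e-34       # Planck constant, J s
LIGHT_M_S = 299792458.0           # speed of light, m/s
JOULE_PER_EV = 1.602176634e-19    # joules per electron volt


def to_ev(value, unit):
    """Convert a photon energy, frequency, or wavelength to eV."""
    if unit == "eV":
        return value
    if unit == "Hz":
        return PLANCK_J_S * value / JOULE_PER_EV
    if unit == "m":
        return PLANCK_J_S * LIGHT_M_S / (value * JOULE_PER_EV)
    if unit == "Angstrom":
        return to_ev(value * 1e-10, "m")
    raise ValueError(f"unsupported unit: {unit}")


def range_to_ev(low, high, unit):
    """Convert an interval to eV, returning (min, max).

    The sort matters for wavelengths, where the ordering reverses.
    """
    a, b = to_ev(low, unit), to_ev(high, unit)
    return (min(a, b), max(a, b))


# A 4000-7000 Angstrom optical window expressed as an energy interval.
optical_ev = range_to_ev(4000.0, 7000.0, "Angstrom")
```

A client could accept whatever units the astronomer prefers and apply such
a conversion before forming the query.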


Conclusion

This discussion suggests that observational images and spectra be treated
as aspects of a four dimensional observation model.  This assumes the
raw instrumental observations have been converted to raster sampling
along 1 to 4 axes so that spatial multiplexing or other quirks of raw
spectral data are eliminated.  Such data would be accessed by a fairly
simple extension of SIAP to a four dimensional parameter space.  There
might be a new parameter to allow restricting requests to "images" or
"spectra" separately as well as getting information and data about holdings
which include both images and spectra in some (4D) region of interest.


