FITS Serialization Questions for SSAP 0.93

Tue Apr 19 08:41:13 PDT 2005

Hi Randy -

> On Wed, 13 Apr 2005, Randall Thompson wrote:
> We are continuing to work on producing SSAP-compatible FITS files
> for MAST and HST.  A few questions came up in the process.
> 
> 1) Most of our spectral FITS files have general parameters stored in the
> primary headers (e.g., DATE) and I was hoping to leave these intact when
> creating the new versions. Does the SSAP require the specified keywords
> to appear in the extension headers? Obviously some are required there.

In general you can have any number of additional keywords in your dataset
beyond those defined by SSA (or any other DAL interface).  This is true
for VOTable as well.  Software which only understands the standard data
model should ignore them.

If we have a name clash between a data model parameter and a general FITS
keyword, then probably we should change the name of the SSA keyword to
avoid the clash.  This would avoid the problem you mention, where presumably
DATE is used in both places and may mean different things.

All SSA specifies for FITS is the binary table format.  The required
keywords should be in the binary table.  What "in" means, however, may
not be clear in the case of a multi-extension FITS file where keywords
from the primary header can be inherited by an extension.  If the FITS
container uses inheritance, and the software used to extract binary table
extensions can resolve the inheritance and populate the required keywords,
then I don't see any reason why you could not use inheritance within the
FITS files in your archive.  It may be necessary, however, to resolve the
extension keywords before returning an individual spectrum to a VO client.
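
As a rough illustration of what "resolving" inheritance could look like,
here is a minimal Python sketch.  It makes several assumptions that are not
part of SSA: headers are modeled as plain dicts, inheritance is switched on
by an INHERIT keyword in the extension header, and the function name and the
list of structural keywords are made up for the example.  Extension values
take precedence over primary header values.

```python
# Keywords that describe the primary HDU itself; assumed never to be inherited.
STRUCTURAL = {"SIMPLE", "BITPIX", "NAXIS", "EXTEND", "END", "COMMENT", "HISTORY"}


def resolve_inherited(primary, extension):
    """Return a copy of the extension header with inherited keywords filled in.

    Both headers are plain dicts (a stand-in for real FITS headers).
    Keywords already present in the extension are never overwritten.
    """
    resolved = dict(extension)
    if not extension.get("INHERIT", False):
        # The extension did not ask for inheritance; leave it alone.
        return resolved
    for key, value in primary.items():
        if key in STRUCTURAL or key.startswith("NAXIS"):
            continue
        resolved.setdefault(key, value)
    return resolved


primary = {"SIMPLE": True, "BITPIX": 8, "NAXIS": 0,
           "DATE": "2005-04-13", "TELESCOP": "HST"}
extension = {"XTENSION": "BINTABLE", "INHERIT": True, "DATE": "2005-04-19"}
resolved = resolve_inherited(primary, extension)
```

Here the extension keeps its own DATE and picks up TELESCOP from the primary
header, so the table can be returned to a client on its own.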

Note - while you may want to adjust whatever you use for a FITS spectral
data format in your archive, VO assumes that this will undergo a translation
when you return a dataset to a client via a VO interface!  You should
assume that the VO middleware will evolve, so whatever is chosen now
for an archive data format will likely diverge from what is used in the
VO in the future.  On the other hand, the active data model mediation
which occurs at access time makes it easy to resolve the differences,
or to support different over-the-wire formats such as FITS and VOTable.

> 2)  We have at least 3 missions containing echelle spectra. Typically
> each spectral order has a different number of data points. I assume
> the recommended way to store these would be as multiple rows in a binary
> table using variable length arrays.  Is this correct? Do you recommend any
> format which would not require the use of variable length arrays (other
> than writing them to separate files)? I was wondering for example about
> using padded zeroes to store multiple orders in a single binary table,
> or having multiple binary table extensions, one for each spectral order.

We have updated the SSA data model specification for FITS to indicate that
either fixed or variable length arrays may be used.  Variable length arrays
are preferred for cases where a great deal of table space would otherwise be
wasted, for example when the rows of the table are SED segments with wildly
different element lengths (note all segments are stored in the same table).
If the dataset is a simple spectrum or time series, then it probably makes
more sense to use fixed length arrays.
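
One way to judge whether padding is tolerable is to estimate how much of a
fixed length table would be filler.  The Python sketch below is only a
back-of-the-envelope aid; the function name is hypothetical, it counts array
elements rather than bytes, and it ignores the descriptor and heap overhead
that variable length arrays carry.

```python
def padding_waste(order_lengths):
    """Fraction of a fixed-length table that would be padding.

    order_lengths holds the number of data points in each row (for
    example, each echelle order); every row is padded to the longest.
    """
    longest = max(order_lengths)
    padded_total = longest * len(order_lengths)
    used_total = sum(order_lengths)
    return 1.0 - used_total / padded_total


uniform = padding_waste([100, 100, 100])   # all orders the same length
uneven = padding_waste([100, 50])          # one order half as long
```

If the fraction is small, fixed length arrays cost little; if most of the
table would be padding, variable length arrays look more attractive.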

Again, this is for when FITS is used as the *wire* format.  You can do
whatever you want for an archival format.  In general, what an SSA service
should probably do is create an SSA data object in memory from whatever is
stored in the archive, and return this to the user in whatever format the
client requested (within the limits of what is supported by the service).
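
A schematic Python sketch of that pattern follows.  Everything in it is
hypothetical: the Spectrum class is not the SSA data model, the two
serializers are stubs that return a descriptive string rather than real
FITS or VOTable output, and the format names are placeholders.

```python
from dataclasses import dataclass, field


@dataclass
class Spectrum:
    """Hypothetical in-memory data object built from the archive copy."""
    wavelength: list
    flux: list
    metadata: dict = field(default_factory=dict)


def to_fits_stub(spec):
    # Stand-in for a real FITS binary table writer.
    return f"FITS-BINTABLE rows={len(spec.flux)}"


def to_votable_stub(spec):
    # Stand-in for a real VOTable writer.
    return f"VOTABLE rows={len(spec.flux)}"


# Wire formats this (imaginary) service supports.
SERIALIZERS = {"fits": to_fits_stub, "votable": to_votable_stub}


def serve(spec, requested_format):
    """Serialize the in-memory object in the format the client asked for."""
    try:
        serializer = SERIALIZERS[requested_format.lower()]
    except KeyError:
        raise ValueError(f"unsupported format: {requested_format!r}") from None
    return serializer(spec)


spectrum = Spectrum(wavelength=[1200.0, 1201.0], flux=[3.1, 2.9])
reply = serve(spectrum, "FITS")
```

The archival format only matters to the code that builds the Spectrum
object; adding a new wire format means adding one more serializer.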

> 3) I know the SED data model paper is not yet complete, but it would
> be useful to have more information on defining the listed FITS keywords
> (e.g., the curation and WCS entries).

I agree, it is still pretty rough at this point.

	- Doug