FITS Serialization Questions for SSAP 0.93

Tue Apr 19 10:26:35 PDT 2005

Hi Doug,

Doug Tody wrote:

>Hi Randy -
>
>  
>
>>On Wed, 13 Apr 2005, Randall Thompson wrote:
>>We are continuing to work on producing SSAP-compatible FITS files
>>for MAST and HST.  A  few questions came up in the process.
>>
>>1) Most of our spectral FITS files have general parameters stored in the
>>primary headers (e.g., DATE) and I was hoping to leave these intact when
>>creating the new versions. Does the SSAP require the specified keywords
>>to appear in the extension headers? Obviously some are required there.
>>    
>>
>
>In general you can have any number of additional keywords in your dataset
>beyond those defined by SSA (or any other DAL interface).  This is true
>for VOTable as well.  Software which only understands the standard data
>model should ignore them.
>
>If we have a name clash between a data model parameter and a general FITS
>keyword, then probably we should change the name of the SSA keyword to
>avoid the clash.  This would avoid the problem you mention, where presumably
>DATE is used in both places and may mean different things.
>  
>

Yes, my concern was with the common names listed in the SED data model paper
such as DATE, OBJECT, INSTRUME, DATE-OBS, APERTURE, EXPOSURE...
These are typically found as keywords in the primary headers of existing 
FITS spectral files.
Of course we could remove them for the VO-compatible files if it's necessary.

>All SSA specifies for FITS is the binary table format.  The required
>keywords should be in the binary table.  What "in" means, however, may
>not be clear in the case of a multi-extension FITS file where keywords
>from the primary header can be inherited by an extension.  If the FITS
>container uses inheritance, and the software used to extract binary table
>extensions can resolve the inheritance and populate the required keywords,
>then I don't see any reason why you could not use inheritance within the
>FITS files in your archive.  It may be necessary however to resolve the
>extension keywords before returning an individual spectrum to a VO client.
>  
>
>Note - while you may want to adjust whatever you use for a FITS spectral
>data format in your archive, VO assumes that this will undergo a translation
>when you return a dataset to a client via a VO interface!  You should
>assume that the VO middleware will evolve, so whatever is chosen now
>for an archive data format will likely diverge from what is used in the
>VO in the future.  On the other hand, the active data model mediation
>which occurs at access time makes it easy to resolve the differences,
>or support different over-the-wire formats such as FITS and VOTable.
>
>
>  
>
I understand the advantages (and disadvantages) of creating files
"on-the-fly", but I am not sure why this keeps coming up in discussions
of the data model. Why does the VO "assume" a translation will be
required? If we create disk files offline in batch mode, we can still
change the conversion software in the future and recreate the files if
the SSAP changes. Is there something about the SSAP interface that will
"require" data centers to make customized files for the VO?
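
To make the inheritance question concrete, here is a minimal sketch of
the resolution step Doug describes, which could run either offline in a
batch conversion or at access time. It assumes headers are represented
as plain Python dictionaries; the function name, the keyword values, and
the "required" list are hypothetical and are not taken from the SSAP or
SED documents. An extension keyword is assumed to take precedence over a
primary keyword of the same name.

```python
def resolve_inheritance(primary, extension, required):
    """Fill required keywords missing from an extension header.

    primary, extension: dicts mapping keyword name -> value.
    required: keyword names the binary table header must carry
              (an illustrative list, not the official SSAP one).
    Returns (resolved_header, still_missing).
    """
    resolved = dict(extension)  # copy; never modify the archive header
    for key in required:
        # The extension value wins if both headers define the keyword.
        if key not in resolved and key in primary:
            resolved[key] = primary[key]
    still_missing = [key for key in required if key not in resolved]
    return resolved, still_missing


# Illustrative values only.
primary = {"DATE": "2005-04-19", "OBJECT": "TARGET-1", "INSTRUME": "SPECTROGRAPH-A"}
extension = {"XTENSION": "BINTABLE", "DATE": "2005-04-20", "EXPOSURE": 1200.0}
required = ["DATE", "OBJECT", "INSTRUME", "DATE-OBS"]

resolved, still_missing = resolve_inheritance(primary, extension, required)
print(resolved)
print("still missing:", still_missing)
```

Reporting the keywords that remain missing, rather than guessing values
for them, keeps the step safe to rerun whenever the required list
changes.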

>>2)  We have at least 3 missions containing echelle spectra. Typically
>>each spectral order has a different number of data points. I assume
>>the recommended way to store these would be as multiple rows in a binary
>>table using variable length arrays.  Is this correct? Do you recommend any
>>format which would not require the use of variable length arrays (other
>>than writing them to separate files)? I was wondering for example about
>>using padded zeroes to store multiple orders in a single binary table,
>>or having multiple binary table extensions, one for each spectral order.
>>    
>>
>
>We have updated the SSA data model specification for FITS to indicate that
>either fixed or variable length arrays may be used.  Variable length arrays
>are preferred for cases where a great deal of table space would otherwise be
>wasted, for example when the rows of the table are SED segments with wildly
>different element lengths (note all segments are stored in the same table).
>If the dataset is a simple spectrum or time series, then it probably makes
>more sense to use fixed length arrays.
>  
>
I am not sure you explained how we can store echelle spectra without
variable length arrays, unless you are saying that it is not possible.
Is this correct?
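
For illustration, the padded fixed-length alternative mentioned in
question 2 might look like the sketch below. Plain Python lists stand in
for binary table cells; the column names FLUX and NELEM, the zero fill
value, and the helper names are assumptions made for this example and
are not part of the SSA specification. The per-row element count is what
lets a reader tell padding apart from genuine zero-valued data.

```python
def pad_orders(orders, fill=0.0):
    """Pad each spectral order to the length of the longest one.

    orders: list of lists of flux values, one list per echelle order.
    Returns one row per order with a fixed-width FLUX array and the
    true element count in NELEM (hypothetical column names).
    """
    width = max(len(order) for order in orders)
    rows = []
    for order in orders:
        padding = [fill] * (width - len(order))
        rows.append({"NELEM": len(order), "FLUX": list(order) + padding})
    return rows


def unpad(row):
    """Recover the original order by dropping the padding."""
    return row["FLUX"][:row["NELEM"]]


def wasted_cells(rows):
    """Count array cells that hold padding rather than data."""
    return sum(len(row["FLUX"]) - row["NELEM"] for row in rows)


# Three orders with different numbers of data points (made-up values).
orders = [[1.0, 2.0, 3.0], [4.0, 5.0], [6.0]]
rows = pad_orders(orders)
print(rows)
print("wasted cells:", wasted_cells(rows))
```

The wasted-cell count gives a rough way to judge when the table space
lost to padding is large enough to favor variable length arrays.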

>Again, this is for when FITS is used as the *wire* format.  You can do
>whatever you want for an archival format.  In general what a SSA service
>should probably do, is create a SSA data object in memory from whatever is
>stored in the archive, and return this to the user in whatever format the
>client requested (within the limits of what is supported by the service).
>
>  
>
If the SSAP interface is similar to the SIAP interface, then perhaps
you are imagining something like an SSAP cutout service where the user
can specify the wavelength range of the retrieved data? I could see
that this would be useful and would require VO middleware like you
described. But I hope you offer a mode that does not require so much
overhead (as the SIAP does now).

This reminded me of another question (posed by Myron Smith) regarding
splice points. Echelle spectra frequently overlap in wavelength and
it's not always clear how the orders should be merged. If the SED data
model allows overlapping segments, or if we create merged spectra, is
there a provision for how to describe the splice points?
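
As a concrete picture of what would need describing, the sketch below
computes one candidate splice point per pair of adjacent orders. Placing
the splice at the midpoint of the overlap is only an assumption for this
example; it is not a rule from the SED data model, and real pipelines
may choose the point differently (for instance by signal-to-noise). The
function name and wavelength values are hypothetical.

```python
def splice_points(order_ranges):
    """Return a candidate splice wavelength for each adjacent order pair.

    order_ranges: list of (wave_min, wave_max) tuples sorted by
    increasing wavelength. The result has one entry per adjacent pair:
    the midpoint of the overlap, or None if the orders do not overlap.
    """
    points = []
    for (lo1, hi1), (lo2, hi2) in zip(order_ranges, order_ranges[1:]):
        if lo2 < hi1:
            # Overlap spans lo2..hi1; split it down the middle.
            points.append((lo2 + hi1) / 2.0)
        else:
            # A gap between orders: there is nothing to splice.
            points.append(None)
    return points


# Made-up wavelength ranges for three orders.
order_ranges = [(1000.0, 1100.0), (1080.0, 1200.0), (1250.0, 1300.0)]
points = splice_points(order_ranges)
print(points)
```

A list like this, stored alongside a merged spectrum, is one possible
form the requested splice-point description could take.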

>  
>
>>3) I know the SED data model paper is not yet complete, but it would
>>be useful to have more information on defining the listed FITS keywords
>>(e.g., the curation and WCS entries).
>>    
>>
>
>I agree, it is still pretty rough at this point.
>
>	- Doug
>
>  
>

Thanks for all your help.
Randy
