SED FITS Serialisation: multi-extension?

Alberto Micol Alberto.Micol at eso.org
Mon Jun 13 07:16:04 PDT 2005


On Jun 13, 2005, at 14:23, Markus Dolensky wrote:

> Hi Alberto,
>
> Would you mind specifying which parts of either the spectral DM doc ...
>
> http://www.ivoa.net/twiki/bin/view/IVOA/IVOADMSpectraWP
>
Yes. Specifically chapter 8 "Serializations".
Sorry not to have mentioned that earlier.

> or the SSA interface doc. ...
>
> http://www.ivoa.net/internal/IVOA/InterOpMay2005DAL/ssa-v090.pdf
>
> ... triggered particular thoughts?
>
> For instance, what is meant by a "VOTABLE accompanying the SED"?
> There is going to be a VOTable query response, but the serialization 
> can either be an XML or VOTable document or a FITS binary table.

As I see it (tell me if I'm wrong), the SSA client receives back a VOTable
response, which might point to some FITS file for individual segments, or
even for a bunch of segments at once. The VOTable is the "messenger" and I
see it as quite volatile; the associated FITS file instead, containing the
actual data, is to be stored by the end user for subsequent scientific
analysis. And I'm afraid that, as soon as the message is received, the
VOTable will be kindly moved to .Trash, hence leaving the end user with no
idea of which segment had which characteristics; even the Provenance (in DM
terminology) of any segment might get lost.

It is particularly important to remember which reference files were used
for those archives that offer on-the-fly calibration, where the SAME dataset
will yield different (better) products as time goes by.
If the user loses that info, s/he will not be able to know whether a given
product is still the best possible one (the "current" one) for a given
observation.

>
> The scope is 1d spectra and time series. Are you suggesting to expand 
> this for V1.0 of the two docs?

No, I'm not looking for an "expansion", I'm just considering a 
different (let me say "better") serialisation.

>
> Remember, we are trying to serialize a DM. So, are your suggestions 
> aiming at expanding the DM or the way its implemented (serialized)?

The second one.

> > Conclusions: I see only advantages in adopting MEF, am I biased?
>
> Does it mean to give up on serializing a particular DM and to use 
> existing formats instead?

Not at all. We need to agree on a single particular DM, otherwise it would
be a mess. When I say MEF I don't just mean "any MEF". I'm considering
an MEF that contains what the SED DM imposes, but also allows
data-provider-specific info.
Regarding metadata:
  Even the current DM allows for "more keywords" than just the suggested
  standard ones. My idea is to preserve all the metadata that the DM already
  promotes, and *at the same time* preserve all the metadata that the data
  provider has already published. I can see that happening only with a MEF
  (one header per extension, i.e. per segment).
/* Note: SED proposed keywords "shall" not clash with the commonly used ones. */
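
To make the metadata idea concrete, here is a minimal sketch in plain Python
(no FITS library involved) of building one header per segment from the two
keyword sets and refusing any clash. The function name merge_headers and the
VO keyword names VOSEGT and VOCSYS are hypothetical, made up for
illustration; they are not taken from the SED document.

```python
def merge_headers(vo_keywords, provider_header):
    """Return one per-segment header holding both keyword sets.

    Raises ValueError if a VO keyword name is already used by the provider,
    since the two sets are meant to coexist without clashing.
    """
    clashes = sorted(set(vo_keywords) & set(provider_header))
    if clashes:
        raise ValueError("keyword clash: " + ", ".join(clashes))
    merged = dict(provider_header)  # keep everything the provider published
    merged.update(vo_keywords)      # add the keywords required by the DM
    return merged


# VOSEGT and VOCSYS are hypothetical VO keyword names (assumption).
vo = {"VOSEGT": "Spectrum", "VOCSYS": "ICRS"}
# TELESCOP and INSTRUME are standard FITS keywords; values are illustrative.
original = {"TELESCOP": "HST", "INSTRUME": "STIS"}

header = merge_headers(vo, original)
```

With one such header per extension, nothing the provider wrote is dropped,
and a clash is detected when the file is built rather than discovered later
by the user.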

Regarding the data:
  The actual format of the data is NOT to be the original format adopted
  maybe 20 years ago by a data provider; that of course needs to be
  standardised, and the solution currently proposed in SED 0.93 is to use a
  binary table with one segment per row.
  Instead I'm proposing a MEF to allow for more metadata than just the VO
  ones (see above), and to be able to cope with other kinds of data like
  echelle or spectropolarimetry, which are still to be seen as 1d spectra
  but need extra "columns", a concept ruled out by the current SED 0.93.
  That's why I'm suggesting one binary table per segment; a binary table per
  segment allows the data provider to fold into the VO standard all the
  information judged to be useful.
  For example, I can imagine it being useful to associate with the standard
  WAVELENGTH, FLUX and ERROR columns other columns like the "SUBTRACTED
  BACKGROUND" etc. Or, as is the case for spectropolarimetry, to add columns
  to store the Stokes parameters.
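
As a rough illustration of "one table per segment, with extra columns
allowed", the sketch below models a segment table in plain Python as named
scalar columns. It is not FITS I/O and not the SED specification: the class
name SegmentTable, the column names STOKES_Q/U/V and all the numbers are
assumptions chosen for the example.

```python
from dataclasses import dataclass, field

# The standard columns named in the text.
REQUIRED = ("WAVELENGTH", "FLUX", "ERROR")


@dataclass
class SegmentTable:
    """One table per segment, modelled as a dict of equal-length columns."""
    columns: dict = field(default_factory=dict)

    def __post_init__(self):
        missing = [name for name in REQUIRED if name not in self.columns]
        if missing:
            raise ValueError("missing required column(s): " + ", ".join(missing))
        lengths = {len(values) for values in self.columns.values()}
        if len(lengths) > 1:
            raise ValueError("all columns must have the same number of rows")

    @property
    def extra_columns(self):
        """Names of the provider-specific columns, in insertion order."""
        return [name for name in self.columns if name not in REQUIRED]


# A spectropolarimetry segment: standard columns plus Stokes Q, U, V.
# Column names and values are illustrative only.
polarimetry = SegmentTable({
    "WAVELENGTH": [4000.0, 4001.0],
    "FLUX": [1.0e-15, 1.1e-15],
    "ERROR": [1.0e-17, 1.0e-17],
    "STOKES_Q": [0.01, 0.02],
    "STOKES_U": [0.00, 0.01],
    "STOKES_V": [0.00, 0.00],
})
```

A client that only understands the standard columns can ignore
polarimetry.extra_columns, while a specialised tool can use them.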

And again:
> Does it mean to give up on serializing a particular DM and to use 
> existing formats instead?
On the contrary: I am proposing "one format to rule them all".
And in fact:
> BTW, the next step on the roadmap is to unify access to images, 
> spectra and catalogues by means of ADQL.
Also for images we probably need MEF if we want to offer not just the image
but also the accompanying weight maps, data quality, etc.
Hence MEF is good for both imaging and spectroscopy.

> This is just to better understand your comments that you thankfully 
> took the time to put down.

Thanks for having taken the time to read me! :-)

> Cheers,
> Markus
>
Ciao,
Alberto

>
> Alberto Micol wrote:
>> Dear SSA/SEDers,
>> I'd like to comment on the serialisation aspects of the protocol
>> which now states that each segment is one row in a fits binary table.
>> In such serialisation the characterisation is left completely to the 
>> VOTable
>> accompanying the SED, since it becomes impossible to characterise 
>> each and
>> every segment with a single header.
>> That is fine IF the user does not care to know the origins of the 
>> segments.
>> (And someone might claim that such a user is not too careful, to say
>> the least.)
>> My view is that the VO should simplify life of the users in other ways
>> than just stripping off all the information that the data provider, 
>> mostly
>> painfully, put together. :-)
>> My favourite solution would be to adopt a FITS extension for each of 
>> the segments,
>> each extension containing:
>>  -  a header with VO keywords PLUS the original header keywords,
>>  -  a binary table with scalar columns
>> In that way the work of the data provider would be happily 
>> recognised, and
>> the user might be able to find any kind of details regarding any 
>> segment,
>> from the calibration reference files used to calibrate a spectrum down to
>> the acknowledgment sentence sometimes buried in some FITS COMMENT or
>> HISTORY keyword.
>> The multiple extension FITS format would also allow covering the
>> spectropolarimetry case (currently not supported at all), where for each
>> wavelength the Stokes parameters would also be stored in separate scalar
>> columns.
>> Also, I think that the echelle spectra are causing some troubles to
>> the current format. Each of the multi order spectra should probably 
>> end up
>> into its own extension.
>> Conclusions: I see only advantages in adopting MEF, am I biased?
>> Alberto
>> Aside: With such a format, it would then also be easy to build a SED
>> On The Fly, whereby a SED-OTF tool can compose SSAP queries to some
>> selected services and come back with a single multi-extension FITS file:
>> it is just a matter of appending any individually SSAP-returned FITS file
>> to the multi-extension file.
>> (Unless I'm wrong, I don't think that the current serialisation allows
>> such a simple assembly of the FITS files.)
>>
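
A minimal sketch of the assembly described in the aside, again in plain
Python rather than with a FITS library. Each file is modelled as a list of
(header, table) pairs whose first entry stands for the primary HDU; the
function name append_extensions, the EXTNAME values and the table contents
are assumptions for illustration.

```python
def append_extensions(sed_file, returned_file):
    """Append every extension of returned_file to sed_file, in place.

    The first entry of each file is its primary HDU, so only the entries
    after it are copied across.
    """
    sed_file.extend(returned_file[1:])
    return sed_file


# Stand-ins for the files returned by two SSAP services (illustrative).
primary = ({"EXTEND": True}, None)
from_service_a = [primary, ({"EXTNAME": "SEG_A1"}, [[1.0, 2.0]])]
from_service_b = [
    primary,
    ({"EXTNAME": "SEG_B1"}, [[3.0, 4.0]]),
    ({"EXTNAME": "SEG_B2"}, [[5.0, 6.0]]),
]

# The SED built on the fly: one primary HDU plus all returned segments.
sed = [primary]
for response in (from_service_a, from_service_b):
    append_extensions(sed, response)
```

Since every segment travels with its own header, the appended extensions
keep their provenance without any merging of rows or headers.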
Alberto Micol
ST-ECF HST Archive Scientist


