RFC initiated for Simple Spectral Access protocol

Mon Jun 25 03:56:16 PDT 2007

Dear SSAP document authors and DAL chairman,

Here is my analysis of some aspects of the current version of the SSAP
document.

Native data:
The document claims that any native data can be used in the SSAP server
implementation. In the introduction, we can find:

[…]Spectrum datasets may conform to a standard data model defined by
SSA, or may be native spectra with custom project-defined content[…]

[…]Hence spectra may be actively mediated to the standard SSA-defined
data model at access time by the service, so that client analysis
programs do not have to be familiar with the idiosyncratic details of
each data collection to be accessed[…]

However, the document fails to explain how this mediation can be done
for native data. Below are some examples of why the document cannot
serve as a protocol specification on this point:

- In the current SSAP spec, the only way to ensure that “analysis
programs do not have to be familiar with the idiosyncratic details of
each data collection” is through the output field Dataset.DataModel.

Apart from the examples (Spectrum 1.0, HST-STIS-1.0,…) there is no
explanation of how this can work. What does it mean? Do I have to
publish the data model somewhere to describe my data? What is the format
of this data model for data description? How is the mediation done for
the client?

If there is no standard for that, a VO client can handle new data
formats in only two ways. The first is human interaction: the user makes
a selection on the display, which sometimes requires knowledge of the
data format and is not very efficient if you want to load many spectra
from different sources or if you are using a non-interactive
application. The second is to check the files distributed every time a
new SSA server is registered and to write a special parser for each
file format (and yes, on many occasions a server offers more than one
data format).

This last method (special parsers per server) is how things worked
before the VO era, and it is something we should avoid.

- When Pedro and I designed and implemented the first spectral access
server in the IVOA, we immediately realized that the self-description of
the data in the SSA response was crucial. We decided to add some fields
that support spectra in tabular format, so that the application can deal
with the data in a basic way without human interaction. We started with
tabular spectra because that form is the most standard and allows an
easy axis characterization.

A couple of fields that the present SSA services use to characterize
the data model for native data automatically, namely

Dataset.SSA.SpectralAxis
Dataset.SSA.FluxAxis

have disappeared (note that this idea can easily be extended to time,
errors, etc.). These fields were present in version 0.91 but are gone
now.

The stated reason for their removal is the following: these fields are
only useful for spectra in “tabular form”, so what this means cannot be
defined.
I must remark here that these fields can be used not only for the FITS
binary/ASCII table format but also, as you should know, for spectra in
VOTable format. In fact, Will O'Mullane developed an SSAP service for
SDSS data (one of the first available in VOSpec), and all the
theoretical spectral servers use VOTable format. For this format, the
fields give the names of the columns where the spectral coordinate
and the flux are stored. Unfortunately, it looks like the SSA service
has no longer been maintained at JHU since Will's departure.
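
As an illustration of how a client could use the two field values for a
spectrum in VOTable format, here is a minimal sketch in Python. It is
not taken from any existing service or client: the column names WAVE and
FLUX, the sample values, the function name extract_axes and the
simplified VOTable (written without an XML namespace) are all
assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified VOTable (no XML namespace) standing in for a
# native tabular spectrum returned by a server.
NATIVE_SPECTRUM = """<VOTABLE>
  <RESOURCE>
    <TABLE>
      <FIELD name="WAVE" datatype="double" unit="Angstrom"/>
      <FIELD name="FLUX" datatype="double" unit="erg/cm2/s/A"/>
      <FIELD name="QUALITY" datatype="int"/>
      <DATA><TABLEDATA>
        <TR><TD>4000.0</TD><TD>1.2e-15</TD><TD>0</TD></TR>
        <TR><TD>4001.0</TD><TD>1.3e-15</TD><TD>0</TD></TR>
      </TABLEDATA></DATA>
    </TABLE>
  </RESOURCE>
</VOTABLE>"""


def extract_axes(votable_xml, spectral_axis, flux_axis):
    """Return (spectral, flux) value lists for the two named columns."""
    table = ET.fromstring(votable_xml).find(".//TABLE")
    names = [field.get("name") for field in table.findall("FIELD")]
    # list.index raises ValueError if a named column is absent.
    i_spec = names.index(spectral_axis)
    i_flux = names.index(flux_axis)
    spectral, flux = [], []
    for row in table.iter("TR"):
        cells = [td.text for td in row.findall("TD")]
        spectral.append(float(cells[i_spec]))
        flux.append(float(cells[i_flux]))
    return spectral, flux


# The two names play the role of the values a client would read from
# Dataset.SSA.SpectralAxis and Dataset.SSA.FluxAxis in the query response.
wave, flux = extract_axes(NATIVE_SPECTRUM, "WAVE", "FLUX")
```

With the two column names known, the client needs no per-server parser
and no user selection to find the spectral coordinate and the flux.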

- Of course, these fields are not a solution for FITS 1-D image spectra,
GIFs or other, more exotic formats, but the problem is not the fields
themselves. The problem lies in the protocol specification, which allows
the use of any kind of native format but does not explain how such a
format can or should be consumed by clients.

With these fields, we support tabular spectra; without them, and without
any alternative, we do not support any native format.

Remember that a protocol specification should be a guide for both server
and client developers to implement the standards correctly.

If a new section about native data is added, it can describe when and
how these fields can be used, and further fields could be added for
other typical spectral formats (like 1-D images). This work is for the
document authors to do and is not present now. I think this kind of
information is a “must” for accepting the protocol spec.

I must point out that this kind of easy access to the description of the
native data model allows the implementation of quite general VO
applications.

- The only recommendation in the document about data format is VOTable.
I do not think this is the most important message to send. The
important one is that it is much better to use a format from which an
application can extract the information, e.g., do not use graphics
files. I think the format itself is not important if the servers can
specify in the output how to parse their native format (like tabular
FITS or VOTable). Of course, it is quite difficult to do much science
using, e.g., GIFs.

Apart from these open issues for native data, there are other points to
be corrected:

- There is no description of the FORMAT=METADATA paradigm. I know the
idea is to replace it in the future, but we agreed to use it for the
time being. For the document to be accepted as a recommendation, it
should be self-consistent. This point is particularly important for the
theoretical services.
- The POS and SIZE input parameters still remain a MUST. This
contradicts the inclusion of theoretical spectra in the spec. This has
been pointed out before, and the answer was that implementing a
parameter does not mean that the parameter makes sense for your data. I
think this is quite bizarre and it should be clarified (or the
parameters should no longer be mandatory).

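The two query-parameter points above can be made concrete with a small
sketch in Python. It is only an illustration: the base URL is
hypothetical, the coordinate values are made up, and the
REQUEST=queryData parameter is an assumption that does not appear in
this message. Only FORMAT=METADATA, POS and SIZE are taken from the
text.

```python
from urllib.parse import urlencode

# Hypothetical service endpoint, for illustration only.
BASE_URL = "http://example.org/ssa"


def build_query(base_url, **params):
    """Append URL-encoded parameters to a service base URL."""
    return base_url + "?" + urlencode(params)


# Metadata discovery using the FORMAT=METADATA convention.
# REQUEST=queryData is an assumed parameter, not quoted from this message.
metadata_url = build_query(BASE_URL, REQUEST="queryData", FORMAT="METADATA")

# A positional query with POS and SIZE. The coordinate values are made up.
observed_url = build_query(
    BASE_URL, REQUEST="queryData", POS="180.0,-30.0", SIZE="0.1"
)
```

A theoretical service has no sky position to compare with POS and SIZE,
which is why a mandatory positional query is hard to interpret for it.
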
Taking into account the aforementioned problems, I think the current
version cannot be accepted as a recommendation. As I have not been
invited to take part in the collaborative effort to write the document,
I give my judgment as a DAL working group member, author of the first
spectral access specification, implementor of the first spectral access
in the IVOA (ISO), author and implementor of the inclusion of
theoretical spectral services in the VO, VOSpec developer, first author
of the SLAP specification, ESA Science Archives Data Access Group Leader
and IPDA (International Planetary Data Alliance) Interoperability
Project Manager.

I know the IVOA needs to have standards as soon as possible, but if the
problems mentioned are not solved, the current state of the
specification could represent a step backwards for this international
project, in particular the loss of functionality in 18 SSA + 9 TSAP
services. In any case, all the problems described are easy to fix.

Best Regards,
-- 
Jesus J. Salgado

ESAC Science Archive Team
e-mail: Jesus.Salgado at sciops.esa.int
Tel + 34 91 8131271

European Space Agency/European Space Astronomy Centre
VILLAFRANCA Satellites Tracking Station
P.O. Box 78
E-28691 Villanueva de la Cañada
MADRID - SPAIN