RFC initiated for Simple Spectral Access protocol

Robert Hanisch hanisch at stsci.edu
Mon Jun 25 14:45:06 PDT 2007


A bit on the more complicated issues...

On 6/25/07 6:56 AM, "Jesus Salgado" <Jesus.Salgado at sciops.esa.int> wrote:

> Dear SSAP document authors and DAL chairman,
> 
> Here you can find my analysis of some of the aspects of the current SSAP
> document version.
> 
> Native data:
> The document claims that any native data can be used in the SSAP server
> implementation. In the introduction, we can find:
> 
> [...] Spectrum datasets may conform to a standard data model defined by
> SSA, or may be native spectra with custom project-defined content [...]
> 
> [...] Hence spectra may be actively mediated to the standard SSA-defined
> data model at access time by the service, so that client analysis
> programs do not have to be familiar with the idiosyncratic details of
> each data collection to be accessed [...]
> 
> However, the document fails to explain how this mediation can be done
> using native data. I explain here some examples of why the document
> cannot be used as a protocol specification for this issue:

I think it is beyond scope for the SSAP document to describe all possible
ways that native spectral data might be mapped onto the spectral DM, and
thus onto the SSAP.  The point of allowing access to data in its native
format is to make it as easy as possible for a provider to provide
_something_.  That native data may or may not be understood by a client.
Users might be completely happy working with native format data with which
they are already familiar.

What the document does, and was intended to do, is to provide a
specification for representing spectra -- following the spectral DM -- so
that a provider who maps their spectra into this specification can then
provide users and client programs with data that can be immediately compared
to other data.  Providers can do this mapping on-the-fly, in advance, or not
at all.  It is up to them.  How they do it is also up to them.

> - The only way to allow "analysis programs" to "do not have to be
> familiar with the idiosyncratic details of each data collection" in the
> current SSAP spec is through the use of the output field
> Dataset.DataModel.
> 
> Apart from the examples (Spectrum 1.0, HST-STIS-1.0, ...) there is no
> explanation of how this can work. What does it mean? Do I have to
> publish somewhere the data model to describe my data? What is the format
> of this data model for data description? How is the mediation done for
> the client?

My understanding is that the mediation is done by the service, not the
client.  I suppose at some point in the future we might have a way for a
service provider to simply publish some sort of mapping document, which when
parsed by a client would allow that client to dynamically do the mediation.
Sort of like every speaker of a foreign language offering a listener a
babelfish for that language.  But I thought, for now at least, that we are
defining a way for speakers to provide a universal translation -- an
Esperanto version of a spectrum.   Few or none of us may actually speak
Esperanto, but all will have access to the rules and be able to parse it.

> If there is no standard for that, the only way for a VO client to handle
> new data formats is, either by human interaction (that implies a
> selection by the user on the display, which, sometimes, requires human
> knowledge of the data format and it is not very efficient if you want to
> load many spectra from different sources or if you are using a non-
> interactive application), or by checking the files distributed every
> time a new SSA server is registered and creating a special parser for
> all the file formats (and yes, ... in many occasions you can have more than
> one data format per server)
> 
> This last method (special parsers by server) is how it worked before the
> VO era, and this is something we should prevent.

Agreed.  But SSAP puts the burden on the data provider, not on the client.
And if a provider decides to only deliver its native format, then their data
service may be only of marginal use.
 
> - When Pedro and I designed/implemented the first spectral access server
> in the IVOA, we immediately realized that the self-description of the
> data in the SSA response was crucial. We decided to add some fields that
> give support to spectra in tabular format so the application can deal
> with that data in a basic way without human interaction. We started with
> spectra in tabular data because it is the most standard and it allows an
> easy axis characterization.
> 
> A couple of fields that are used in the present SSA services to
> characterize, in an automatic way, the data model for native data,
> i.e., 
> Dataset.SSA.SpectralAxis,
> Dataset.SSA.FluxAxis
> 
> have disappeared (note that this idea can be easily extended to time,
> errors, etc). These fields were present in version 0.91 but not now.

Aren't these the same as Char.FluxAxis.Ucd and Char.SpectralAxis.Ucd?
Section 4.2.6.6.  Isn't it better to locate this information via UCD than by
column name?  There is perhaps some more subtle point that I am missing
here.
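
To make the UCD point concrete, here is a minimal sketch of a client
locating the spectral and flux columns by UCD instead of by column name.
Everything in it is an assumption for illustration: the namespace-free
VOTable fragment, the column names, the UCD strings, and the helper
function names are hypothetical and are not taken from the SSAP document.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified VOTable fragment (no namespace). The column
# names and UCD strings are illustrative, not values mandated by SSAP.
VOTABLE = """<VOTABLE><RESOURCE><TABLE>
<FIELD name="WAVE" ucd="em.wl" datatype="double" unit="Angstrom"/>
<FIELD name="FLUX" ucd="phot.flux.density;em.wl" datatype="double"/>
<DATA><TABLEDATA>
<TR><TD>4000.0</TD><TD>1.2e-16</TD></TR>
<TR><TD>4001.0</TD><TD>1.3e-16</TD></TR>
</TABLEDATA></DATA>
</TABLE></RESOURCE></VOTABLE>"""


def column_index_by_ucd(table, ucd_prefix):
    """Return the index of the first FIELD whose ucd starts with ucd_prefix."""
    for i, field in enumerate(table.findall("FIELD")):
        if field.get("ucd", "").startswith(ucd_prefix):
            return i
    return None  # no column carries that UCD


def read_column(table, index):
    """Read one column of a TABLEDATA table as a list of floats."""
    return [float(tr.findall("TD")[index].text) for tr in table.iter("TR")]


table = ET.fromstring(VOTABLE).find(".//TABLE")
# The client never looks at the names "WAVE" or "FLUX", only at the UCDs.
spectral = read_column(table, column_index_by_ucd(table, "em.wl"))
flux = read_column(table, column_index_by_ucd(table, "phot.flux"))
```

With this approach a provider could rename its columns freely and the
client would still find them, as long as the UCDs are present.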

> The reason for this to disappear is claimed to be the following: these
> fields are only useful for spectra in "tabular form" so it cannot
> be defined what this means.
> I have to remark here that these fields can be used both for FITS
> binary/ascii table format and, as you should know, for spectra in
> VOTable format. In fact, Will O'Mullane developed a SSAP service for
> SDSS data (that was one of the first present in VOSpec), and all the
> theoretical spectral servers are in VOTable format. For this format, the
> fields represent the names of the columns where the spectral coordinate
> and the flux are stored. Unfortunately, it looks like the SSA service is
> no longer maintained at JHU after Will's departure.

Spectrum services are quite alive at JHU, under the care of Tamas Budavari
and Laszlo Dobos.  They have just been working on retooling to support the
proposed SSAP.  They did not try to chase the moving target of the pre-SSA
specification.

> - Of course, these fields are not a solution for FITS 1-D image spectra,
> gifs or other possible more exotic format, but the problem is not the
> fields themselves. We should look for the problem in the protocol
> specification that allows the use of any kind of native format, but it
> does not explain how this format can/should be consumed in clients.
> 
> With these fields, we give support to tabular spectra, ... without them and
> without any other alternative, we do not give support to any native
> format.
> 
> Remember that a protocol specification should be a guide for both server
> and client developers to implement correctly the standards.
> 
> If a new section about native data is added, it can be described
> when/how these fields can be used and some other fields could be added
> for other typical spectra formats (like 1-D image). This is a work to be
> done by the document authors that is not present now. I think this kind
> of info is a "must" to accept the protocol spec.
> 
> I have to recall that this kind of easy access to native data model
> description allows the implementation of quite general VO applications.
> 
> - The only recommendation present in the document about data format is
> VOTable. I think this is not the most important message to be sent. The
> important one is, it is a lot better to use a format that can be used by
> an application to extract the information, e.g., do not use graphics
> files. I think it is not important the format itself if the servers can
> specify in the output how to parse their native format (like a tabular
> FITS or VOTable). Of course, it is quite difficult to do much science
> using, e.g., gifs.
> 
> Apart from these open issues for native data, there are other points to
> be corrected:
> 
> - There is not any special description of the FORMAT=METADATA paradigm.
> I know the idea is to replace it in the future, but we agreed to use it
> for the time being. To accept a recommendation status for the document,
> the document should be self-consistent. This point is particularly
> important for the theoretical services.
> - POS and SIZE input parameters still remain as a MUST. This is
> contradictory with the inclusion of theoretical spectra in the spec.
> This has been remarked before and the answer was that implementing a
> parameter does not mean that this parameter makes sense for your data. I
> think this is quite bizarre and it should be clarified (or the
> mandatory status of the parameter removed)
> 
> I think, taking into account the aforementioned problems, the current
> version cannot be accepted as recommendation. As I have not been invited
> to be part of the collaboration effort to write the document, I base my
> judgement as a DAL working group member, author of the first spectral access
> specification, implementor of the first spectral access in the IVOA
> (ISO), author and implementor of the inclusion of spectral theoretical
> services in the VO, VOSpec developer, first author of the SLAP
> specification, ESA Science Archives Data Access Group Leader and IPDA
> (International Planetary Data Alliance) Interoperability Project
> Manager.
> 
> I know the IVOA needs to have standards as soon as possible, but if the
> problems mentioned are not solved, the current state of the
> specification could represent a step back for this international
> project, in particular losing functionalities of 18 SSA + 9 TSAP
> services. In any case, all the problems described are easy to fix.

IVOA has NO SSA standard at this point.  It is hard for me to understand
how, given all the work that has gone into SSAP, the extent of the
metadata, the completeness of the data model, etc., it could possibly be
conceived of as a step backward.  The existing pre-SSA and TSAP services
have to be seen as prototypes, and I hope that their developers understand
the need for migration to the SSAP.   The RegistryWG has been discussing how
to allow the pre-SSA services to co-exist for a period of transition.

Bob




