Spectrum data model

Doug Tody dtody at nrao.edu
Wed Sep 13 11:09:37 PDT 2006


Hi All -

There have been various discussions on the data model / file format
issue, but to keep it simple I will respond to Mark's original message.

On Wed, 13 Sep 2006, Mark Taylor wrote:

> [...] this relates to a point that I raised with Markus Dolensky last
> week and he forwarded to Doug concerning SSAP and serialization
> formats.  Since it's come up here, I'll shove my oar in.
>
> My problem is that the information returned from an SSAP query
> gives the serialization MIME type, but no more - as you point out
> above, the fact that a spectrum is encoded as FITS could cover any
> number of specific serialization formats.  So a client trying
> to make sense of a spectrum returned from SSAP, which only has
> the MIME type got from the Access.Format response field to tell
> it what kind of data is at the other end of the Access.Reference,
> has an unnecessarily difficult job, in that it really has to
> examine the data itself to work out what the serialization format is
> (and in doing that it may end up downloading a large data file only
> to find out that it is in a format that it can't understand).
>
> Possibly the intention is that an SSAP Access.Format of application/fits
> means the data is in the FITS format defined in the Spectral DM
> document (ditto for application/x-votable+xml, application/xml),
> but I can't see this stated explicitly anywhere.
>
> Otherwise, it seems to me that what is called for is an additional
> field in the SSAP response which names the specific serialization
> format, if known.  This would require assigning some sort of name
> to the XML, FITS and VOTable formats defined in the Spectral DM
> document (presumably a URI of some sort).

This is primarily a query matter whereas Spectrum is a dataset data
model, hence we are getting into issues here which aren't addressed by
the Spectrum model alone.

We distinguish between the data model and the data format or
serialization.  Both are described in the query response.  Since the
same data object, conformant to the Spectrum data model, may be viewed
via various formats/serializations, it is not clear whether the data
model itself should specify the serialization; my view has always been
that this is best done externally, e.g., in the access protocol.

What we currently have in the access protocol in this area:

	Dataset.Type		# Spectrum, TimeSeries, etc.
	Dataset.DataModel	# Data model, e.g., "Spectrum V1.0"
	Access.Format		# File format (MIME type)

If the DataModel is "Spectrum" then we have a fully VO-compliant dataset.
(Yes, services will need to perform a conversion on the fly to return
a dataset compliant with the VO Spectrum data model.)

If instead the service returns native project data (typically different
for every data collection/mission/instrument) then Dataset.DataModel
should identify the specific project data model for the data to be
returned.  This is the "pass-through" mechanism for accessing native
project data via an SSA query interface.  An application doesn't have to
scan the data file to determine what it contains; this is specified
directly by the dataset Type and DataModel.

The data format or serialization is (in principle at least) independent
of the data model.  This is true for Spectrum but in general will not be
true for native project data, where there is typically only one format.
Currently, the file format is specified by its MIME type.
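
To make the above concrete, here is a minimal sketch of how a client
might decide what to do with a query response row using only the three
fields listed earlier, without fetching the data.  The class and function
names, the set of formats the client can read, the native model name
"ExampleMission V2", and the prefix match on "Spectrum" are all
assumptions made for illustration; none of them is defined by SSAP or the
Spectrum data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SsaRecord:
    """One query response row, reduced to the three fields discussed above."""
    dataset_type: str   # Dataset.Type, e.g. "Spectrum" or "TimeSeries"
    data_model: str     # Dataset.DataModel, e.g. "Spectrum V1.0"
    access_format: str  # Access.Format, a MIME type


# Hypothetical client capabilities (assumptions for this sketch only).
SUPPORTED_FORMATS = {"application/fits", "application/x-votable+xml"}
NATIVE_MODELS = {"ExampleMission V2"}  # made-up native project data model


def classify(record: SsaRecord) -> str:
    """Return "vo-spectrum", "native", or "skip" from metadata alone."""
    # The MIME type says whether the container can be parsed at all.
    if record.access_format not in SUPPORTED_FORMATS:
        return "skip"
    # Assumption: a VO-compliant dataset names a model starting with "Spectrum".
    if record.dataset_type == "Spectrum" and record.data_model.startswith("Spectrum"):
        return "vo-spectrum"
    # Pass-through case: native project data the client has a reader for.
    if record.data_model in NATIVE_MODELS:
        return "native"
    return "skip"
```

With this, a row carrying Dataset.DataModel "Spectrum V1.0" and
Access.Format application/fits is handled as a VO Spectrum, while a row
naming an unknown project data model is skipped before any download.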

	- Doug


