Putting the pieces together...

Jonathan McDowell jcm at head.cfa.harvard.edu
Sun May 16 18:51:23 PDT 2004


Tom,
 I've tried to go through your scenario to spot where
the observation and spectrum data models play in.
The following are my own opinions and do not represent the
official policy of the IVOA DM group!
 - Jonathan


: Scenario:

: Step 3.
: 
: VSPlot queries the potential matching services one at a time to
: get links to candidate spectral data using the SSA protocol.
: 
:    Issue 3. Need definition of the SSA protocol.

We're working on that! Hopefully imminent.

: 
: Step 5.
: 
: VSPlot determines if the file supports the Spectrum data model.
: If the file does not support this data model it is discarded.
: 
:    Issue 5.A:  How do we find out if a data element supports a given
:    data model?

I propose the following:
 You want to know if a data element supports a given data model M.
 Any given serialization is done in terms of a particular data model, N.
 N is identified in the serialization by the highest-level XML tag (raw XML)
 or UTYPE (in VOTABLE).
 There are three possibilities:
  - N = M: N is the model you are looking for.
  - There is a standard VO method to convert N to M  (e.g. event list to image;
    or image pixel to 1-point spectrum);
    apply it. In other words, there is a constructor for the M object
    which takes N as input.
  - There is no such method; the data does not support M.

The big problem is the middle one: how do you find the N to M method?
At this stage in the VO, there are no such methods and you better hope N=M.
Eventually we will need a service with DM schemata and standard methods
which will do these conversions.
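
A minimal sketch of the three cases above, in Python. Everything here is
hypothetical: the CONVERTERS registry, the model names, and the dict-based
representation of a data element are assumptions for illustration, since no
standard N-to-M methods exist yet.

```python
from typing import Callable, Dict, Optional, Tuple

# Hypothetical registry of "standard VO methods":
# (source model N, target model M) -> converter function.
# Nothing like this exists yet; the single entry below is illustrative only.
CONVERTERS: Dict[Tuple[str, str], Callable[[dict], dict]] = {}


def register(source: str, target: str, func: Callable[[dict], dict]) -> None:
    """Record a converter from model `source` to model `target`."""
    CONVERTERS[(source, target)] = func


def as_model(data: dict, wanted: str) -> Optional[dict]:
    """Return `data` expressed in model `wanted`, or None if unsupported."""
    native = data["model"]          # N, as read from the top-level tag or UTYPE
    if native == wanted:            # case 1: N = M
        return data
    convert = CONVERTERS.get((native, wanted))
    if convert is not None:         # case 2: a constructor for M that takes N
        return convert(data)
    return None                     # case 3: the data does not support M


def image_pixel_to_spectrum(pixel: dict) -> dict:
    """View a single image pixel as a 1-point spectrum."""
    return {"model": "Spectrum",
            "points": [(pixel["wavelength"], pixel["value"])]}


register("ImagePixel", "Spectrum", image_pixel_to_spectrum)
```

With only this one converter registered, an ImagePixel element can be
turned into a Spectrum, while an EventList element yields None and would
be discarded in Step 5.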

:    Is it required that any file returned by the SSA
:    support the Spectrum data model?  

I would say yes. 

:    If so where do we put the mapping
:    between service types and the data models that the returned
:    data is going to support?

It should presumably be in the registry of services.

:    Issue 5.B: Is there some list of the potential data models that
:    any file might support?

That's related to the first part of 5.A, certainly.

: 
: Step 6.
: 
: VSPlot looks for frame information for this file to confirm
: that it is a spectrum at the appropriate location and in the appropriate
: spectral regime for further processing.
: 
:    Issue 6.A For a FITS file I know how to do this.  I'm much less
:    clear how to do this for arbitrary data returned by an SSA service.
:    Is there a standard method associated with the Spectrum data
:    model that enables me to find this out?  Basically we're asking
:    how we discover the STC information for a given dataset and the
:    comparable spectral info.

Exactly, that's one of the main parts of the Spectrum model (and
the Observation model). You ask the Characterization.
For Spectrum we have, so far, scoped out attributes rather than
methods, on the grounds that exchanging compatible serializations
is the first challenge. Once you have constructed a Spectrum/SED
object, you can inspect its Characterization to see if it
has spectral characterization data, and if so what the spectral
coordinate is and what the bounds on the spectral axis are.
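
As a rough illustration of inspecting the Characterization, here is a
Python sketch. The class and attribute names (Spectrum,
AxisCharacterization, the "spectral" key) are assumptions made for the
example and are not taken from the model documents.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class AxisCharacterization:
    coordinate: str               # e.g. "wavelength", "frequency", "energy"
    unit: str
    bounds: Tuple[float, float]   # (low, high) along this axis


@dataclass
class Spectrum:
    # Characterization keyed by axis name; the "spectral" entry may be absent.
    characterization: Dict[str, AxisCharacterization] = field(default_factory=dict)


def spectral_coverage(spec: Spectrum) -> Optional[AxisCharacterization]:
    """Return the spectral-axis characterization, or None if there is none."""
    return spec.characterization.get("spectral")


def overlaps(spec: Spectrum, low: float, high: float) -> bool:
    """True if the spectral bounds overlap [low, high] (same units assumed)."""
    axis = spectral_coverage(spec)
    if axis is None:
        return False
    return axis.bounds[0] <= high and axis.bounds[1] >= low
```

A client such as VSPlot could call overlaps() to decide whether a
candidate is in the requested spectral regime before reading any data.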

:    Issue 6.B Is coverage information (spatial and spectral) required to be
:    in a standard format?  If so what is that format?  If not do we have
:    standard conversion services or is it the responsibility of the application
:    to convert?

For SSAP we are proposing to support at least three serializations:
(1) an XML document instance structured according to a defined schema;
(2) a VOTABLE with certain required fields and appropriate UTYPEs,
also to be defined in the SSAP document; and (3) a newly standardized
FITS binary table serialization of the VO spectrum model.

The hope is that we will provide standard conversion software between
these three formats so that a data provider or an interpreting application
doesn't have to do more than one.

: Step 7.
: 
: VSPlot iteratively uses the standard (in this scenario) getNextElement method defined
: in the spectrum data model to extract data from the file.
: 
:    Issue 7.A  How do we use the data model in real code?  Is the
:    data model associated with a set of Java classes that we can
:    invoke on the data?  If the data model is more than documentation
:    we need to be able to instantiate behavior in some TBD way.


In the first instance the DM is documentation. However the
XML schema representation implies a simple set of Java classes.
The extent to which we standardize a set of Java methods will
become clearer with experience.

:    How do we preserve language independence? (Or do we?)

By defining the model at the UML level. But I'm not sure if
the methods need to be language independent in detail.

:    Issue 7.B Does the data model describe behavior that is defined
:    for the data element or does it indicate that the data is convertible
:    to some fiducial form?

The latter. 

:  If the latter who is responsible for the conversion?

The data provider in the first instance. Initially I think only
the metadata will be converted to the fiducial form for most
archives, although in SSAP we are expecting to be more demanding
and require conversion of the data to one of several standard serializations
(although a simple ASCII table may be one of the approved ones).

: Step 8.
: 
: The user had indicated that they wanted the spectrum to be flux versus
: wavelength.  VSPlot needs to see if it can convert the data extracted
: from the file into those units.  VSPlot looks at the UCDs and Units
: associated with the spectra.  It converts columns to the desired
: units where possible.  Spectra where the data are not convertible
: are discarded.
: 
:    Issue 8.A. How does VSPlot know which column to look at as the flux-like
:    column and which as the wavelength-like column?  It could look through
:    a list of potential UCDs, or UTYPEs could be invoked here.   Could
:    the UCD and UTYPE seem to conflict?

The UTYPE immediately specifies which column is the Flux and
which is the SpectralCoordinate. The UCDs of those columns then
determine what kind of flux and what kind of spectral coordinate.
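
A sketch of that lookup in Python, using only the standard library XML
parser. The VOTable fragment, the utype strings ("spec:Flux",
"spec:SpectralCoordinate"), and the ucd values are placeholders for
illustration; the real strings are still to be defined.

```python
import xml.etree.ElementTree as ET

# Illustrative VOTable fragment. The utype and ucd strings are placeholders,
# not values taken from any published specification.
VOTABLE = """
<VOTABLE>
  <RESOURCE>
    <TABLE>
      <FIELD name="wave" utype="spec:SpectralCoordinate" ucd="em.wl" unit="Angstrom"/>
      <FIELD name="flam" utype="spec:Flux" ucd="phot.flux.density" unit="erg/s/cm2/Angstrom"/>
      <FIELD name="quality" ucd="meta.code.qual"/>
    </TABLE>
  </RESOURCE>
</VOTABLE>
"""


def find_column(root, utype):
    """Return (index, ucd, unit) of the first FIELD with this utype, else None."""
    for index, fld in enumerate(root.iter("FIELD")):
        if fld.get("utype") == utype:
            return index, fld.get("ucd"), fld.get("unit")
    return None


root = ET.fromstring(VOTABLE)
flux = find_column(root, "spec:Flux")                  # which column is the flux
coord = find_column(root, "spec:SpectralCoordinate")   # which is the spectral axis
```

The UTYPE picks out the column; the ucd and unit returned with it then
say what kind of flux or spectral coordinate it holds.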

:    Issue 8.B.  How do we do the transformations? Is this VSPlot's responsibility
:    or do we support standard VO transformation services.

For now, it's VSPlot's problem: given the UCDs and Units, do the work.
This particular problem is so ubiquitous that we should later implement
standard transformation software. (Let's say software rather than services;
I don't see invoking a web service for every wavelength-to-frequency
conversion). 
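
For illustration, the kind of local routine meant here could look like the
following Python sketch. The function names are made up for the example,
and wavelengths are assumed to be vacuum wavelengths in Angstrom.

```python
C_M_PER_S = 299_792_458.0      # speed of light in vacuum (exact, SI)
ANGSTROM_IN_M = 1e-10          # one Angstrom in metres


def wavelength_to_frequency(wavelength_angstrom):
    """Vacuum wavelength in Angstrom -> frequency in Hz (nu = c / lambda)."""
    return C_M_PER_S / (wavelength_angstrom * ANGSTROM_IN_M)


def flam_to_fnu(f_lambda, wavelength_angstrom):
    """Flux density per unit wavelength -> flux density per unit frequency.

    F_nu = F_lambda * lambda^2 / c. With F_lambda in erg/s/cm^2/Angstrom
    and lambda in Angstrom, the result is in erg/s/cm^2/Hz.
    """
    c_angstrom_per_s = C_M_PER_S / ANGSTROM_IN_M
    return f_lambda * wavelength_angstrom ** 2 / c_angstrom_per_s
```

Both are a line of arithmetic each, which is the argument for shipping
them as a library rather than calling out to a service.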


:    Issue 8.C.  This is a hard step.  How does VSPlot know enough to distinguish
:    between raw and background subtracted spectra and the myriad details like that?
:    Is this a characteristic of the flux column or of the entire spectral file?
:    This seems to be where all of the discussion of measurements and quantities
:    needs to provide some benefit to the user.

That particular detail should be a property of the flux column.
What kind of flux am I? I would prefer it if UCDs could carry that
load, but I know many UCD pundits don't want them to be that precise.

In the Observation model, it will also be in the Processing info in the
Provenance: what has been done to this data? But I think we shouldn't
have to parse all of the processing pipeline to be able to answer whether
or not the data are background subtracted. That needs to be attached
to the Flux itself, in the UCD or in another piece of metadata very
similar to a UCD.

There will also be a bit in the Observation model, I hope, to
find the Background Model associated with the spectrum (which will
usually be a pointer to a rescaled background spectrum).

: Step 9.
: 
: The data is searched for error columns using UCDs and error bars are computed.
: 
:    Issue 9.A. Errors need to be transformed if the data is transformed, but
:    the transformations can be complex.  Where is this handled?

Eventually, we hope that the Accuracy and Mapping models within Quantity will
do this heavy lifting. We need standard methods to do this. Of course
they won't always work, and they won't be here for a while.
Initially we can at least make sure the needed info to describe the
untransformed errors is present in case VSPlot is smart enough to do the work.
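
As an example of the simplest case VSPlot might handle itself, here is a
Python sketch of first-order error propagation for the
wavelength-to-frequency conversion. It assumes a symmetric, uncorrelated
error that is small compared with the value; the function name is
illustrative and this is not the Accuracy or Mapping model.

```python
C_ANGSTROM_PER_S = 2.99792458e18     # speed of light in Angstrom per second


def frequency_with_error(wavelength, sigma_wavelength):
    """Map (lambda, sigma_lambda) in Angstrom to (nu, sigma_nu) in Hz.

    First-order propagation:
        sigma_nu = |d nu / d lambda| * sigma_lambda
                 = (c / lambda^2) * sigma_lambda
    Valid only when sigma_lambda << lambda; asymmetric or correlated
    errors need more than this.
    """
    nu = C_ANGSTROM_PER_S / wavelength
    sigma_nu = C_ANGSTROM_PER_S / wavelength ** 2 * sigma_wavelength
    return nu, sigma_nu
```

To first order the fractional error is unchanged by this transformation
(sigma_nu / nu equals sigma_lambda / lambda), which is a handy sanity
check. More complex transformations will not be this simple.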

:    Issue 9.B.  How do we associate the error columns with the appropriate
:    measurements?  Again this seems to be part of the measurement discussion
:    but I need to know how this model is instantiated for it to be useful.
:    Does it use Groups in VOTables?  Are there other mechanisms?

In VOTable it will use a combination of GROUP and UTYPE. In raw XML
it will be explicit in the object tree.
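
A sketch of how a client might read such a grouping, in Python with the
standard library XML parser. The VOTable fragment and every utype string
in it ("spec:Flux", "spec:Flux.Value", "spec:Flux.Accuracy.StatError") are
placeholders invented for the example, not defined values.

```python
import xml.etree.ElementTree as ET

# Illustrative VOTable fragment. The utype strings are placeholders.
VOTABLE = """
<VOTABLE>
  <RESOURCE>
    <TABLE>
      <GROUP utype="spec:Flux">
        <FIELDref ref="f"/>
        <FIELDref ref="ferr"/>
      </GROUP>
      <FIELD ID="w" name="wave" utype="spec:SpectralCoordinate.Value"/>
      <FIELD ID="f" name="flux" utype="spec:Flux.Value"/>
      <FIELD ID="ferr" name="flux_err" utype="spec:Flux.Accuracy.StatError"/>
    </TABLE>
  </RESOURCE>
</VOTABLE>
"""


def group_members(root, group_utype):
    """For the GROUP with this utype, map each member FIELD's utype to its name."""
    fields = {fld.get("ID"): fld for fld in root.iter("FIELD")}
    for group in root.iter("GROUP"):
        if group.get("utype") == group_utype:
            members = (fields[ref.get("ref")] for ref in group.iter("FIELDref"))
            return {fld.get("utype"): fld.get("name") for fld in members}
    return {}


members = group_members(ET.fromstring(VOTABLE), "spec:Flux")
```

The GROUP ties the value and error columns together, and the UTYPE on each
FIELD says which role the column plays within the group.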
 
 - Jonathan


