Spectra DM for theoretical spectra?

Wed Jun 3 01:55:39 PDT 2009

Rick Wagner wrote:
> 1) What we are calling Characterization in the simulation data model, 
> is restricted to only provenance information. For example, the star 
> formation rate or RMS Mach number may be represented, and these are 
> physical properties measure from the data (in my cases).

I'd like to stress what Francois already mentioned: the importance of
using the correct vocabulary.
I take the example above to illustrate the problem (sorry Rick,
nothing against you, yours just happens to be the latest email about
this; see also Carlos' post): in one sentence three different
concepts are intermingled: Characterisation, Provenance,
and measured properties.

Given that in the IVOA world (or 'vosphere', to use Sebastien's nice
expression) both the Characterisation and Provenance DMs
already exist (one already recommended, the other being worked out), I
would ask Theory to avoid any confusion
and not to use Characterisation for what we already call
Provenance, etc.

I'll try to depict the problem here, and later I'll make some
suggestions on the vocabulary
and on a possible way to solve the issue, that is, enabling a full
description of a theoretical spectrum.
Please bear with me... I need to clarify my terminology to be understood...

Observation flow:
Real Universe --> Observation --> Telescope --> Data/Metadata --> Measurements

e.g.
input: a spectrum of a star is observed by a camera;
output: from it, the metallicity, log g, and eff. temperature are obtained/measured.

Simulation flow:
in the Theoretical World the flow is conceptually reversed: the
"Measurements" actually
become the Input parameters:

Input parameters --> Virtual Telescope --> Simulation --> Modelled Universe

e.g.
input: the parameters are metallicity, log g, eff. temperature;
output: a modelled spectrum is obtained.

It is a reversed flow, like mathematically inverting a function:
   function(domain)  ->  image
   inverse function(image) -> domain

the "image" of the function called "observing"
becomes the "domain" of the function called "simulating", and vice versa.
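
To make the inversion concrete, here is a toy sketch in Python. It is
only an illustration under strong assumptions: the "simulation" is a pure
black body (Planck law) rather than a stellar atmosphere code, the
"measurement" is a crude peak search plus Wien's displacement law, and the
function names simulate/observe and the wavelength grid are my own
hypothetical choices.

```python
import math

# Physical constants (SI units).
H = 6.62607015e-34        # Planck constant, J s
C = 2.99792458e8          # speed of light, m / s
K_B = 1.380649e-23        # Boltzmann constant, J / K
WIEN_B = 2.897771955e-3   # Wien displacement constant, m K


def simulate(teff, wavelengths):
    """Theory direction: input parameter (Teff) -> modelled spectrum.

    Toy model (assumption): a black body, not a real atmosphere model.
    """
    flux = []
    for lam in wavelengths:
        x = H * C / (lam * K_B * teff)
        flux.append(2.0 * H * C**2 / lam**5 / math.expm1(x))
    return flux


def observe(wavelengths, flux):
    """Observation direction: spectrum -> measured Teff.

    Toy measurement (assumption): find the sampled peak and apply
    Wien's displacement law. The result is only an estimate.
    """
    peak = max(range(len(flux)), key=flux.__getitem__)
    return WIEN_B / wavelengths[peak]


# Hypothetical wavelength grid: 100 nm to 3000 nm in 1 nm steps.
grid = [n * 1e-9 for n in range(100, 3001)]

teff_input = 5800.0                      # exact by construction
spectrum = simulate(teff_input, grid)    # "simulating"
teff_measured = observe(grid, spectrum)  # "observing"
```

The value recovered by observe() is close to, but not identical to, the
value fed into simulate(), because the peak is located on a finite grid.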

So far, CharDM, ProvenanceDM and SpectrumDM have focused their attention
on the HOW,
and have intentionally avoided describing WHAT is observed.
It would be too difficult to describe everything that could be observed [a
star,
a galaxy, a cluster, a cloud, a bird, a girl (try to model it! ;-) ), etc.]

Maybe some confusion arises from the fact that the WHAT
of the real world becomes the Input parameters in the Simulated World
(e.g. the eff. temp, the log g, the metallicity of a star).

The distinction, though, is that in the Real World the WHAT is in the end
a measurement, an
estimate of the Truth, and hence has an associated error, while in
the Theoretical World
the Input parameters _are_ the Truth (and the error is virtually zero).
If TheoryWG comes up with a good set of models for the Input Parameters,
please bear in mind
the above, because the same model, with some care, could be re-used to
model the real world's WHAT!
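
As a rough sketch of what "the same model" could mean, here is one
possible Python structure. The class name PhysicalParameter, its fields,
and the numbers are hypothetical and purely illustrative; nothing here is
taken from an existing IVOA data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PhysicalParameter:
    """One physical quantity, usable for both theory and observation.

    Hypothetical structure, not part of any IVOA data model.
    """
    name: str
    value: float
    unit: str
    error: float = 0.0   # zero means "exact input of a simulation"

    @property
    def is_exact(self) -> bool:
        # Theoretical inputs are the Truth: no associated error.
        return self.error == 0.0


# Theoretical World: the input parameter is the Truth.
teff_model = PhysicalParameter("Teff", 5800.0, "K")

# Real World: the same quantity is a measurement with an error
# (the numbers are made up for illustration).
teff_observed = PhysicalParameter("Teff", 5750.0, "K", error=100.0)
```

The only difference between the two worlds is then the error field, which
is the "some care" needed when re-using the model.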

Coming back to terminology/vocabulary:

Carlos Rodrigo Blanco wrote:
> I'm not so sure that we are talking about provenance and not 
> characterization.
>
> I feel that the sentence "description of a dataset or an observation 
> in the Physical parameter space of the data" describes precisely what 
> we are talking about.
>
> The physical n-dimensional space where a theoretical spectra is 
> located is a space parametrized by some parameters as Teff, Logg, 
> metallicity, etc.
>
> when I go to the characterization data model I read:
>
> ---
> This document defines the high level metadata necessary to describe 
> the physical parameter space of observed or simulated astronomical 
> data sets, such as 2D-images, data cubes, X-ray event lists, IFU data, 
> etc..

Characterisation DM: care was taken *not* to describe "what" was
observed/simulated.
CharDM describes the N-dimensional space subtended by an
observation/simulation product,
specifying which part of the N-dimensional space [whose axes are the
_data_ axes: space, time,
wavelength...] was covered, with which sampling, and with which resolution,
all at various levels of
detail (from an indicative number (location) down to finer details like
detector sensitivity,
transmission curve, etc).
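
A deliberately simplified Python sketch of that idea follows. The class
and field names are my own shorthand for the notions listed above
(location, coverage, sampling, resolution); they are an assumption and do
not reproduce the actual CharDM classes or utypes. The values are made up.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class AxisCharacterisation:
    """How one data axis was covered (hypothetical, simplified)."""
    axis: str                     # e.g. "wavelength", "time", "space"
    unit: str
    location: float               # indicative single value on the axis
    bounds: Tuple[float, float]   # covered range (low, high)
    sampling: float               # step between samples
    resolution: float             # smallest resolvable interval

    def covers(self, value: float) -> bool:
        """True if the value lies within the covered range."""
        low, high = self.bounds
        return low <= value <= high


# Illustrative values only.
spectral_axis = AxisCharacterisation(
    axis="wavelength",
    unit="nm",
    location=550.0,
    bounds=(400.0, 700.0),
    sampling=0.1,
    resolution=0.3,
)
```

Note that nothing in this structure says what the target is (star,
galaxy, ...) or what its Teff or metallicity are: it only describes how
the data axes were covered.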

The ProvenanceDM describes the process that resulted in an 
observation/simulation
(e.g., among other things, the telescope/instrument configuration, but 
also the PI program, etc.)

Rick Wagner wrote:
> However, there are some properties, such as resolution (spatial, or 
> frequency) which are known beforehand. For this reason, there is a 
> boolean "a priori" attribute on the Characterization class. This 
> distinction may need to be made more clear, in order to align our 
> model with the Characterization Data Model.

The SpectrumDM combines a number of "How"DMs (CharDM, ProvenanceDM,
Curation, etc.),
though it allows a little digression into the realm of "Derived
Quantities", that is,
quantities that are measured from the described spectrum, and into the
realm of what
the PI already knows about the target ("a priori"). For example, for
REDSHIFT,
two different utypes and FITS keywords exist for this purpose: one for the
"a priori" knowledge,
one for the "a posteriori" measured quantity.

As many have already stated, I also think that the Input parameters are
the equivalent
of the telescope/instrument settings: they are the Virtual Telescope
settings, and
as such the word Characterisation should be strongly avoided, in favour
of Provenance.

A possible recipe to solve the problem:

Given the multitude of simulators that create spectra (stellar
atmospheres, galaxy clusters,
BL Lacs, etc.) it is not possible to have one ProvenanceDM that covers
them all; I think
that each of those sets of Input Parameters should be modelled separately;
then it would be nice to enable ProvenanceDM to reference any one of them;
similarly, SSAP (or its subset TSAP) could easily make use of all that.
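
To illustrate the recipe, here is a minimal Python sketch. Everything in
it is hypothetical: the class names, the model identifiers, the simulator
name, and the parameter lists are placeholders I made up, not an existing
ProvenanceDM, SSAP or TSAP interface.

```python
from dataclasses import dataclass
from typing import Dict, List

# Each simulator family gets its own, separately defined parameter model.
# The identifiers and parameter names below are placeholders.
PARAMETER_MODELS: Dict[str, List[str]] = {
    "stellar-atmosphere": ["Teff", "logg", "metallicity"],
    "galaxy-cluster": ["mass", "redshift"],
}


@dataclass(frozen=True)
class InputParameterSet:
    """Virtual Telescope settings for one simulation run."""
    model_id: str              # which separately defined model applies
    values: Dict[str, float]


@dataclass(frozen=True)
class ProvenanceRecord:
    """Provenance only references a parameter set; it does not define it."""
    simulator: str
    inputs: InputParameterSet


def missing_parameters(record: ProvenanceRecord) -> List[str]:
    """Return the parameters required by the referenced model but absent."""
    required = PARAMETER_MODELS[record.inputs.model_id]
    return [name for name in required if name not in record.inputs.values]


# Illustrative record for a stellar atmosphere run (made-up values).
record = ProvenanceRecord(
    simulator="my-atmosphere-code",
    inputs=InputParameterSet(
        model_id="stellar-atmosphere",
        values={"Teff": 5800.0, "logg": 4.4, "metallicity": 0.0},
    ),
)
```

The provenance record stays the same for every simulator family; only the
referenced parameter model changes.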

Sorry for the long email, but I think it is useful to clarify things
(and, vice versa, please clarify things that I might have gotten wrong).

Alberto