Spectra DM for theoretical spectra?

Carlos Rodrigo Blanco crb at laeff.inta.es
Tue Jun 2 06:12:44 PDT 2009

I'm not so sure that we are talking about provenance and not
characterization.

I feel that the sentence "description of a dataset or an observation in 
the Physical parameter space of the data" describes precisely what we are 
talking about.

The physical n-dimensional space where a theoretical spectrum is located is 
a space parametrized by quantities such as Teff, logg, metallicity, etc.
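
As a minimal sketch of that idea, a grid of theoretical spectra can be 
treated as a set of points in that space, and a "coverage" query is then 
just a range selection on each axis. The class name, field names and 
numerical values below are hypothetical and do not come from any IVOA model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelPoint:
    """Hypothetical coordinates of one theoretical spectrum in parameter space."""
    teff: float         # effective temperature, in K
    logg: float         # log10 of the surface gravity (cgs)
    metallicity: float  # e.g. [M/H], in dex


def covers(grid, teff_range, logg_range):
    """Return the grid points whose Teff and logg fall inside the given ranges."""
    lo_t, hi_t = teff_range
    lo_g, hi_g = logg_range
    return [p for p in grid
            if lo_t <= p.teff <= hi_t and lo_g <= p.logg <= hi_g]


# Illustrative grid only; these values are invented for the example.
grid = [
    ModelPoint(5000.0, 4.5, 0.0),
    ModelPoint(5750.0, 4.5, -0.5),
    ModelPoint(8000.0, 3.0, 0.0),
]

selected = covers(grid, (4500.0, 6000.0), (4.0, 5.0))
print(len(selected))
```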

When I go to the characterization data model, I read:

This document defines the high level metadata necessary to describe the 
physical parameter space of observed or simulated astronomical data sets, 
such as 2D-images, data cubes, X-ray event lists, IFU data, etc.
The Characterisation data model is an abstraction which can be used to 
derive a structured description of any relevant data and thus to 
facilitate its discovery and scientific interpretation. The model aims at 
facilitating the manipulation of heterogeneous data in any VO framework or 
portal. A VO Characterisation instance can include descriptions of the data axes, 
the range of coordinates covered by the data, and details of the data 
sampling and resolution on each axis.

and I also feel that this describes quite well what we are talking about.

But I must say that I have never really understood what "provenance" is, 
and I have not been able to find a document explaining it (I've followed some 
mailing list conversations and I can't see the relation between those 
conversations and what I'm talking about here).

Could you please refer me to some document so that I can try to understand 
the provenance data model?

You're right: unless we are able to use the same vocabulary with the 
same meaning, all the conversations are going to be crazy.

I would really appreciate it if you could point me to any document that I can 
study about provenance so that I can fill the gaps in my knowledge.


  On Tue, 2 Jun 2009, bonnarel wrote:

> Hi all,
>   My personal view about this.
>    A ) A question of vocabulary
>        Up to now, "characterization" has been reserved vocabulary in IVOA for the
> description of a dataset or an observation in the Physical parameter space
> of the data. What Carlos or Miguel would like to call "characterization" is
> more something like the "Provenance" of the dataset. Again, all this is a
> vocabulary question. But if we don't agree on the vocabulary, how can we do
> Interoperability?
>     B ) The spectral DM is conceptually a simple and particular case of an
> overall Observation or Generic Dataset Model. The current version of
> Spectrum doesn't have the "Provenance" package. But it would be really easy
> to add this package in a future version of Spectrum, because the Obs DM
> currently being developed (with Provenance in it) is very similar in overall
> structure to the Spectrum DM.
>     C ) The provenance that IVOA is currently developing integrates the
> software provenance. In the case of a theoretical dataset it would be a
> place to hook in the necessary information, described according to SimDB, I guess...
>     D ) A service giving access to theoretical spectra compliant with SSA
> may well have hooks to Provenance and SimDB information, because additional
> fields and extension resources in the query response are allowed by the
> protocol. They may have Provenance or SimDB utypes without difficulty.
>   So if you agree with this general view, it would be nice to have input on
> what we could have in Provenance for the use case of simulated
> observations...
> Cheers
> François
>
> Hi Carlos
>>>> The Spectra datamodel is perfect for most of the issues, but the
>>>> characterization in SimDB provides a better description of what the
>>>> theoretical spectrum is.
>>> Dear Miguel,
>>> May I ask you what is missing in the SpectrumDM?
>>> What is SimDB offering extra, specific to spectra?
>> The main point, as I see it, is that the SpectrumDM is
>> designed with observed spectra in mind, not theoretical ones.
>> The model contains everything that is necessary to describe
>> the content of a spectrum (the wavelength, flux and all that,
>> and this is the same for observed and theoretical ones) but
>> nothing of what is needed to __characterize__ a theoretical spectrum.
>> For instance, a theoretical spectrum is usually characterized
>> by giving, at least:
>> - the code used to synthesize it.
>> - the effective temperature of the star
>> - the gravity (logg) of the star
>> - the metallicity of the star
>> (and sometimes some other parameters)
>> And, in the spectrum data model, there is no utype for those
>> properties.
>> To make a long story short, if two different developers make
>> two different services with theoretical spectra and one
>> chooses "Meta" for the parameter containing the value for the
>> metallicity and the other chooses "Z" for the same parameter,
>> a client/application does not have a way to know that both
>> refer to the same concept (and UCDs are not enough for
>> this).
>> By the way, I think that SimDB doesn't solve that problem
>> either, am I right?
> That depends on what you expect from SimDB.
> That is, SimDB could allow you to define in some detail what code was used
> to produce synthetic spectra, though it may need some additions to the
> model as discussed in previous emails.
> The code is represented by the SimDB:Protocol, which contains input
> parameters, physics and algorithms, and allows you to describe what is
> contained in a result (SimDB:RepresentationObjectType). The input
> parameters have a name as well as a "semantic label", which may be a UCD
> or something more generic. So if metallicity is in that vocabulary, you can
> describe this. SimDB allows you to find all protocols that use a
> metallicity in their list of input parameters.
> The actual experiment that you run to produce your synthetic spectra is
> described, among other things, by the values you assign to the parameters.
> Note that I am not suggesting that there cannot be other, possibly more
> explicit models for theoretical/synthetic spectra. The SimDB data model is
> rather abstract, i.e. not very concrete, as it aims to support many types of
> simulations and simulation codes etc. The SimDB data model could serve as
> the basis from which to derive more concrete models, but it may not serve
> the purposes of SimDB to do this in SimDB itself.
> For example if there is a particular set of parameters that all
> codes-producing-synthetic-spectra use, one could create a subclass of
> SimDB:Protocol that has these explicitly as attributes.
> For example, a SyntheticSpectralModel could (I do not say should! It seems
> rather specialised and in need of discussion with a larger group of
> astrophysicists) have an attribute "metallicity". Such a model
> would then give rise to a corresponding UTYPE. Something similar occurs in the
> SimDB data model where the SimDB:Snapshot is a special type of result (for
> 3+1D simulations) and has explicit attributes like spatialSize and time.
>> I think that the spectrum data model should contain a section
>> for characterization of theoretical data providing utypes
>> for, at least, a minimum set of parameters associated with
>> theoretical spectra.
> One may argue that this is not "characterisation" but "provenance",
> something that the spectral data model does not deal with in detail.
>> The fact is that SSAP/SpectrumDM was done for observed
>> spectra, it considers a lot of details about them, etc., but it
>> included theoretical spectra just as a use case in an appendix.
>> If it is going to be mandatory to use the same schema for
>> theoretical spectra and it is expected that we do it (let's
>> say) forever, a little time should be dedicated to
>> filling the holes in the protocol/data model where it concerns
>> theoretical spectra.
> Correct, but I would second Rick in proposing this not be done in the
> current effort on SimDB or SimDAP/S3, at least not in their version 1.0.
> If you can come up with a more concrete model for synthetic spectra,
> possibly derived from SimDB/DM, you can easily create a service spec around
> this by mapping the model to a relational representation and using TAP for
> access.
> Cheers
> Gerard
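
The "Meta" versus "Z" problem discussed in the thread, and the idea of a 
name plus a shared "semantic label" on each parameter, can be illustrated 
with a minimal sketch. Everything here is an assumption made for the 
example: the field layout, the helper functions and the label strings 
(such as "model.metallicity") are invented and are not real utypes, UCDs 
or SimDB structures.

```python
# Hypothetical responses from two theoretical-spectra services. Each field
# has a local name chosen by the developer and a shared semantic label.
# The label strings are made up for this sketch.
service_a = [
    {"name": "Meta", "label": "model.metallicity", "value": -0.5},
    {"name": "Teff", "label": "model.teff", "value": 5750.0},
]
service_b = [
    {"name": "Z", "label": "model.metallicity", "value": 0.0},
    {"name": "T", "label": "model.teff", "value": 6000.0},
]


def by_name(fields, name):
    """Look a value up by its local name; returns None if the name is absent."""
    for field in fields:
        if field["name"] == name:
            return field["value"]
    return None


def by_label(fields, label):
    """Look a value up by its shared semantic label; returns None if absent."""
    for field in fields:
        if field["label"] == label:
            return field["value"]
    return None


# A client that only knows the local name "Meta" cannot read service B ...
print(by_name(service_a, "Meta"), by_name(service_b, "Meta"))

# ... while a client that uses the agreed label can read both services.
print(by_label(service_a, "model.metallicity"),
      by_label(service_b, "model.metallicity"))
```

The sketch only shows why an agreed label is needed; it takes no position 
on whether that label should live in the Spectrum DM, in Provenance or in 
a model derived from SimDB.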

