Documentation for Provenance

bonnarel at alinda.u-strasbg.fr
Wed Jun 3 00:36:10 PDT 2009


Hi all,

     Material for IVOA provenance is not yet complete.
     The basic ideas can be found here:
http://www.ivoa.net/Documents/latest/DMObs.html

     You can also read the discussion we had in the small Obs team
before the Baltimore meeting.
     In particular, these compilation mails:
http://www.ivoa.net/forum/dm/0810/1480.htm
as well as
http://www.ivoa.net/forum/dm/0810/1475.htm
and
http://www.ivoa.net/forum/dm/0810/1470.htm

     You can also read my presentations at the Trieste, Baltimore and
Strasbourg interops (mainly the latter two).

     I attach the first Provenance example (in XML).

This material will be presented for comments on the DM pages, with an XML
schema, by the end of next week...

Regards
François
Quoting Rick Wagner <rwagner at physics.ucsd.edu>:

> Hi,
>
> I will also try to be brief, and state the points I see as important
> up to this point.
>
> 1) What we are calling Characterization in the simulation data model,
> is restricted to only provenance information. For example, the star
> formation rate or RMS Mach number may be represented, and these are
> physical properties measured from the data (in my cases).
>
> However, there are some properties, such as resolution (spatial, or
> frequency) which are known beforehand. For this reason, there is a
> boolean "a priori" attribute on the Characterization class. This
> distinction may need to be made more clear, in order to align our
> model with the Characterization Data Model.
>
> 2) The discussion about the relationship between a data model for
> theory data, and data models for observational data is not specific
> to spectra. As I mentioned in my initial response, I think the
> appropriate thing to do when presenting spectra or images in a
> context with a specific model (e.g., SSA or SIA) is to use the
> corresponding model.
>
> Having an additional box, such as "Provenance", hanging off of a
> Generic Data Set class may be the way to include the additional information
> from the simulation data model. But, because models for theory data
> must be more general, the data may be de-normalized.
>
> 3) The question is still open as to whether or not we need to
> incorporate these thoughts into the current simulation data model, or
> if we can bridge the theory and observational models in the future.
>
> --Rick
>
> On Jun 2, 2009, at 9:43 AM, Jesus Salgado wrote:
>
>> Hi all,
>>
>> Very short. I fully agree with Francois. I think all this discussion can
>> be solved by adding a Provenance box to the current Spectrum DM, so we
>> can reuse Spectrum specific utypes for the spectrum part of the
>> theoretical spectra and the provenance box for the software, input
>> parameters, etc.
>> The boxes from SIMDB could be reused in this case but we should not
>> start from scratch just because there are some fields missing in the
>> current Spectrum DM.
>>
>> As I said during last interop sessions, I am a little more worried about
>> how to characterize the output records of a S3 (or SIMDAP) server when
>> the response is _not_ spectra.
>>
>> Best Regards,
>> Jesus
>>
>> On Tue, 2009-06-02 at 12:30 +0200, bonnarel wrote:
>>> Hi all,
>>>    My personal view about this.
>>>
>>>     A ) a question of vocabulary
>>>         Up to now, in the IVOA, "characterization" has been vocabulary
>>> reserved for the description of a dataset or an observation in the
>>> physical parameter space of the data. What Carlos or Miguel would like
>>> to call "characterization" is more like the "Provenance" of the
>>> dataset. Again, all this is a vocabulary question. But if we don't
>>> agree on the vocabulary, how can we achieve interoperability?
>>>      B ) The Spectrum DM is conceptually a simple and particular case of
>>> an overall Observation or Generic Dataset Model. The current version of
>>> Spectrum doesn't have the "Provenance" package. But it would be really
>>> easy to add this package in a future version of Spectrum, because the
>>> Obs DM currently being developed (with Provenance in it) is very similar
>>> in overall structure to the Spectrum DM.
>>>      C ) The provenance that IVOA is currently developing integrates the
>>> software provenance. In the case of a theoretical dataset it would be a
>>> place to hook in the necessary information described according to SimDB,
>>> I guess...
>>>      D ) A service giving access to theoretical spectra compliant with
>>> SSA may well have hooks to Provenance and SimDB information, because
>>> additional fields and extension resources in the query response are
>>> allowed by the protocol. They may carry Provenance or SimDB utypes
>>> without difficulty.
>>>
>>>    So if you agree with this general view, it would be nice to have
>>> input on what we could have in Provenance for the use case of simulated
>>> observations...
>>>
>>> Cheers
>>> François
>>>
>>> -----Original message-----
>>> From: Gerard [mailto:gerard.lemson at mpe.mpg.de]
>>> Sent: Tuesday 2 June 2009 12:07
>>> To: 'Carlos Rodrigo Blanco'; 'Alberto Micol'
>>> Cc: theory at ivoa.net; dm at ivoa.net
>>> Subject: RE: Spectra DM for theoretical spectra?
>>>
>>>
>>> Hi Carlos
>>>
>>>>
>>>>>> The Spectra datamodel is perfect for most of the issues, but the
>>>>>> characterization in SimDB provides a better description of what
>>>>>> a theoretical spectrum is.
>>>>>>
>>>>> Dear Miguel,
>>>>>
>>>>> May I ask you what is missing in the SpectrumDM?
>>>>> What is SimDB offering extra, specific to spectra?
>>>>
>>>> The main point, as I see it, is that the SpectrumDM is
>>>> designed having observed spectra in mind, not theoretical ones.
>>>>
>>>> The model contains everything that is necessary to describe
>>>> the content of a spectrum (the wavelength, flux and all that,
>>>> and this is the same for observed and theoretical ones) but
>>>> nothing of what is needed to __characterize__ a theoretical spectrum.
>>>>
>>>> For instance, a theoretical spectrum is usually characterized
>>>> by giving, at least:
>>>>
>>>> - the code used to synthesize it.
>>>> - the effective temperature of the star
>>>> - the gravity (logg) of the star
>>>> - the metallicity of the star
>>>>
>>>> (and sometimes some other parameters)
>>>>
>>>> And, in the spectrum data model there is no utype for those
>>>> properties.
>>>>
>>>> To make a long story short, if two different developers make
>>>> two different services with theoretical spectra and one
>>>> chooses "Meta" for the parameter containing the value for the
>>>> metallicity and the other chooses "Z" for the same parameter,
>>>> a client/application does not have a way to know that both
>>>> refer to the same concept (and UCDs are not enough for
>>>> this).
>>>>
>>>> By the way, I think that SimDB doesn't solve that problem
>>>> either, am I right?
>>>>
>>> That depends on what you expect from SimDB.
>>> That is, SimDB could allow you to define in some detail what code was
>>> used to produce synthetic spectra, though it may need some additions to
>>> the model as discussed in previous emails.
>>> The code is represented by the SimDB:Protocol, which contains input
>>> parameters, physics, algorithms and allows
>>> you to describe what is contained in a result
>>> (SimDB:RepresentationObjectType). The input parameters have a name as
>>> well as a "semantic label", which may be a UCD or something more
>>> generic. So if metallicity is in that vocabulary you can describe this.
>>> SimDB allows you to find all protocols that use a metallicity in their
>>> list of input parameters.
>>> The actual experiment that you run to produce your synthetic spectra is
>>> described, amongst others, by the values you assign to the parameters.
>>>
>>> Note that I am not suggesting that there cannot be other, possibly more
>>> explicit models for theoretical/synthetic spectra. The SimDB data model
>>> is rather abstract, i.e. not very concrete, as it aims to support many
>>> types of simulations and simulation codes etc. The SimDB data model
>>> could serve as the basis from which to derive more concrete models, but
>>> it may not serve the purposes of SimDB to do this in SimDB itself.
>>> For example if there is a particular set of parameters that all
>>> codes-producing-synthetic-spectra use, one could create a subclass of
>>> SimDB:Protocol that has these explicitly as attributes.
>>> For example a SyntheticSpectralModel could (I do not say should! It
>>> seems rather specialised and in need of discussion with a larger group
>>> of astrophysicists) have an attribute "metallicity". Such a model
>>> will now give rise to a corresponding UTYPE. Something similar occurs in
>>> the SimDB data model where the SimDB:Snapshot is a special type of
>>> result (for 3+1D simulations) and has explicit attributes like
>>> spatialSize and time.
>>>
>>>
>>>
>>>> I think that the spectrum data model should contain a section
>>>> for characterization of theoretical data providing utypes
>>>> for, at least, a minimum set of parameters associated to
>>>> theoretical spectra.
>>>>
>>> One may argue that this is not "characterisation" but "provenance",
>>> something that the spectral data model does not deal with in detail.
>>>
>>>> The fact is that SSAP/SpectrumDM was done for observed
>>>> spectra, it considers a lot of details about them, etc but it
>>>> included theoretical spectra just as a use case in an appendix.
>>>>
>>>> If it is going to be mandatory to use the same schema for
>>>> theoretical spectra and it is expected that we do it (let's
>>>> say) for ever, a little amount of time should be dedicated to
>>>> fill the holes in the protocol/data model when it refers to
>>>> theoretical spectra.
>>>>
>>> Correct, but I would second Rick in proposing this not be done in the
>>> current effort on SimDB or SimDAP/S3, at least not in their version 1.0.
>>>
>>> If you can come up with a more concrete model for synthetic spectra,
>>> possibly derived from SimDB/DM, you can easily create a service spec
>>> around this by mapping the model to a relational representation and
>>> using TAP for access.
>>>
>>>
>>>
>>> Cheers
>>>
>>> Gerard
>>>
>>>
>> -- 
>> Jesus J. SALGADO                       Jesus.Salgado at sciops.esa.int
>>
>> ESAC Science Archives Team
>> European Space Astronomy Centre (ESAC)
>> European Space Agency (ESA)
>>
>> European Space Agency/European Space Astronomy Centre
>> P.O. Box 78
>> 28691 Villanueva de la Canada                 Tel: +34 91 813 12 71
>> Madrid - SPAIN                                Fax: +34 91 813 13 08
>> -------------------------------------------------------------------
>
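
The naming problem raised in the thread (one service calls the metallicity
parameter "Meta", another calls it "Z") can be sketched as follows. This is
only an illustration of the idea of a shared utype. The utype string, the
Param class and the helper functions are hypothetical and are not defined
by the Spectrum DM, SimDB or any Provenance draft.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Param:
    name: str      # service-local parameter name, free for each provider
    utype: str     # shared data-model label (hypothetical values below)
    value: float


# Hypothetical utype: no IVOA model defines this string.
METALLICITY = "prov:SyntheticSpectralModel.metallicity"

# Two services describing the same concept under different local names.
service_a = [Param("Meta", METALLICITY, -0.5)]
service_b = [Param("Z", METALLICITY, -1.0)]


def find_by_utype(params: List[Param], utype: str) -> List[Param]:
    """Return the parameters carrying the given utype, whatever their name."""
    return [p for p in params if p.utype == utype]


def metallicity(params: List[Param]) -> Optional[float]:
    """Look the metallicity up by utype; None if the service gives none."""
    matches = find_by_utype(params, METALLICITY)
    return matches[0].value if matches else None
```

A client that calls metallicity(service_a) and metallicity(service_b) gets
both values without knowing either local name, which is the behaviour the
thread asks for.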


-------------- next part --------------
A non-text attachment was scrubbed...
Name: provenance.xml
Type: text/xml
Size: 3661 bytes
Desc: not available
URL: <http://www.ivoa.net/pipermail/theory/attachments/20090603/11fddda1/attachment-0003.xml>

