Comments on SimulationDM version 1.0 WD

Carlos Rodrigo crb at cab.inta-csic.es
Tue May 3 11:42:49 PDT 2011


First of all, my apologies for not sending anything before. I've been too busy lately
with other things and I didn't find enough time for this.

From my point of view, the most important part of the mail is at the end, where I
try to apply the datamodel to a simple case. Please, if you are interested, try to
read that far :)

My first question is:

Which document/s are we discussing?
-----------------------------------

The document that was submitted to the list (and is available in the wiki page), that is,
the DataModel document, makes reference to other "accompanying documents" very often. For
instance, it mentions the Appendix many times.

This Appendix is not part of the document and was not submitted to the list together with
the simDM document.

The important question is: what are we reviewing here? The submitted document alone, or the
document together with the mentioned set of "Accompanying documents"?

In fact, when I tried to use the links at the end of the document, they didn't work (at
least in all the cases that I tried). Navigating the volute web page I was able to find
documents with similar names, which I assume are the ones referred to, but I can't be
sure.

In the particular case of the "Appendix", it seems clear that it is a draft (with comments
like "TBD review this section", "@@TODO discuss this further @@", etc). This is ok in a
draft, but it shows that it is not a final document. So I assume that it is not part of this
discussion, is it?

Isn't it too summarized?
------------------------

If this is the document for the Simulation data model I imagine that it should contain a
description of all the elements in the model, and that if something is not described here,
it is not part of the model.

In section 4.2 there are things like "There is the Party class, which represents an
individual or organisation, and is not so important for the moment.", and then no other
comment is made about this class in the document (although it seems to be regarded as part
of the data model).

In other words: is the "Party class" part of the model? Or is it not part of it yet, but we
intend to include it in future versions? Or is it explained in an "accompanying document"?

In general, I have the impression that this document tries to be a friendly general
description of the model, without going into the hard details, probably tending to summarize
with the intention of being easier to understand. But I think that the details should be
part of the document.

*************************************************************

Specific comments
-----------------
The first thing is, again: couldn't we find another word for (experimental) Protocol,
instead of just adding an "(experimental)" in italics before it?

It was very clear at Nara that using this word in a VO context is extremely confusing.
And, if we are going to continue using it, we should write the "(experimental)" adjective
everywhere, without exceptions (including figures).

----

In general, I think that there are too many references to protocols and simDB in the
document. I understand that some comments must be made because the modeling is done in such
a way that the data can be accessed. But the data model is about how the data is
represented/modelled and not about how the data is accessed. And I think that it would be
better to have as few references to protocols as possible.

Service/Webservice
--------------------

I've found it quite difficult to understand where the "service" class is located in the
model (I'm not completely sure that I understand it yet). Looking at figures 1 and 2 I
assumed at first that a service is specified for each experiment (which would mean that
there is one service for each run of the code). But looking at figure 3 and section 3, and
then section 4.8, I see that both "Service" and Experiment (and Protocol and Project) are
subclasses of Resource. For me this would mean that there is one service for each resource
(although the resource contains several experiments run under a given protocol) and that
this service can be used to access the results of all the experiments. Am I right? If so, I
think that figures 1 and 2 are a little misleading.

----

Section 2: History
--------------------

I don't think this section is very important, but given that it is part of the document, I
would make a couple of small changes.

(a) It is true that some types of simulations (the cosmological ones, for instance) are
often made in big collaborations. But it is also true that other types of simulations are
performed by a team of one astronomer plus one student. And, at least, I wouldn't say that
the first case is more usual. In fact I'm convinced that this is a point that should be
considered: many (if not most) theoretical simulations are made by small groups.

Thus, I would delete the sentence "and is these days often performed in large collaborations".

(b) When it starts talking about S3 it says: "A recent effort has been"... I don't think it
is so recent.

The Note is more than two years old and the idea had already been presented in March 2007 at
the "Astronomical Spectroscopy and the Virtual Observatory" workshop (without the S3 name).
Actually it was a parallel effort, for microsimulations, when other people were focused on
3+1 simulations. I would just change it to "Another effort", "A different approach" or
something like that. And, by the way, it is not the result of an investigation started at
Cambridge, as we were working on it before Cambridge (and had a first version implemented for
isochrones and evolutionary tracks, not just theoretical spectra).

I would also change "S3 is actually a direct reworking" to "S3 is a generalization". I don't
know what "a direct reworking" really means, but I don't think S3 is a direct reworking of
TSAP at this stage (maybe it was at the end of 2006, when we first implemented it).

(c) We didn't really decide at Victoria that S3 and SimDAP would be merged into a single
protocol named SimDAL. We decided that it would be nice to do so and that we would
investigate whether it is possible or not. Actually I don't know well enough what SimDAP is
to be able to say how both protocols could be merged, whether they can be merged, or whether
it is a good idea to merge them into a single protocol. But, in any case, this is not the
subject of this document.

In fact I think that stating the intentions of the TIG about protocols and so on (and what
has or has not been decided about them) is out of the scope of a data model document. Even
though this is the historical introduction, I don't think it is necessary (or even good) to
include such statements. Thus, rather than discussing it, I would drop the last paragraph.


Section 3.1
-----------

"and SimDAP has merged with S3 to form SimDAL: a family of access protocols for theory data"

I would change "merged" to "joined" or something like that. At this stage, both things are
included under the SimDAL concept, but we don't really know yet if they will be a single
protocol, two protocols, two flavours of the same thing or what. Actually, the word "family"
seems to imply that there will be more than one "flavour", which, for me, seems to contradict
saying that they are merged.

**************************************************************************

How to use the model
--------------------

It is quite clear that this datamodel has been made mostly with two ideas in mind:
cosmological 3+1 simulations and a data base of simulations (mostly of this type).
This is quite obvious throughout the document (and it's perfectly understandable given
the history).

But, given that this is intended to be THE model for simulations and that I have implemented
more than 20 services for theoretical data, I have to try to find the correspondence between
the concepts that I use for those services and the ones in the data model. And I must say
that it is not easy, even for the simplest models. What worries me is that, if it is not
easy for me (I'm not an expert in data modelling, object-oriented programming, UML and so on,
but I've been attending talks and discussions about this for a long time), how will it be
for scientists who have their grid of models and want to make them available in the VO?

I think that we very much need, at least, a simple recipe on how to implement this data
model for simple cases (let's say: those usually called microsimulations). And probably some
examples should be included in the document so that data providers have a starting point
for this complex standard.

And I assume that it is partly my assignment to do such a thing (or at least I would like
to be able to do it). But I'm still not sure if I am able to figure out how to do it.


An attempt at a couple of examples
----------------------------------

When writing a votable containing data for a theoretical simulation, I assume that the
datamodel should be useful to better characterize the content of that votable (concepts and
relations between them).

Thus, my main exercise has been trying to use this datamodel in an extremely simple case.
I have a collection of theoretical isochrones and I want to rewrite it adding utypes from
the datamodel.

And I must say that I'm not sure at all about how to do that.

The main idea is to be able to say:

* This is an isochrone
* It is the isochrone for a star (or set of stars, let's say "star" for simplicity)
* It has been calculated with the Baraffe et al. model
* The parameter is the star age.

And, even:

* the isochrone is a table with four columns:
- mass
- effective temperature
- logarithm of gravity
- bolometric luminosity

In a simple model I would imagine something like this:

<PARAM name="model"  utype="SimpleSimDM:model" value="Vandercroft"/>
<PARAM name="object" utype="SimpleSimDM:targetObject" value="star"/>
<PARAM name="INPUT:age" value="0.00100" unit="Gyr" ucd="phys.age"
utype="SimpleSimDM:inputParameter"/>

<TABLE>
<PARAM utype="SimpleSimDM:product" value="isochrone"/>
<FIELD name="mass" utype="SimpleSimDM:product.property"/>
<FIELD name="teff" utype="SimpleSimDM:product.property"/>
<FIELD name="logg" utype="SimpleSimDM:product.property"/>
<FIELD name="Lum" utype="SimpleSimDM:product.property"/>

(...)
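
To illustrate why such flat annotations would be convenient for a client, here is a minimal
Python sketch that reads a closed-up copy of the fragment above and collects its content by
utype. It is only an illustration under stated assumptions: the "SimpleSimDM:" utypes are the
invented ones used above (they are not part of SimDM), and the RESOURCE wrapper, the closing
TABLE tag and the helper name read_annotations are hypothetical additions needed only to make
the fragment parse.

```python
import xml.etree.ElementTree as ET

# Closed-up copy of the fragment above. The RESOURCE wrapper and the
# closing </TABLE> are assumptions added so that the text is well formed.
FRAGMENT = """
<RESOURCE>
  <PARAM name="model" utype="SimpleSimDM:model" value="Vandercroft"/>
  <PARAM name="object" utype="SimpleSimDM:targetObject" value="star"/>
  <PARAM name="INPUT:age" value="0.00100" unit="Gyr" ucd="phys.age"
         utype="SimpleSimDM:inputParameter"/>
  <TABLE>
    <PARAM utype="SimpleSimDM:product" value="isochrone"/>
    <FIELD name="mass" utype="SimpleSimDM:product.property"/>
    <FIELD name="teff" utype="SimpleSimDM:product.property"/>
    <FIELD name="logg" utype="SimpleSimDM:product.property"/>
    <FIELD name="Lum" utype="SimpleSimDM:product.property"/>
  </TABLE>
</RESOURCE>
"""


def read_annotations(xml_text):
    """Return ({utype: value} for every PARAM, [names of the property columns])."""
    root = ET.fromstring(xml_text)
    # Each PARAM carries one concept, identified by its (hypothetical) utype.
    params = {p.get("utype"): p.get("value") for p in root.iter("PARAM")}
    # Columns tagged as properties of the product, in document order.
    columns = [f.get("name") for f in root.iter("FIELD")
               if f.get("utype") == "SimpleSimDM:product.property"]
    return params, columns


params, columns = read_annotations(FRAGMENT)
print(params["SimpleSimDM:product"], "for a", params["SimpleSimDM:targetObject"])
print("columns:", columns)
```

With one utype per concept, the client needs no knowledge of the structure of the file beyond
the utype strings themselves.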

But this SimDM model is not simple. It is complex and very hierarchical.
And it is known that such a hierarchical structure is not easy to represent in a flat
document such as a votable.
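
To make the difficulty concrete, here is a minimal Python sketch, assuming a toy nested
description whose names (the "Toy:" prefix, "experiment", "protocol", "target", "input",
"product") are purely illustrative and are not taken from the SimDM document. Every leaf of
the hierarchy has to be turned into one flat, dotted utype string before it can be attached
to a PARAM.

```python
def flatten(node, prefix):
    """Yield (utype, value) pairs for every leaf of a nested dict."""
    for key, child in node.items():
        path = f"{prefix}.{key}"
        if isinstance(child, dict):
            # Descend one level; the path so far becomes the new prefix.
            yield from flatten(child, path)
        else:
            yield path, child


# Hypothetical hierarchy, loosely mirroring the isochrone example.
toy = {
    "protocol": {"name": "Vandercroft"},
    "target": {"type": "star"},
    "input": {"age": "0.00100"},
    "product": {"type": "isochrone"},
}

pairs = list(flatten(toy, "Toy:experiment"))
for utype, value in pairs:
    print(f'<PARAM utype="{utype}" value="{value}"/>')
```

The flat strings keep the path of each value but lose the explicit grouping, so the relations
between concepts have to be restored some other way in the votable.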

I just try to identify the most adequate utypes for each concept and add relations (by
grouping) with the semantic labels so that eventually they can be linked to some vocabulary.

And I even get a little lost here. I get confused by the ObjectType, TargetObjectType,
RepresentationObjectType, ExperimentRepresentationObject, RepresentationObject... I'm not
very sure what I should use for just saying "this is a star", what for saying "this is an
isochrone", what for representing the object contained in the isochrone (with its
properties, mass, logg, etc...). I get the feeling that this is quite a flexible and
interesting idea, but I don't really understand much about all these classes named *Object*
without some examples.

Finally:

I have written two quite simple votables:

- the first one contains one isochrone and I try to use the datamodel in it.
- the second contains a list of isochrones for a given range of ages.

Could someone tell me (probably Gerard) if they make sense?

(In some cases I have added groups with only one param inside, which is quite unnecessary. I
do it just to show the idea that something else could be added, if needed, as related to
that param)

If you make some comments and help me with this, I will try to do something similar for
the much more complicated case of asteroseismology simulations.

By the way, I can't find in the model any "box" to specify a bibliographic reference. I think
it would be important to address that too.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: isochrone.xml
Type: text/xml
Size: 2368 bytes
Desc: not available
URL: <http://www.ivoa.net/pipermail/dm/attachments/20110503/e945218a/attachment.xml>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: isochrone-list.xml
Type: text/xml
Size: 2375 bytes
Desc: not available
URL: <http://www.ivoa.net/pipermail/dm/attachments/20110503/e945218a/attachment-0001.xml>