XML Schema for the Simulation Data Model

Gerard gerard.lemson at mpe.mpg.de
Tue Feb 12 05:56:39 PST 2008


Hi Rick
Thanks for posting your work on the theory mailing list.
Please let's keep discussing the work here and not go offline too soon.


I am trying to understand the relation of your schema to the SNAP data model
and schema we are working on. To this end I have created a UML version of it
which I have attached as a JPG. I also attached three JPGs of the SNAP model
as updated today on the theory wiki. The updates are not very involved,
mainly some details refined and cleaned up.

Comparing your model with this latest version of the SNAP DM I think the
following correspondences can be made (I am ignoring detailed differences in
attributes etc.):

Rick's model                SNAP model
------------------------------------------------------------------
ProgramType                 SNAPProtocol, SNAPSimulator
SimulationType              SNAPProject (1 below)
RunType                     SNAPSimulation (1 below)
CharacterisationAxisType    Property (of ObjectType) (2 below)
CharacterisationType        Characterisation
ParameterType               InputParameter + ParameterSetting
inputSnapshot               InputDataset

Comments and questions:

1. You have a SimulationType and a RunType. The latter seems to correspond
to a SNAPSimulation, as it contains the collection of input parameters and
snapshots and has its own reference to a ProgramType. 
At first I assumed that your SimulationType corresponded to a number of
SNAPSimulation-s, all with the same program and characterization. But from
the example instance document you sent around I guess it is actually more
like the SNAPProject. Is this correct?
Your SimulationType has a reference to ProgramType as well. Is this supposed
to model a kind of pipeline?

2. The concept of ObjectType is missing in your model. This makes it
impossible to have multiple explicitly defined types of objects inside a
single simulation. In the SNAP DM each object type is defined explicitly
with its own set of properties. Note that my choice of using the name
Property has been a point of contention by some of the Characterisation DM
people, who wanted me to use Axis. I see you have chosen their side ;).
I feel that for many simulations the ObjectType is definitely and explicitly
present. For example some of the SPH simulations I have access to here have
dark matter, star and gas particles, each with its own properties.
I can see though that when someone's database is only ever going to contain
one type of simulation, one might want to remove the extra "indirection" of
the ObjectType.
Obviously related to this is the absence of ObjectCollection. In the SNAP
model this is the anchor that ties a list of characterizations to the
properties of a particular object type. If you remove one, you can remove
the other.
Note that only today I added a ChildObject to the model. This is the outcome
of an offline (sorry!) discussion with mainly Laurent Bourges and Herve
Wozniak. They model galaxies being built from disks and bulges, each with
their own properties. 
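
The structure described in point 2 can be sketched in Python as below. This is
only an illustration, not the SNAP schema itself: the class names follow the
model, but the field names (unit, minimum, maximum, number_of_objects) and the
helper unknown_properties() are assumptions made up for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Property:
    # A named property of an object type (the "Axis" of the Characterisation DM).
    name: str
    unit: str = ""  # hypothetical field, for illustration only


@dataclass
class ObjectType:
    # An explicitly defined type of object, with its own set of properties.
    name: str
    properties: List[Property] = field(default_factory=list)
    # ChildObject: e.g. a galaxy built from a disk and a bulge.
    children: List["ObjectType"] = field(default_factory=list)


@dataclass
class Characterisation:
    # Hypothetical minimal characterisation: a value range for one property.
    property_name: str
    minimum: float
    maximum: float


@dataclass
class ObjectCollection:
    # Anchor tying a list of characterisations to one object type's properties.
    object_type: ObjectType
    number_of_objects: int
    characterisations: List[Characterisation] = field(default_factory=list)

    def unknown_properties(self) -> List[str]:
        # Names characterised here that the object type does not define.
        known = {p.name for p in self.object_type.properties}
        return [c.property_name for c in self.characterisations
                if c.property_name not in known]


# One SPH simulation with two of its particle types.
gas = ObjectType("gas particle", [Property("mass"), Property("temperature")])
dark_matter = ObjectType("dark matter particle", [Property("mass")])

gas_collection = ObjectCollection(
    gas, 1000, [Characterisation("temperature", 10.0, 1.0e7)])
dm_collection = ObjectCollection(
    dark_matter, 1000, [Characterisation("temperature", 10.0, 1.0e7)])
```

Here gas_collection.unknown_properties() is empty, while the same
characterisation attached to the dark matter collection is reported as
unknown, because that object type defines no temperature property.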

3. I assume Group and GroupedQuantity are borrowed from the Spectrum data
model's XSD serialization?  Because of single inheritance you have a problem
with ProgramType, which now cannot be a Resource. If instead you had made
Resource a Group (impossible of course in the IVOA context), you could have
ProgramType be a Resource as well. 
I must say I don't like Group very much. ID and IDREF are useful only when
the element being referenced exists in the same XML document. This I think
will often not be the case. I see it as an example of inheritance run wild.

Btw, I have for a while wanted to remove the inheritance of Resource from
the SNAP data model, and have done so in today's update. It is too restrictive I
find. I think one can take a SNAP model instance and turn it into a Resource
if one wants to register it, but that does not mean it "is a" resource in
our model. There are more flexible ways of using existing models than always
using inheritance. In particular the Content of Resource is very cumbersome.
The SNAP model is supposed to describe the Content already. 
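
The ID/IDREF limitation from point 3 can be illustrated with a small Python
sketch using the standard xml.etree.ElementTree module. The element and
attribute names (simulation, program, run, id, name) are invented for the
example and are not taken from either schema, and the lookup on an attribute
called "id" only imitates IDREF resolution; it is not a validating parser.

```python
import xml.etree.ElementTree as ET


def resolve_idref(root, ref):
    # Search one document tree for the element whose "id" attribute equals ref.
    # Nothing outside this tree can be found, which is the IDREF restriction.
    for elem in root.iter():
        if elem.get("id") == ref:
            return elem
    return None


# The referenced program lives in the same document: the reference resolves.
same_doc = ET.fromstring(
    '<simulation>'
    '<program id="p1" name="mycode"/>'
    '<run program="p1"/>'
    '</simulation>'
)

# The program is described in some other document: the reference dangles.
other_doc = ET.fromstring(
    '<simulation>'
    '<run program="p1"/>'
    '</simulation>'
)

found = resolve_idref(same_doc, same_doc.find("run").get("program"))
missing = resolve_idref(other_doc, other_doc.find("run").get("program"))
```

In the first document the reference resolves to the program element; in the
second it resolves to nothing, so a reference across documents needs some
other mechanism, such as a URI or a registry identifier.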

4. You have InputParameter and ParameterSetting merged into one,
ParameterType. Note that I have added an attribute "value", representing the
"xsd:string" inheritance in your ParameterType. In an earlier version of the
SNAP DM I had made the same choice for simplicity. However Franck LePetit
for example argued that redefining the list of parameters for his
simulation types would be very costly. 
If one runs parameter studies with lists of 100s of parameters it is better
to have the parameters defined once on the Protocol (where they belong
really), and only add the parameter settings on the experiment. The problem is
that in XML this is often more involved, as one needs to somehow reference
a parameter that may not exist in the same XML document (so IDREF will
not work), and so on.
Again, in one's particular database I can well see people choosing one or
the other. For the SNAP DM I have now chosen the more correct way.
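
The split described in point 4 can be sketched as follows. This is a minimal
Python illustration under stated assumptions, not the SNAP schema: the class
names follow the model, but the fields (datatype, description), the methods
declare() and set(), and the example parameter names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class InputParameter:
    # Defined once, on the protocol.
    name: str
    datatype: str          # hypothetical field
    description: str = ""  # hypothetical field


@dataclass
class Protocol:
    name: str
    parameters: Dict[str, InputParameter] = field(default_factory=dict)

    def declare(self, parameter: InputParameter) -> None:
        self.parameters[parameter.name] = parameter


@dataclass
class ParameterSetting:
    # Refers to a parameter by name instead of redefining it.
    parameter_name: str
    value: str


@dataclass
class Simulation:
    protocol: Protocol
    settings: List[ParameterSetting] = field(default_factory=list)

    def set(self, name: str, value: str) -> None:
        # Only parameters declared on the protocol may be given a value.
        if name not in self.protocol.parameters:
            raise KeyError(f"{name!r} is not declared on {self.protocol.name!r}")
        self.settings.append(ParameterSetting(name, value))


# The parameter list is written once ...
protocol = Protocol("example N-body code")
protocol.declare(InputParameter("box_size", "float", "comoving box size"))
protocol.declare(InputParameter("n_particles", "int"))

# ... and each run of a parameter study stores only its values.
runs = []
for box_size in ("50", "100", "200"):
    run = Simulation(protocol)
    run.set("box_size", box_size)
    run.set("n_particles", "1000000")
    runs.append(run)
```

With the merged ParameterType, each of the three runs would repeat the full
definition of every parameter; here they share the two definitions held by
the protocol and carry only name and value pairs.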

5. You do not have TargetObjectType, TargetProcess, Algorithm, Physics,
SNAPWebService. These were all introduced explicitly to support discovery
(first 4) and execution (SNAPWebService) in the SNAP protocol. 


All in all it seems though that the models are pretty compatible, with the
SNAP model being more general and comprehensive, as one should expect for a
model that needs general application. 
For now I see your model as an alternative representation of (a subset of)
the information in the full model, that can have its particular application
area. In that respect it would resemble other such models, for example those from
Patrizia Manzato and from the Horizon team (see the links in the "Existing
data models..." paragraph in
http://www.ivoa.net/twiki/bin/view/IVOA/IVOATheorySimulationDatamodel )


________________________________________
From: owner-theory at eso.org [mailto:owner-theory at eso.org] On Behalf Of Rick
Wagner
Sent: Tuesday, February 12, 2008 2:08 AM
To: theory at ivoa.net
Subject: XML Schema for the Simulation Data Model

Hi,

After working to understand the current SNAP Data Model (in particular the
current proposed XML Schema), I decided to distill it into a single document
with fewer types. I've had some success, so I've posted the Schema and a
sample instance document on the Twiki attached to the
IVOATheorySimulationDatamodel page:

Schema
http://www.ivoa.net/internal/IVOA/IVOATheorySimulationDatamodel/Simulation.xsd

Sample Instance
http://www.ivoa.net/internal/IVOA/IVOATheorySimulationDatamodel/SimulationInstance.xml

At the bottom of the page there are links to screenshots of the elements and
data types, which help to show their relations.

This schema keeps the same method of characterization as the SNAP model, by
defining the axes up front (or, at the top), but is less abstract. It treats
a simulation and its data as the results of running a program with defined
input parameters, and does not try to describe everything about the method and
numerical representation. To me, these are things defined by the program
(the software), and could be handled by defining separate VOResource for
"Program" or "Software Project".

If this work looks interesting to anyone, I would be glad to write up a
fuller description, and even put some documentation in the Schema and
instance documents.

I plagiarized heavily from both the SNAP Data Model and the Spectral
Schema, so any credit should go to Gerard and the DAL, Data Modeling group,
and annoyed comments sent my way.

--Rick

-------------------------------------------------------------------------
Rick Wagner, Graduate Student Researcher
UCSD Physics
9500 Gilman Drive
La Jolla, CA 92093-0424
Email: rwagner at physics.ucsd.edu
WWW: http://lca.ucsd.edu/projects/rpwagner
(858) 822-4784 Phone
-------------------------------------------------------------------------
Measuring programming progress by lines of code is
like measuring aircraft building progress by weight.
--Bill Gates
-------------------------------------------------------------------------



-------------- next part --------------
A non-text attachment was scrubbed...
Name: RickWagner.jpg
Type: image/jpeg
Size: 208485 bytes
Desc: not available
URL: <http://www.ivoa.net/pipermail/theory/attachments/20080212/43c71f28/attachment-0004.jpg>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: SNAP__postprocessing.jpg
Type: image/jpeg
Size: 76092 bytes
Desc: not available
URL: <http://www.ivoa.net/pipermail/theory/attachments/20080212/43c71f28/attachment-0005.jpg>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: SNAP__simulation.jpg
Type: image/jpeg
Size: 65149 bytes
Desc: not available
URL: <http://www.ivoa.net/pipermail/theory/attachments/20080212/43c71f28/attachment-0006.jpg>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: SNAPDataModel.jpg
Type: image/jpeg
Size: 491340 bytes
Desc: not available
URL: <http://www.ivoa.net/pipermail/theory/attachments/20080212/43c71f28/attachment-0007.jpg>

