Comments on the Simulation Data Model

Rick Wagner rwagner at physics.ucsd.edu
Thu Mar 12 02:11:35 PDT 2009


Hello All,

After working for a while on a SimDAP Note with Claudio, and
developing both a SimDAP service and a simulation catalog using the
current data model, I have some suggestions. These come after coding
all ~43 classes in a relatively simple ORM, and from the fact that
SimDAP needs the data model to adequately describe the data returned.

I think these will simplify the model a little, without losing any of  
the information contained in it. Also, these changes may open up the  
possibility of describing more data and protocols with the model.

Here they are, in a semi-dependent order:

1) Remove the Simulator, PostProcessing, and ClusterFinder classes.  
All these classes provide is a very limited taxonomy. Instead, add a  
"Class" or "Type" attribute to the Protocol class. This attribute can  
be an enumeration, like the RepresentationObject, e.g., "simulator",  
"initial conditions generator", "cluster finder", "custom", etc. The  
collection of Physics instances can be brought up to the Protocol  
level, since many Protocols model physical processes, not just  
simulators.

2) Similarly, remove the Simulation, PostProcessing, and
ClusterDetection classes. The type of each of these experiments is
defined by the type of protocol it is created with. Again, the
AppliedPhysics collection can be moved up to the Experiment class,
along with the reference to the protocol and the execution time.

3) Remove ExperimentRepresentationObject and ExperimentProperty. I've  
brought these up before, and I still think they are being used to  
represent a linking table that doesn't need to be explicitly  
declared. There are 1..* references from the Experiment to a  
representation's properties; from there the representation can be found.

4) CompositeProtocol and CompositeExperiment could go (and therefore  
ChildProtocol and ChildExperiment). While I can see a use case for  
defining a CompositeProtocol for running an experiment, I'm not sure  
it's necessary for describing one. And, it gets confusing that  
CompositeProtocol can define its own parameters and representations,  
and so can the ChildProtocols. This makes it unclear where to define  
these things. Likewise for CompositeExperiment. And, the Project  
class serves as another mechanism to aggregate experiments.
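
To make suggestions 1 through 3 concrete, here is a rough sketch of how
the flattened classes might look. This is only an illustration under my
own assumptions, not the ORM code mentioned above: the attribute names
(type, physics, applied_physics, execution_time, properties), the
Property class, and the string used for the execution time are all
hypothetical, and the enumeration only lists the example values given
in suggestion 1.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class ProtocolType(Enum):
    # Example values from suggestion 1; the real enumeration may differ.
    SIMULATOR = "simulator"
    INITIAL_CONDITIONS_GENERATOR = "initial conditions generator"
    CLUSTER_FINDER = "cluster finder"
    CUSTOM = "custom"


@dataclass
class Physics:
    name: str


@dataclass
class Protocol:
    # Replaces the Simulator/PostProcessing/ClusterFinder subclasses
    # with a single class carrying a type attribute.
    name: str
    type: ProtocolType
    physics: List[Physics] = field(default_factory=list)


@dataclass
class AppliedPhysics:
    physics: Physics


@dataclass
class RepresentationObject:
    name: str


@dataclass
class Property:
    # Hypothetical: each property knows which representation owns it.
    name: str
    representation: RepresentationObject


@dataclass
class Experiment:
    # Holds the protocol reference, execution time, and applied physics
    # directly, instead of pushing them down into subclasses.
    name: str
    protocol: Protocol
    execution_time: str  # placeholder type, assumed for this sketch
    applied_physics: List[AppliedPhysics] = field(default_factory=list)
    properties: List[Property] = field(default_factory=list)

    @property
    def type(self) -> ProtocolType:
        # The experiment's type follows from its protocol (suggestion 2).
        return self.protocol.type

    def representations(self) -> List[RepresentationObject]:
        # Representations are reached through the referenced properties,
        # without an explicit linking class (suggestion 3).
        found: List[RepresentationObject] = []
        for prop in self.properties:
            if prop.representation not in found:
                found.append(prop.representation)
        return found
```

With this layout, an Experiment built from a Protocol whose type is
ProtocolType.SIMULATOR reports that same type, and its representations
are recovered from the properties it references.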

My final comment is a suggestion for the contents of the Note on the  
data model:

1) An overview of the model, including the packages and major classes  
(Experiment, Protocol, Snapshots, etc.).

2) A discussion of how characterization is treated in the model.

3) XML Schemas for the top-level container classes (Protocol,  
Experiment, etc.).

4) VOTable serializations of the container classes. (This is of  
particular interest to me, since I've been working on that as part of  
the SimDAP Note.)

5) The table and catalog definitions can go in as a reference for anyone  
building a VODataService using the model.

3, 4 and 5 can go in as appendices, and the automatically generated  
documentation can either be an appendix or stay as a stand-alone  
document. Once TAP is sorted out, writing the SimDB standard should  
mostly consist of writing an XML Schema that subclasses the TAP  
registry schema, and providing whatever description of the database  
schema is required for TAP.

Sincerely,
Rick

-------------------------------------------------------------------------
Rick Wagner, Graduate Student Researcher
UCSD Physics
9500 Gilman Drive
La Jolla, CA  92093-0424
Email:  rwagner at physics.ucsd.edu
WWW:    http://lca.ucsd.edu/projects/rpwagner
(858) 246-0745 Phone
-------------------------------------------------------------------------
I know just enough vi to get Emacs installed.

--Rick Wagner
-------------------------------------------------------------------------