Comments on the Simulation Data Model
Rick Wagner
rwagner at physics.ucsd.edu
Thu Mar 12 02:11:35 PDT 2009
Hello All,
After working for a while on a SimDAP Note with Claudio, and
developing both a SimDAP service and a simulation catalog using the
current data model, I have some suggestions. These come from coding
all ~43 classes in a relatively simple ORM, and from the fact that
SimDAP needs the data model to adequately describe the data returned.
I think these will simplify the model a little, without losing any of
the information contained in it. Also, these changes may open up the
possibility of describing more data and protocols with the model.
Here they are, in a semi-dependent order:
1) Remove the Simulator, PostProcessing, and ClusterFinder classes.
All these classes provide is a very limited taxonomy. Instead, add a
"Class" or "Type" attribute to the Protocol class. This attribute can
be an enumeration, like the RepresentationObject, e.g., "simulator",
"initial conditions generator", "cluster finder", "custom", etc. The
collection of Physics instances can be brought up to the Protocol
level, since many Protocols model physical processes, not just
simulators.
2) Similarly, remove the Simulation, PostProcessing, and
ClusterDetection classes. The type of these experiments is defined by
the type of protocol they are created with. Again, the AppliedPhysics
collection can be moved up to the Experiment class, along with the
reference to the protocol, and the execution time.
3) Remove ExperimentRepresentationObject and ExperimentProperty. I've
brought these up before, and I still think they are being used to
represent a linking table that doesn't need to be explicitly
declared. There are 1..* references from the Experiment to a
representation's properties; from there the representation can be found.
4) CompositeProtocol and CompositeExperiment could go (and therefore
ChildProtocol and ChildExperiment). While I can see a use case for
defining a CompositeProtocol for running an experiment, I'm not sure
it's necessary for describing one. And, it gets confusing that
CompositeProtocol can define its own parameters and representations,
and so can the ChildProtocols. This makes it unclear where to define
these things. Likewise for CompositeExperiment. And, the Project
class serves as another mechanism to aggregate experiments.
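
To make suggestions 1-3 concrete, here is a rough sketch in Python of
what the flattened classes might look like. It is only an illustration,
not the ORM code mentioned above: the attribute names, the "Property"
class, the string-valued execution time, and the "representations"
helper are all assumptions of mine, and the enumeration values are just
the examples listed in suggestion 1.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class ProtocolType(Enum):
    """Replaces the Simulator/PostProcessing/ClusterFinder subclasses."""
    SIMULATOR = "simulator"
    INITIAL_CONDITIONS_GENERATOR = "initial conditions generator"
    CLUSTER_FINDER = "cluster finder"
    CUSTOM = "custom"


@dataclass
class Physics:
    name: str


@dataclass
class Protocol:
    name: str
    type: ProtocolType
    # Physics now lives at the Protocol level, not only on simulators.
    physics: List[Physics] = field(default_factory=list)


@dataclass
class RepresentationObject:
    name: str


@dataclass
class Property:
    # Hypothetical class: each property knows its representation.
    name: str
    representation: RepresentationObject


@dataclass
class Experiment:
    name: str
    protocol: Protocol
    execution_time: str  # assumption: stored as an ISO 8601 string
    applied_physics: List[Physics] = field(default_factory=list)
    # Direct 1..* references to properties; no explicit linking class.
    properties: List[Property] = field(default_factory=list)

    @property
    def type(self) -> ProtocolType:
        """The experiment's type is whatever its protocol's type is."""
        return self.protocol.type

    def representations(self) -> List[RepresentationObject]:
        """Recover the representations by walking the properties."""
        found: List[RepresentationObject] = []
        for prop in self.properties:
            if prop.representation not in found:
                found.append(prop.representation)
        return found
```

With this layout, an Experiment created with a "cluster finder"
Protocol is a cluster detection run without needing its own subclass,
and the representations come from the property references rather than
from ExperimentRepresentationObject.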
My final comment is a suggestion for the contents of the Note on the
data model:
1) An overview of the model, including the packages and major classes
(Experiment, Protocol, Snapshots, etc.).
2) A discussion of how characterization is treated in the model.
3) XML Schemas for the top-level container classes (Protocol,
Experiment, etc.).
4) VOTable serializations of the container classes. (This is of
particular interest to me, since I've been working on that as part of
the SimDAP Note.)
5) The table and catalog definitions can go in as a reference for anyone
building a VODataService using the model.
Items 3, 4, and 5 can go in as appendices, and the automatically
generated documentation can either be an appendix or stay as a
stand-alone document. Once TAP is sorted out, writing the SimDB standard should
mostly consist of writing an XML Schema that subclasses the TAP
registry schema, and providing whatever description of the database
schema is required for TAP.
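
As a rough idea of what item 4 could mean in practice, the sketch below
writes a couple of Protocol records as a bare-bones VOTable using only
the Python standard library. The column names ("name", "type"), the
table name, and the version attribute are assumptions for illustration;
the XML namespace and any UCD/utype annotations a real serialization
would carry are left out.

```python
import xml.etree.ElementTree as ET

# Hypothetical columns for a Protocol table; both are variable-length strings.
COLUMNS = ("name", "type")


def protocols_to_votable(rows):
    """Serialize a list of dicts (one per Protocol) as a minimal VOTable."""
    votable = ET.Element("VOTABLE", version="1.1")
    resource = ET.SubElement(votable, "RESOURCE")
    table = ET.SubElement(resource, "TABLE", name="protocol")

    # One FIELD element per column, declared before the data.
    for col in COLUMNS:
        ET.SubElement(table, "FIELD", name=col, datatype="char", arraysize="*")

    # One TR per record, one TD per column, in FIELD order.
    tabledata = ET.SubElement(ET.SubElement(table, "DATA"), "TABLEDATA")
    for row in rows:
        tr = ET.SubElement(tabledata, "TR")
        for col in COLUMNS:
            ET.SubElement(tr, "TD").text = row[col]

    return ET.tostring(votable, encoding="unicode")


example = protocols_to_votable([
    {"name": "example code A", "type": "simulator"},
    {"name": "example code B", "type": "cluster finder"},
])
```

The same pattern would apply to Experiment and the other container
classes, with one TABLE per class.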
Sincerely,
Rick
-------------------------------------------------------------------------
Rick Wagner, Graduate Student Researcher
UCSD Physics
9500 Gilman Drive
La Jolla, CA 92093-0424
Email: rwagner at physics.ucsd.edu
WWW: http://lca.ucsd.edu/projects/rpwagner
(858) 246-0745 Phone
-------------------------------------------------------------------------
I know just enough vi to get Emacs installed.
--Rick Wagner