XML Schema for the Simulation Data Model

Fri Feb 15 12:08:27 PST 2008

Hi Gerard,

(To the other readers, this is much shorter, and only addresses a  
couple points.)

Thanks to your previous comments, I have a much better understanding  
of what is implied in the design of the SNAP data model as it is now.
And, yes, the repeated use of the term "Object" got very misleading.  
ObjectType could be renamed to CharacterizableType, but putting your  
previous explanation in the documentation is probably sufficient for  
the newcomer.

> There is likely a place for some kind of characterization in the target
> object type. For example, I may be interested in simulations that create a
> cluster of about 1e14 M_solar. This was noted by Laurent and Herve as well
> and is currently not (yet) included in the model.

This is a good use case, since it applies to the work of many of us  
involved. And it gives me a chance to use an example to clarify  
whether or not I understand using properties for characterization.

If I am doing a cosmology simulation, I would define a TargetObject  
of "galaxy clusters", with properties that give a rough bin count of  
the number of clusters found:

<TargetObject>
   <astroJournalSubject>clusters: galaxies</astroJournalSubject>
   <property ID="n_cluster_gt_1E14">
     <name>Cluster count gt 1E14 Msolar</name>
     <datatype>int</datatype>
     <Cardinality>0..1</Cardinality>
   </property>
</TargetObject>

and then, in a Snapshot, I could reference this property by its ID:

<snapshot>
   <objectCollection>
     <characterization property="n_cluster_gt_1E14">
       <nominalValue>100</nominalValue>
     </characterization>
   </objectCollection>
</snapshot>

I hope this is what is intended, because otherwise I'm really missing  
something.

Also, I tend to think that the simulation (or SNAPExperiment) should
be allowed to have characterization elements, because they could provide a
summary of the experiment to be returned in a search, like the min  
and max values of this cluster count. But, as you've noticed, I like  
things normalized.
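
As a rough sketch of what I mean (assuming SNAPExperiment could carry the
same characterization element as objectCollection; the minValue and
maxValue element names and the numbers are invented for illustration and
are not in the current model):

<SNAPExperiment>
   <!-- hypothetical: summary over all snapshots of this experiment -->
   <characterization property="n_cluster_gt_1E14">
     <minValue>0</minValue>     <!-- assumed element name -->
     <maxValue>100</maxValue>   <!-- assumed element name -->
   </characterization>
</SNAPExperiment>

A search could then match on the experiment-level range without opening
every snapshot.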

> Ok. The protocol can/must still be worked out in more detail. I know Franck
> and coworkers were thinking about this as well, I think for the case of
> registering simulation and related codes. For the use-case of discovery we
> may need not everything that could be said about them, but it can not hurt
> to analyse this part of the model somewhat further.

<snip>

>> Another comment on Program vs. Protocol and Simulator: Protocol and
>> Simulator just seemed more abstract than necessary, and used new
>> terms to describe things where old ones were fine (software, program,
>> code, etc.). The idea of an executable with input and output just
>> gets totally lost in the SNAP model. Now, I will admit that the idea
>> of documenting protocols is very attractive, but perhaps that's
>> something for SNAP 2.0.
>>
> I borrowed the term protocol from an early model
> http://www.ivoa.net/internal/IVOA/IvoaDataModel/DomainModelv0.9.1.doc
> which itself was inspired by a similar construct in a book on "Analysis
> patterns". It can be used also to describe how one can do a telescope
> observation, or calibration. SNAPProtocol-s will always include some kind of
> Program, but that is not the only thing we need to describe it. The various
> protocols are an attempt at classifying the different types of codes and
> their meaning and we need to decide how far we want to go.
> A somewhat more refined classification allows one to ask questions like
> "give me all the group finders you know of in the SNAP registry".
> Also, implicitly it allows one to classify the different types of
> experiments (and their results etc), simply by the type of protocol they
> refer to.

From our (my research group's) perspective, we just tend to think in
terms of software and the algorithms it implements, and I think
this is common in computational science. To make this richer, and to
enable searches for just group finders, or just n-body codes,
Algorithms could be classified along the lines of "group finder" or
"MHD". I do not know if this is sufficient to capture all that is
intended in the SNAPProtocol, but it may be a start.
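
For example, something along these lines (purely illustrative; the
program, algorithm and classification element names and the code names
are made up, not taken from the SNAP model):

<program>
   <name>MyGroupFinder</name>        <!-- hypothetical code name -->
   <algorithm>
     <classification>group finder</classification>
   </algorithm>
</program>
<program>
   <name>MyMHDCode</name>            <!-- hypothetical code name -->
   <algorithm>
     <classification>MHD</classification>
   </algorithm>
</program>

A registry query for all group finders would then only need to match on
the classification value.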

--Rick