Observation data model comments

Jonathan McDowell jcm at head.cfa.harvard.edu
Mon May 10 16:02:29 PDT 2004


Hi Anita,
 Here are my comments on your revised observation model.

I'll label the initial Observation V0.2 draft as
[V0.2] and Anita's draft as [R].

[Apologies for the typo in the earlier document: the file is called V0.2
but the document says V0.3. There is no difference. Oops.]

   ========================================
In the context of Anita's model I would like to distinguish between the
core (aggregated) parts of the V0.2 Observation model and things that
are mentioned but are peripheral to it.

For peripheral objects:

The [R] ObservationGroup in Anita's model corresponds to our [V0.2] tree
model of an archive.

The [V0.2] MeasuredData object corresponds to what I have normally
called a source catalog - the decomposition of the observation into
distinct pieces (`sources') described by parameters. This is linked to
an Observation, but I don't see it as part of the Observation object.
However, by stretching our concept of Observation, we could think
of it as a separate Observation object (perhaps better, we could define
a more generic Dataset object which covers both as special cases), in
which the MeasuredData is the [V0.2] ObsData and the [R] AnalysisMethod is the
[V0.2] Provenance (processing). To say this another way: if we ask
`where did this data product D1 come from?', the answer may be `I ran this
processing on another data product D2', or `I made this observation of the
sky and put the raw result in D1', or `I made this observation of the sky,
put the raw data in D2, and then ran processing on D2 to make D1'.

Now in the context of our data model describing observational metadata,
maybe it doesn't matter if D1 is an image, or a catalog of derived
properties. In either case you have a description of the data product
(ObsData, MeasuredData) and a description of how you got it (AnalysisMethod,
Processing, ObservingProtocol, or whatever). So I'm not convinced that 
AnalysisMethod should be a separate class from Processing. I think it's
already covered in the diagram where you have another Observation 
coming in with the `composition' box.
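
To make the idea concrete, here is a minimal Python sketch of that unified
view; the class and attribute names (Dataset, Provenance, and so on) are my
own placeholders, not proposed schema:

from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Provenance:
    """How a product came to be: an observation, a processing step, or both."""
    observing_protocol: Optional[str] = None   # e.g. instrument/telescope setup
    processing: Optional[str] = None           # e.g. pipeline or analysis method
    inputs: List["Dataset"] = field(default_factory=list)  # upstream products


@dataclass
class Dataset:
    """Generic product: an image (ObsData) or a source catalog (MeasuredData)."""
    name: str
    description: str
    provenance: Provenance = field(default_factory=Provenance)


# D2 is a raw observation of the sky; D1 is a catalog made by processing D2.
d2 = Dataset("D2", "raw image",
             provenance=Provenance(observing_protocol="pointed observation"))
d1 = Dataset("D1", "source catalog",
             provenance=Provenance(processing="source detection", inputs=[d2]))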

I like the Community model idea; we should develop this.



Within the main Observation object:

[R] Project looks mostly fine, but should it be within Provenance?
And you have omitted [V0.2] Curation, which I think we will want
in the data supplied to the user. I would extend Project to include
the rest of Curation.

[R] Location Uncertainty: I don't think this should be with Location.
The Location is a vague nominal position. You want an uncertainty
for the detailed individual positions in the data. The ObsData
provides this via Quantity.
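
As a rough illustration (names hypothetical): the nominal Location carries no
error, while each measured position in the data is a Quantity with its own
uncertainty:

from dataclasses import dataclass


@dataclass
class Location:
    ra: float                 # nominal pointing in degrees, no uncertainty
    dec: float


@dataclass
class Quantity:
    value: float
    error: float              # per-measurement uncertainty
    ucd: str = ""
    unit: str = "deg"


pointing = Location(ra=187.70, dec=12.39)                    # vague, nominal
measured_ra = Quantity(value=187.70593, error=2e-5, ucd="pos.eq.ra")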

[R] Mapping: in [V0.2] the ObsData has all the information about
any pixelization, and the Characterization is exterior to this
and entirely in terms of world coordinates (it has axes that
aren't in the data, remember). If you need to go from the
Characterization to a pixel position you can ask the mappings
in the ObsData, but I don't think there should be any mappings
in the characterization.
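
A sketch of how I picture this (again with made-up names): the ObsData owns
the world-to-pixel mapping, and the Characterization, expressed purely in
world coordinates, asks the ObsData whenever a pixel position is needed:

class ObsData:
    """Holds the pixelized data and the mapping between world and pixel coords."""
    def __init__(self, crval, crpix, cdelt):
        self.crval, self.crpix, self.cdelt = crval, crpix, cdelt  # toy linear WCS

    def world_to_pixel(self, world):
        return (world - self.crval) / self.cdelt + self.crpix


class Characterization:
    """Coverage, resolution, etc., expressed in world coordinates only."""
    def __init__(self, coverage_world):
        self.coverage_world = coverage_world     # no mapping stored here

    def coverage_in_pixels(self, obsdata):
        # delegate to the ObsData's mapping instead of holding one
        return [obsdata.world_to_pixel(w) for w in self.coverage_world]


obsdata = ObsData(crval=187.70, crpix=512.0, cdelt=-0.0001)
charac = Characterization(coverage_world=[187.65, 187.75])
print(charac.coverage_in_pixels(obsdata))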

   ========================================

Comments on radio fluxes: the "Jy/beam" problem. I think this is really
a modelling problem: "Jy/beam", "Jy/pixel" and "Jy/sq arcsec" are not
the same thing (UCD) with different units, they are different things
(which is obvious if you allow a variable beam size). So we should not
expect to handle the distinction with simple unit conversions.

In the Quantity model, there is a place to put the UCD of the observable.
We need to have a UCD for 'surface brightness' and a UCD for 'flux per resolution
element'. Then, software can exist to recognize that the conversion between
these UCDs requires data with a UCD of 'spatial resolution', and that this
can be found in the data model as part of the characterization.
So my key conclusion is that the data must be marked (by the data provider
or by the data model ingestion software) with UCDs or something equivalent
to allow conversions of this kind.
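
A sketch of what such software might look like; the UCD strings below are
placeholders for whatever vocabulary we actually register, and the conversion
is just division by the beam solid angle taken from the characterization:

def jy_per_beam_to_jy_per_arcsec2(value, beam_solid_angle_arcsec2):
    """Flux per resolution element -> surface brightness."""
    return value / beam_solid_angle_arcsec2


def convert_to_surface_brightness(value, ucd, characterization):
    # 'flux.per.resolution.element' and 'spatial.resolution' stand in for
    # the real UCDs we would need to define.
    if ucd == "flux.per.resolution.element":
        beam = characterization["spatial.resolution"]   # beam solid angle, arcsec^2
        return jy_per_beam_to_jy_per_arcsec2(value, beam)
    raise ValueError("no rule to convert UCD %r to surface brightness" % ucd)


# Example: a 2.5 Jy/beam value with a 1.2 arcsec^2 beam -> ~2.08 Jy/arcsec^2
charac = {"spatial.resolution": 1.2}
print(convert_to_surface_brightness(2.5, "flux.per.resolution.element", charac))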

OK, now I'm only about 24 hr behind on the emails :-)

 - Jonathan


