Observation data model comments

Anita Richards amsr at jb.man.ac.uk
Tue May 11 01:36:22 PDT 2004


>
> The [R] ObservationGroup in Anita's model corresponds to our [V0.2] tree
> model of an archive.

Actually that is changed now (see latest ObsDMamsr_0.11eg.png at
http://wiki.astrogrid.org/bin/view/Astrogrid/ObservationDataModelRevision
) - after discussion with Francois and Mireille we decided to remove
ObservationGroup and just allow Observation to be recursive.  It isn't the
same as an archive; it can involve extra or different processing.

> The [V0.2] MeasuredData object corresponds to what I have normally
> called a source catalog - the decomposition of the observation into
> distinct pieces (`sources') described by parameters. This is linked to

That's OK, although I think general names are better (I am well trained by
the CMB people not to assume that the only interesting measurements are
discrete objects with defined point-like positions).

> an Observation, but I don't see it as part of the Observation object.
> However, by stretching our concept of Observation, we could think
> of it as a separate Observation object (perhaps better, we could define
> a more generic Dataset object which covers both as special cases), in
> which the MeasuredData is the [V0.2] ObsData and the [R] AnalysisMethod is the
> [V0.2] Provenance (processing). To say this another way: if we ask
> `where did this data product D1 come from?' the answer may be 'I ran this
> processing on another data product D2' or `I made this observation of the
> sky and put the raw result in D1' or `I made this obs of the sky
> and put raw stuff in D2, and then ran processing on D2 to make D1'.
>
> Now in the context of our data model describing observational metadata,
> maybe it doesn't matter if D1 is an image, or a catalog of derived
> properties. In either case you have a description of the data product
> (ObsData, MeasuredData) and a description of how you got it (AnalysisMethod,
> Processing, ObservingProtocol, or whatever). So I'm not convinced that
> AnalysisMethod should be a separate class from Processing. I think it's
> already covered in the diagram where you have another Observation
> coming in with the `composition' box.

I think you may be right from a logically pure point of view, but from
looking at real data providers who want to publish their archives, it
seems to be the case that
a) ObsData can indeed be raw, but most VO users would prefer something
with instrumental characteristics removed, e.g. a calibrated spectrum or a
CLEANed radio image;
b) The same observation can produce a range of ObsData depending on the
Processing version (or however we handle that);
c) In addition, you can measure the C_l's of a CMB power spectrum or the
positions of sources in an image in many different ways, including using VO
tools.  In theory you could have just one class for each of ObsData and
Processing, but in practice one particular ObsData+Processing version can
lead to a whole family of MeasuredQuantities with different
AnalysisMethods, and some of these may involve cross-referencing other
catalogues (e.g. the 1XMM crossmatched catalogue).  I just think that it
is more functional to make a distinction.
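The fan-out argued for here (one ObsData+Processing combination feeding a
whole family of MeasuredQuantities, each with its own AnalysisMethod, and
Observation recursing in place of ObservationGroup) might be sketched
roughly as below.  All field names and the exact shape of the classes are
illustrative only, not the actual model:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class AnalysisMethod:
    # e.g. "source extraction with SExtractor", "power-spectrum estimator X"
    description: str

@dataclass
class MeasuredQuantity:
    name: str                  # e.g. "source positions" or "C_l"
    method: AnalysisMethod     # distinct from the Processing that made the ObsData

@dataclass
class Processing:
    version: str               # e.g. "pipeline v2.1"

@dataclass
class ObsData:
    processing: Processing
    # one ObsData+Processing can fan out into many MeasuredQuantities
    measured: List[MeasuredQuantity] = field(default_factory=list)

@dataclass
class Observation:
    obsdata: List[ObsData] = field(default_factory=list)
    # recursion replaces the old ObservationGroup
    components: List["Observation"] = field(default_factory=list)
```

The point of keeping AnalysisMethod separate from Processing is visible in
the structure: two MeasuredQuantity entries can share the same parent
ObsData (same Processing version) while carrying different methods.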

However, Francois and Mireille were less happy with MeasuredQuantities and
AnalysisMethod hanging off ObsData, preferring to relate everything back
to Observation.  I don't understand the principles of modelling well
enough to see why, and I would like to see how that could be applied to
real data.

> I like the Community model idea; we should develop this.
Guy Rixon is one of the experts on that.

> Within the main Observation object:
>
> [R] Project looks mostly fine, but should it be within Provenance?
> And you have omitted [V0.2] Curation, which I think we will want
> in the data supplied to the user. I would extend Project to include
> the rest of Curation.

Project is a sub-class of Provenance as in general the relevant
information is associated with a particular data provider.  For Curation,
this time I think that should relate to the central Observation class so
that the Observation model can exchange information with the Registry
model, which has very well defined (and agreed!) Curation classes.  I am
not sure in what sense you mean curation; I mean it in the sense of how
data which are accessible to the VO are stored and who is responsible for
publishing them.  There is the other sense of internal observatory
(etc.) storage, which will appear in individual data models for ATCA,
LEDAS... but that isn't the VO's problem...

> [R] Location Uncertainty: I don't think this should be with Location.
> The Location is a vague nominal position. You want an uncertainty
> for the detailed individual positions in the data. The ObsData
> provides this via Quantity.
Two different sorts of uncertainty.  You are talking about the individual,
SNR-related (or whatever), source-dependent or line-strength-dependent
uncertainty; let's call those noise errors.  I mean the absolute
or systematic uncertainty due to the error in e.g. the position of an
astrometric calibration source and the phase transfer over the separation
between reference and target (or similar for photometric calibration
uncertainty etc.).  That applies to an entire field of view (or
equivalent), and I think it is relevant here.

>
> [R] Mapping: in [V0.2] the ObsData has all the information about
> any pixelization, and the Characterization is exterior to this
> and entirely in terms of world coordinates (it has axes that
> aren't in the data, remember). If you need to go from the
> Characterization to a pixel position you can ask the mappings
> in the ObsData, but I don't think there should be any mappings
> in the characterization.

I am lost... sorry, I don't understand that.  Mapping is in the table in
[V0.2], and I do think that, at least for convenience, it is nice to know
what coordinate system you are in, as that affects the size of a degree on
the x-axis (for example).

>    ========================================
>
> Comments on radio fluxes: the "Jy/beam" problem. I think this is really
> a modelling problem: "Jy/beam", "Jy/pixel" and "Jy/sq arcsec" are not
> the same thing (UCD) with different units, they are different things
> (which is obvious if you allow a variable beam size). So we should not
> expect to handle the distinction with simple unit conversions.

For VO internal use, e.g. in the Registry, Jy/arcsec^2 as units of, for
example, limiting flux is standard for data which can be used to produce
images (and in fact covers any beam size you like from visibility data).
It may not be exactly accurate, but then, as you say, we want values for
guidance at this point.

Any one data product which is in Jy/beam has a particular beam size and
pixel size even though a range of these are available from the parent
visibility data.  If I want a radio-IR-optical-x-ray SED or a radio/x-ray
flux ratio, the input images or measured flux densities all have to be in
the same units.  This can be achieved by linear conversions (or a log
conversion in the case of magnitudes).  This may not be completely
accurate for a number of reasons, mainly x-ray energy-profile issues and
optical filter leaks, but near the centre of the FoV the Jy/beam to
Jy/arcsec^2 conversion is pretty linear.
At present e.g. AVO tools can do this sort of thing but can only carry out
conversions on selected optical magnitudes; the user has to supply their
own conversion factor for radio and x-ray, or the input images have to be
in Jy/arcsec^2 or Jy/pixel (some tools expect the former; some expect the
latter and read the pixel size from the FITS header).

However, radio astronomers expect their data to be in Jy/beam, so it is not
reasonable to demand that all data are published in Jy/arcsec^2.  This
means that the VO should be able to find the beam size in metadata and
convert where required.
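As a sketch of that metadata-driven conversion, assuming an elliptical
Gaussian restoring beam and the common convention of BMAJ/BMIN FITS
header keywords in degrees (function names here are mine, not any tool's
API):

```python
import math

def gaussian_beam_area_arcsec2(bmaj_deg, bmin_deg):
    """Solid angle of an elliptical Gaussian beam in arcsec^2.

    bmaj_deg/bmin_deg are the FWHM major/minor axes in degrees, as
    commonly written to radio FITS headers as BMAJ/BMIN; the factor
    pi/(4 ln 2) converts the FWHM ellipse to the integrated Gaussian
    solid angle.
    """
    bmaj = bmaj_deg * 3600.0  # degrees -> arcsec
    bmin = bmin_deg * 3600.0
    return math.pi * bmaj * bmin / (4.0 * math.log(2.0))

def jy_per_beam_to_jy_per_arcsec2(value_jy_beam, bmaj_deg, bmin_deg):
    """Convert a surface brightness from Jy/beam to Jy/arcsec^2."""
    return value_jy_beam / gaussian_beam_area_arcsec2(bmaj_deg, bmin_deg)
```

So given the beam size from the metadata, the conversion the VO would
need to apply is a single multiplicative factor per image.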

> In the Quantity model, there is a place to put the UCD of the
> observable. We need to have a UCD for 'surface brightness' and a UCD for
> 'flux per resolution element'. Then, software can exist to recognize
> that the conversion between these UCDs requires data with a UCD of
> 'spatial resolution', and that this can be found in the data model as
> part of the characterization. So my key conclusion is that the data must
> be marked (by the data provider or by the data model ingestion software)
> with UCDs or something equivalent to allow conversions of this kind.

OK, maybe that would do it; I leave it to the experts... actually I am
most anxious to see something similar developed for x-ray data, since it is
more complicated (and I can't do it in my sleep like I can for radio :)
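The quoted proposal could be sketched as a lookup keyed on the UCD pair,
where each conversion declares the extra quantity it needs from the
characterization.  The UCD strings below are entirely hypothetical
placeholders, not real IVOA UCDs:

```python
# Hypothetical UCD-keyed conversion table: each rule names the extra
# quantity (here the spatial resolution, i.e. beam solid angle) that
# must be fetched from the characterization metadata before converting.
CONVERSIONS = {
    ("surface.brightness", "flux.per.beam"): {
        "requires": "spatial.resolution",
        "apply": lambda value, beam_area: value * beam_area,
    },
    ("flux.per.beam", "surface.brightness"): {
        "requires": "spatial.resolution",
        "apply": lambda value, beam_area: value / beam_area,
    },
}

def convert(value, ucd_from, ucd_to, characterization):
    """Convert a UCD-tagged value, pulling the required extra quantity
    (e.g. beam area in arcsec^2) out of the characterization mapping."""
    rule = CONVERSIONS[(ucd_from, ucd_to)]
    extra = characterization[rule["requires"]]
    return rule["apply"](value, extra)
```

The software never hard-codes "Jy/beam": it only needs the data provider
to have marked the values and the resolution with the right UCDs.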

Thanks very much for the comments, Jonathan,

Anita

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Dr. Anita M. S. Richards, AVO Astronomer
MERLIN/VLBI National Facility, University of Manchester,
Jodrell Bank Observatory, Macclesfield, Cheshire SK11 9DL, U.K.
tel +44 (0)1477 572683 (direct); 571321 (switchboard); 571618 (fax).



