[Observation] relation to Dataset
Douglas Tody
dtody at nrao.edu
Fri Nov 22 09:16:09 PST 2013
Hi Arnold -
This is all true enough, although one could argue that some data
products resulting from analysis combining multiple other data products
could be considered a form of "software observation". But the real
reason we stretched the concept a bit in ObsTAP was merely to be able to
provide a single uniform index (the ObsTAP index) for science data
products in an archive.
I agree that the observation-dataset modeling needs to be more
comprehensive; my guess is that this can be done in a relational fashion
by adding one or more additional models/tables to hold the additional
metadata and relationships. The relational model can easily represent
the required many-to-many relationship.
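
As a rough sketch of that relational approach (not a proposal for the
actual model), the many-to-many relation could be held in a separate link
table alongside the index. All table and column names below are
hypothetical and chosen only for illustration; obs_id is borrowed from the
discussion further down, and SQLite via Python is used purely for
convenience.

```python
import sqlite3

# Hypothetical schema: names are illustrative, not taken from any IVOA model.
SCHEMA = """
CREATE TABLE observation (
    obs_id TEXT PRIMARY KEY
);
CREATE TABLE dataset (
    dataset_id       TEXT PRIMARY KEY,
    dataproduct_type TEXT
);
-- Link table: one row per (dataset, observation) association.
CREATE TABLE dataset_observation (
    dataset_id TEXT NOT NULL REFERENCES dataset(dataset_id),
    obs_id     TEXT NOT NULL REFERENCES observation(obs_id),
    PRIMARY KEY (dataset_id, obs_id)
);
"""


def build_example():
    """Create an in-memory database with a few made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO observation VALUES (?)",
                     [("obs-A",), ("obs-B",)])
    conn.executemany("INSERT INTO dataset VALUES (?, ?)",
                     [("raw-A", "image"),      # product of one observation
                      ("stack-AB", "image"),   # combines two observations
                      ("sim-1", "cube")])      # simulation, no observation
    conn.executemany("INSERT INTO dataset_observation VALUES (?, ?)",
                     [("raw-A", "obs-A"),
                      ("stack-AB", "obs-A"),
                      ("stack-AB", "obs-B")])
    conn.commit()
    return conn


def observations_for(conn, dataset_id):
    """Return the obs_ids associated with one dataset (possibly none)."""
    rows = conn.execute(
        "SELECT obs_id FROM dataset_observation "
        "WHERE dataset_id = ? ORDER BY obs_id", (dataset_id,))
    return [r[0] for r in rows]


def datasets_for(conn, obs_id):
    """Return the dataset_ids associated with one observation."""
    rows = conn.execute(
        "SELECT dataset_id FROM dataset_observation "
        "WHERE obs_id = ? ORDER BY dataset_id", (obs_id,))
    return [r[0] for r in rows]
```

With these made-up rows, the stacked product maps to two observations, the
simulated product maps to none, and the dataset table keeps one row per
data product, so it can still serve as a single uniform index.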
- Doug
On Fri, 22 Nov 2013, Arnold Rots wrote:
> I strongly object to this statement:
>
> "the data product may be the result of combining data from multiple
> primary (physical) observations. In this case the resulting data product
> is a new processed "observation" to which a new unique observation
> identifier should be assigned."
>
> We really need to distinguish clearly between Datasets and Observations.
> An Observation represents an operation that is characterized by a
> configuration - instrument characteristics, coordinate volume and
> properties, calibration, etc.
> A Dataset is a container of bytes that may have resulted from an
> Observation (the byte stream that came out of the telescope or various
> direct processing products of it), a simulation, or the processing and
> analysis of (possibly a subset of) one or more parent Datasets.
> Each Dataset also carries metadata detailing coordinate characteristics,
> the nature of the Dataset and its components, and its provenance
> regarding its parents.
>
> Blurring the line between Observations and Datasets and carelessly
> forcing one to assume the characteristics of the other is going to get
> us into major trouble.
>
> Cheers,
>
> - Arnold
>
> --------------------------------------------------------------------------------
> Arnold H. Rots                          Chandra X-ray Science Center
> Smithsonian Astrophysical Observatory   tel: +1 617 496 7701
> 60 Garden Street, MS 67                 fax: +1 617 495 7356
> Cambridge, MA 02138                     arots at cfa.harvard.edu
> USA                                     http://hea-www.harvard.edu/~arots/
> --------------------------------------------------------------------------------
>
>
>
> On Thu, Nov 21, 2013 at 6:00 PM, CresitelloDittmar, Mark
> <mdittmar at cfa.harvard.edu> wrote:
> All,
>
> I've been thinking about this and some comments Arnold made on the
> Provenance thread which are closely related.
> 1) there is general agreement that Observation *has* 0 or more
> Datasets (rather than *is* a Dataset)
>
> 2) Dataset can exist without an Observation (can be created by
> something else).
>
> 3) The definition of Observation is pretty fuzzy, but let's assume
> that there could be an "Analysis" or "Simulation" step which could
> create a Dataset. These may be parts of the larger domain that all
> these objects live in, but are not modeled. Currently, the ObsCore
> model does say (pg 19) "the data product may be the result of
> combining data from multiple primary (physical) observations. In
> this case the resulting data product is a new processed
> "observation" to which a new unique observation identifier should
> be assigned."
> So the relation of Dataset to 'the thing which created it', is not
> clear to me yet. I keep going back to the 'Experiment' concept in
> Gerard's mail (provenance thread).
>
> I don't think that a Dataset should have a bi-directional relation
> to the full Observation(s) as I noted at the head of this thread,
> but should
> a) have an association back to components of the Observation
> (ObsConfig, Proposal) which become part of the Dataset 'provenance'
> (which is what I think Arnold was saying in the other thread).
> b) have metadata identifying the relevant Observation(s)
> comprising the Dataset (DataID.ObservationID), as Francois notes.
> But this gets tricky because ObsCore expects a singular (well,
> unique) obs_id for each Dataset.
> c) if the Dataset were created by something else, then it would
> add associations to components of those things holding the relevant
> information to fold into the 'provenance', like the progenitor
> Datasets.
>
>
>
>
> On Fri, Nov 15, 2013 at 9:59 AM, Arnold Rots
> <arots at cfa.harvard.edu> wrote:
> If multiple observations have to be taken care of through provenance,
> then why should a single observation not be handled the same way?
> Don't get me wrong: I think neither should be handled through
> provenance.
>
> Examples are: VLA multi-configuration images; stacked images;
> multi-observation event files.
>
> It is much clearer and more intuitive if we just simply allow a
> Dataset to be associated with multiple Observations.
> Actually, I think this is absolutely a requirement.
>
> - Arnold
>
>
>
>
> On Thu, Nov 14, 2013 at 6:29 PM, Douglas Tody
> <dtody at nrao.edu> wrote:
> On Thu, 14 Nov 2013, Arnold Rots wrote:
>
> > From this description I am beginning to suspect that a Dataset can
> > be derived from (associated with) no more than one Observation.
> > That seems utterly wrong; multiple Observations can be combined into
> > a single Dataset.
> > Or did I misunderstand?
>
>
> Multiple Observations can be and often are combined to produce a new
> Dataset; however, describing that history would likely be the
> responsibility of the Provenance model. At the level of Observation it
> would probably be a new "Observation" (or at least Dataset). Depends
> upon how strict we are with the concept of Observation. The
> CreationType and calibration level say something about it being a
> synthesized/derived data product.
>
> > I think it is OK to require that a Dataset is associated with at
> > least one Observation, provided that a model or simulation can be
> > described as an Observation.
>
>
> In practice that is what we are doing, to keep things simple;
> DataSource can be something like "theory".
>
> - Doug
>
> Cheers,
>
> - Arnold
>
>
>
>
> On Thu, Nov 14, 2013 at 12:08 PM, CresitelloDittmar, Mark
> <mdittmar at cfa.harvard.edu> wrote:
>
> All,
> This thread is for discussion on the relation between Observation and
> Dataset.
>
> ref: ObsCoreDM -
> http://www.ivoa.net/documents/ObsCore/20111028/index.html
> ref: diagram illustrating relation of Image/Spectral Observation to
> ObsCoreDM (draft)
> http://www.ivoa.net/pipermail/dm/attachments/20131113/c9ef7581/attachment-0001.png
>
> motivation
> It is clear that there is a relationship between "Observation" and a
> more generic "Dataset". This "Dataset" would contain elements such as
> the dataProductType, and dataProductSubtype, presumably others. This
> object has not been formally defined.
>
> In ObsCore, there is an implied relationship for Observation as an
> Extension of Dataset in the location of these attributes. So, I have
> always interpreted that Observation "is" a Dataset. This is reflected
> in my choice of the name "ObservationDataset" in the left hand package
> of my diagram. It implies that it is a Dataset extended for
> Observation purposes.
>
> Recent discussion brings this relationship into question, with
> assertions that an Observation can be associated with 0 or more
> Datasets.
>
> This has real ramifications for the Image and Spectral models.
>
> Seed:
>
> If the relation is Observation "has" 0..* Dataset, then all the
> diagrams to date are wrong.
> It feels like this would be a fundamental change to all these models.
>
> - there would need to be a bi-directional relation between Observation
> and Dataset
> (Observation has 0..* Dataset; Dataset associated with 1 Observation)
> Hmm.. since there can be Datasets not associated with Observations,
> this would need to be a specialization of Dataset..
> (ObservationDataset.. but not the one in my diag.)
>
> - the Char associated with Observation would characterize the total
> space of all included Datasets.
> (0..1) relation to Observation. If no Datasets, no Char
>
> - each Dataset would require its own Characterisation, specific to its
> space.
> (so there is another attribute for Dataset).
>
> - we would need to specify which of the elements are associated to the
> Dataset, and which to the Observation. e.g. DataModel => Dataset;
> Target => Observation
>
> Thoughts?
> Mark