[Observation] relation to Dataset

Fri Nov 22 09:16:09 PST 2013

Hi Arnold -

This is all true enough, although one could argue that some data
products resulting from analysis combining multiple other data products
could be considered a form of "software observation".  But the real
reason we stretched the concept a bit in ObsTAP was merely to be able to
provide a single uniform index (the ObsTAP index) for science data
products in an archive.

I agree that the observation-dataset modeling needs to be more
comprehensive; my guess is that this can be done in a relational fashion
by adding one or more additional models/tables to hold the additional
metadata and relationships.  The relational model can easily represent
the required many-to-many relationship.
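
As a rough illustration of the relational approach (a sketch only, not part of ObsTAP or ObsCore; the table names, column names, and sample rows below are all hypothetical), the many-to-many link between observations and datasets could be held in a separate association table:

```python
import sqlite3


def build_demo():
    """Create an in-memory database using a hypothetical schema."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE observation (obs_id TEXT PRIMARY KEY, instrument TEXT);
        CREATE TABLE dataset (dataset_id TEXT PRIMARY KEY, creation_type TEXT);
        -- Association table: one row per (dataset, observation) pair.
        CREATE TABLE dataset_observation (
            dataset_id TEXT NOT NULL REFERENCES dataset(dataset_id),
            obs_id     TEXT NOT NULL REFERENCES observation(obs_id),
            PRIMARY KEY (dataset_id, obs_id)
        );
    """)
    # Sample rows are invented purely for illustration.
    con.executemany("INSERT INTO observation VALUES (?, ?)",
                    [("obs1", "VLA-A"), ("obs2", "VLA-B")])
    con.executemany("INSERT INTO dataset VALUES (?, ?)",
                    [("raw1", "archival"), ("raw2", "archival"),
                     ("stack", "derived"), ("sim", "simulation")])
    con.executemany("INSERT INTO dataset_observation VALUES (?, ?)",
                    [("raw1", "obs1"), ("raw2", "obs2"),
                     ("stack", "obs1"), ("stack", "obs2")])
    return con


def observations_for(con, dataset_id):
    """Return the obs_ids linked to one dataset (may be empty)."""
    rows = con.execute(
        "SELECT obs_id FROM dataset_observation "
        "WHERE dataset_id = ? ORDER BY obs_id", (dataset_id,))
    return [r[0] for r in rows]


def datasets_for(con, obs_id):
    """Return the dataset_ids linked to one observation (may be empty)."""
    rows = con.execute(
        "SELECT dataset_id FROM dataset_observation "
        "WHERE obs_id = ? ORDER BY dataset_id", (obs_id,))
    return [r[0] for r in rows]


con = build_demo()
print(observations_for(con, "stack"))  # ['obs1', 'obs2']
print(observations_for(con, "sim"))    # []
print(datasets_for(con, "obs1"))       # ['raw1', 'stack']
```

In this sketch a stacked product links to several observations, a simulated product links to none, and the main index table is left unchanged.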

 	- Doug

On Fri, 22 Nov 2013, Arnold Rots wrote:

> I strongly object to this statement:
> 
> "the data product may be the result of combining data from multiple
> primary (physical) observations.  In this case the resulting data product
> is a new processed "observation" to which a new unique observation
> identifier should be assigned."
> 
> We really need to distinguish clearly between Datasets and Observations.
> An Observation represents an operation that is characterized by a
> configuration - instrument characteristics, coordinate volume and
> properties, calibration, etc.
> A Dataset is a container of bytes that may have resulted from an
> Observation (the byte stream that came out of the telescope or various
> direct processing products of it), a simulation, or the processing and
> analysis of (possibly a subset of) one or more parent Datasets.
> Each Dataset also carries metadata detailing coordinate characteristics,
> the nature of the Dataset and its components, and its provenance
> regarding its parents.
> 
> Blurring the line between Observations and Datasets and carelessly
> forcing one to assume the characteristics of the other is going to get
> us into major trouble.
> 
> Cheers,
>
>   - Arnold
> 
> --------------------------------------------------------------------------------
> Arnold H. Rots                          Chandra X-ray Science Center
> Smithsonian Astrophysical Observatory   tel:  +1 617 496 7701
> 60 Garden Street, MS 67                 fax:  +1 617 495 7356
> Cambridge, MA 02138                     arots at cfa.harvard.edu
> USA                                     http://hea-www.harvard.edu/~arots/
> --------------------------------------------------------------------------------
> 
> 
> 
> On Thu, Nov 21, 2013 at 6:00 PM, CresitelloDittmar, Mark
> <mdittmar at cfa.harvard.edu> wrote:
>       All,
> 
> I've been thinking about this and some comments Arnold made on the
> Provenance thread which are closely related.
>   1) there is general agreement that Observation *has* 0 or more
> Datasets  (rather than *is* a Dataset)
>
>   2) Dataset can exist without an Observation (can be created by
> something else).
>
>   3) The definition of Observation is pretty fuzzy, but let's assume
> that there could be an "Analysis" or "Simulation" step which could
> create a Dataset.  These may be parts of the larger domain that all
> these objects live in, but are not modeled.  Currently, the ObsCore
> model does say (pg 19) "the data product may be the result of
> combining data from multiple primary (physical) observations.  In
> this case the resulting data product is a new processed
> "observation" to which a new unique observation identifier should
> be assigned."
> So the relation of Dataset to 'the thing which created it' is not
> clear to me yet.  I keep going back to the 'Experiment' concept in
> Gerard's mail (provenance thread).
> 
> I don't think that a Dataset should have a bi-directional relation
> to the full Observation(s) as I noted at the head of this thread,
> but should
>   a) have an association back to components of the Observation
>      (ObsConfig, Proposal) which become part of the Dataset
>      'provenance' (which is what I think Arnold was saying in the
>      other thread).
>   b) have metadata identifying the relevant Observation(s)
>      comprising the Dataset (DataID.ObservationID), as Francois notes,
>      but this gets tricky because ObsCore expects a singular (well,
>      unique) obs_id for each Dataset.
>   c) if the Dataset were created by something else, then it would
>      add associations to components of those things holding the
>      relevant information to fold into the 'provenance', like the
>      progenitor Datasets.
> 
> 
> 
> 
> On Fri, Nov 15, 2013 at 9:59 AM, Arnold Rots
> <arots at cfa.harvard.edu> wrote:
> If multiple observations have to be taken care of through provenance,
> then why should a single observation not be handled the same way?
> Don't get me wrong: I think neither should be handled through
> provenance.
> 
> Examples are: VLA multi-configuration images; stacked images;
> multi-observation event files.
> 
> It is much clearer and more intuitive if we just simply allow a Dataset
> to be associated with multiple Observations.
> Actually, I think this is absolutely a requirement.
>
>   - Arnold
> 
> 
> 
> 
> On Thu, Nov 14, 2013 at 6:29 PM, Douglas Tody
> <dtody at nrao.edu> wrote:
>       On Thu, 14 Nov 2013, Arnold Rots wrote:
>
>             From this description I am beginning to suspect that a
>             Dataset can be derived from (associated with) no more
>             than one Observation.
>             That seems utterly wrong; multiple Observations can be
>             combined into a single Dataset.
>             Or did I misunderstand?
> 
> 
> Multiple Observations can be and often are combined to produce a new
> Dataset, however describing that history would likely be the
> responsibility of the Provenance model.  At the level of Observation it
> would probably be a new "Observation" (or at least Dataset).  Depends
> upon how strict we are with the concept of Observation.  The
> CreationType and calibration level say something about it being a
> synthesized/derived data product.
>
>       I think it is OK to require that a Dataset is associated with at
>       least one Observation, provided that a model or simulation can be
>       described as an Observation.
> 
> 
> In practice that is what we are doing, to keep things simple; DataSource
> can be something like "theory".
>
>         - Doug
>
>       Cheers,
>
>        - Arnold
> 
> 
> 
>
>       On Thu, Nov 14, 2013 at 12:08 PM, CresitelloDittmar, Mark
>       <mdittmar at cfa.harvard.edu> wrote:
>
>             All,
>               This thread is for discussion on the relation between
>             Observation and Dataset.
>
>             ref: ObsCoreDM -
>             http://www.ivoa.net/documents/ObsCore/20111028/index.html
>             ref: diagram illustrating relation of Image/Spectral
>             Observation to ObsCoreDM (draft)
>             http://www.ivoa.net/pipermail/dm/attachments/20131113/c9ef7581/attachment-0001.png
>
>             motivation
>               It is clear that there is a relationship between
>             "Observation" and a more generic "Dataset".  This "Dataset"
>             would contain elements such as the dataProductType and
>             dataProductSubtype, and presumably others.  This object has
>             not been formally defined.
>
>               In ObsCore, there is an implied relationship for
>             Observation as an Extension of Dataset in the location of
>             these attributes.  So, I have always interpreted that
>             Observation "is" a Dataset.  This is reflected in my choice
>             of the name "ObservationDataset" in the left hand package
>             of my diagram.  It implies that it is a Dataset extended
>             for Observation purposes.
>
>               Recent discussion brings this relationship into question,
>             with assertions that an Observation can be associated with
>             0 or more Datasets.
>
>               This has real ramifications for the Image and Spectral
>             models.
>
>             Seed:
>
>             If the relation is Observation "has" 0..* Dataset, then all
>             the diagrams to date are wrong.
>             It feels like this would be a fundamental change to all
>             these models.
>
>               - there would need to be a bi-directional relation
>                 between Observation and Dataset
>                 (Observation has 0..* Dataset; Dataset associated with
>                 1 Observation)
>                 Hmm.. since there can be Datasets not associated with
>                 Observations, this would need to be a specialization
>                 of Dataset.. (ObservationDataset.. but not the one in
>                 my diag.)
>
>               - the Char associated with Observation would characterize
>                 the total space of all included Datasets.  (0..1)
>                 relation to Observation.  If no Datasets, no Char
>
>               - each Dataset would require its own Characterisation,
>                 specific to its space.
>                 (so there is another attribute for Dataset).
>
>               - we would need to specify which of the elements are
>                 associated to the Dataset, and which to the
>                 Observation.  e.g. DataModel => Dataset;  Target =>
>                 Observation
>
>             Thoughts?
>             Mark
> 