[Observation] relation to Dataset

Thu Nov 21 15:31:46 PST 2013

On Thu, 21 Nov 2013, CresitelloDittmar, Mark wrote:

> All,
> 
> I've been thinking about this and some comments Arnold made on the
> Provenance thread which are closely related.
>   1) there is general agreement that Observation *has* 0 or more
> Datasets  (rather than *is* a Dataset)
>
>   2) Dataset can exist without an Observation (can be created by
> something else).
>
>   3) The definition of Observation is pretty fuzzy, but let's assume that
> there could be an "Analysis" or "Simulation" step which could create a
> Dataset.  These may be parts of the larger domain that all these objects
> live in, but are not modeled.  Currently, the ObsCore model does say (pg
> 19) "the data product may be the result of combining data from multiple
> primary (physical) observations.  In this case the resulting data product
> is a new processed "observation" to which a new unique observation
> identifier should be assigned."
> So the relation of Dataset to 'the thing which created it' is not clear
> to me yet.  I keep going back to the 'Experiment' concept in Gerard's
> mail (provenance thread).
> 
> I don't think that a Dataset should have a bi-directional relation to the
> full Observation(s) as I noted at the head of this thread, but should
>   a) have an association back to components of the Observation (
> ObsConfig, Proposal ) which become part of the Dataset 'provenance'.
>       (which is what I think Arnold was saying in the other thread).
>   b) have metadata identifying the relevant Observation(s) comprising
> the Dataset (DataID.ObservationID), as Francois notes.
>       but this gets tricky because ObsCore expects a singular (well,
> unique) obs_id for each Dataset.

I was with you up to here.  The obs_id does *not* have to be unique for
each dataset - the pubDID is what has to be unique.  Multiple datasets may
share the same obs_id; this is an essential feature of ObsCore.

 	- Doug
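
To make the identifier point concrete, here is a minimal sketch in Python.
It is not taken from ObsCore or from any existing library: the class names,
field names and identifier strings are all hypothetical, and it assumes
only what is stated above, namely that the pubDID is unique per dataset
while obs_id may be shared.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class Dataset:
    pub_did: str  # publisher dataset identifier: unique per dataset
    obs_id: str   # observation identifier: may be shared by several datasets


@dataclass
class DatasetIndex:
    """Hypothetical in-memory index, for illustration only."""
    by_pub_did: Dict[str, Dataset] = field(default_factory=dict)
    by_obs_id: Dict[str, List[Dataset]] = field(default_factory=dict)

    def add(self, ds: Dataset) -> None:
        # Reject a repeated pubDID; a repeated obs_id is perfectly fine.
        if ds.pub_did in self.by_pub_did:
            raise ValueError(f"duplicate pubDID: {ds.pub_did}")
        self.by_pub_did[ds.pub_did] = ds
        self.by_obs_id.setdefault(ds.obs_id, []).append(ds)


# Two data products from the same observation (made-up identifiers).
index = DatasetIndex()
index.add(Dataset("ivo://example/obs-42/image", "obs-42"))
index.add(Dataset("ivo://example/obs-42/spectrum", "obs-42"))
```

Looking up by_obs_id["obs-42"] then returns both datasets, while each
pubDID still resolves to exactly one of them.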

>   c) if the Dataset were created by something else, then it would add
> associations to components of those things holding the relevant
> information to fold into the 'provenance'.  Like the progenitor Datasets.
> 
> 
> 
> 
> On Fri, Nov 15, 2013 at 9:59 AM, Arnold Rots <arots at cfa.harvard.edu>
> wrote:
> If multiple observations have to be taken care of through provenance,
> then why should a single observation not be handled the same way?
> Don't get me wrong: I think neither should be handled through
> provenance.
> 
> Examples are: VLA multi-configuration images; stacked images;
> multi-observation event files.
> 
> It is much clearer and more intuitive if we just simply allow a Dataset
> to be associated with multiple Observations.
> Actually, I think this is absolutely a requirement.
>
>   - Arnold
> 
> --------------------------------------------------------------------------------
> Arnold H. Rots                            Chandra X-ray Science Center
> Smithsonian Astrophysical Observatory     tel:  +1 617 496 7701
> 60 Garden Street, MS 67                   fax:  +1 617 495 7356
> Cambridge, MA 02138                       arots at cfa.harvard.edu
> USA                                       http://hea-www.harvard.edu/~arots/
> --------------------------------------------------------------------------------
> 
> 
> 
> On Thu, Nov 14, 2013 at 6:29 PM, Douglas Tody <dtody at nrao.edu>
> wrote:
>       On Thu, 14 Nov 2013, Arnold Rots wrote:
>
>             From this description I am beginning to suspect that a
>             Dataset can be derived from (associated with) no more
>             than one Observation.
>             That seems utterly wrong; multiple Observations can be
>             combined into a single Dataset.
>             Or did I misunderstand?
> 
> 
> Multiple Observations can be and often are combined to produce a new
> Dataset, however describing that history would likely be the
> responsibility of the Provenance model.  At the level of Observation it
> would probably be a new "Observation" (or at least Dataset).  Depends
> upon how strict we are with the concept of Observation.  The
> CreationType and calibration level say something about it being a
> synthesized/derived data product.
>
>       I think it is OK to require that a Dataset is associated with
>       at least one Observation, provided that a model or simulation
>       can be described as an Observation.
> 
> 
> In practice that is what we are doing, to keep things simple;
> DataSource can be something like "theory".
>
>         - Doug
>
>       Cheers,
>
>        - Arnold
> 
> 
> 
>
>       On Thu, Nov 14, 2013 at 12:08 PM, CresitelloDittmar, Mark
>       <mdittmar at cfa.harvard.edu> wrote:
>
>             All,
>               This thread is for discussion on the relation between
>             Observation and Dataset.
>
>             ref: ObsCoreDM -
>             http://www.ivoa.net/documents/ObsCore/20111028/index.html
>             ref: diagram illustrating relation of Image/Spectral
>             Observation to ObsCoreDM (draft)
>             http://www.ivoa.net/pipermail/dm/attachments/20131113/c9ef7581/attachment-0001.png
>
>             motivation
>               It is clear that there is a relationship between
>             "Observation" and a more generic "Dataset".  This "Dataset"
>             would contain elements such as the dataProductType, and
>             dataProductSubtype, presumably others.  This object has not
>             been formally defined.
>
>               In ObsCore, there is an implied relationship for
>             Observation as an Extension of Dataset in the location of
>             these attributes.  So, I have always interpreted that
>             Observation "is" a Dataset.  This is reflected in my choice
>             of the name "ObservationDataset" in the left hand package of
>             my diagram.  It implies that it is a Dataset extended for
>             Observation purposes.
>
>               Recent discussion brings this relationship into question,
>             with assertions that an Observation can be associated with
>             0 or more Datasets.
>
>               This has real ramifications for the Image and Spectral
>             models.
>
>             Seed:
>
>             If the relation is Observation "has" 0..* Dataset, then all
>             the diagrams to date are wrong.
>             It feels like this would be a fundamental change to all
>             these models.
>
>               - there would need to be a bi-directional relation
>                 between Observation and Dataset
>                 (observation has 0..* Dataset; Dataset associated with
>                 1 Observation)
>                 Hmm.. since there can be Datasets not associated with
>                 Observations, this would need to be a specialization of
>                 Dataset.. (ObservationDataset.. but not the one in my
>                 diag.)
>
>               - the Char associated with Observation would characterize
>                 the total space of all included Datasets.  (0..1)
>                 relation to Observation.  If no Datasets, no Char
>
>               - each Dataset would require its own Characterisation,
>                 specific to its space.
>                 (so there is another attribute for Dataset).
>
>               - we would need to specify which of the elements are
>                 associated to the Dataset, and which to the Observation.
>                 e.g. DataModel => Dataset;  Target => Observation
>
>             Thoughts?
>             Mark