[Observation] relation to Dataset

Thu Nov 21 15:00:55 PST 2013

All,

I've been thinking about this, and about some closely related comments
Arnold made on the Provenance thread.

  1) There is general agreement that Observation *has* 0 or more Datasets
(rather than *is* a Dataset).

  2) Dataset can exist without an Observation (can be created by something
else).

  3) The definition of Observation is pretty fuzzy, but let's assume that
there could be an "Analysis" or "Simulation" step which could create a
Dataset.  These may be parts of the larger domain that all these objects
live in, but are not modeled.  Currently, the ObsCore model does say (pg
19) "the data product may be the result of combining data from multiple
primary (physical) observations.  In this case the resulting data product
is a new processed "observation" to which a new unique observation
identifier should be assigned."
So the relation of Dataset to 'the thing which created it' is not clear to
me yet.  I keep going back to the 'Experiment' concept in Gerard's mail
(provenance thread).

I don't think that a Dataset should have the bi-directional relation to the
full Observation(s) that I suggested at the head of this thread.  Instead,
it should
  a) have an association back to components of the Observation (ObsConfig,
Proposal) which become part of the Dataset 'provenance'
      (which is what I think Arnold was saying in the other thread);
  b) have metadata identifying the relevant Observation(s) comprising the
Dataset (DataID.ObservationID), as Francois notes,
      but this gets tricky because ObsCore expects a singular (well, unique)
obs_id for each Dataset;
  c) if the Dataset were created by something else, add associations to
components of those things holding the relevant information to fold into
the 'provenance', such as the progenitor Datasets.
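
To make (a) through (c) concrete, here is a rough sketch in Python.  It is
only an illustration under my own assumptions, not part of ObsCore or any
other IVOA model: the class layout, the field names, and the combine()
helper are all hypothetical.  Only the names ObsConfig, Proposal, and
DataID.ObservationID come from the discussion above.  Note that it keeps a
list of observation IDs on the Dataset and so does not resolve the obs_id
uniqueness issue raised in (b).

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ObsConfig:
    """Hypothetical stand-in for an Observation's configuration component."""
    instrument: str


@dataclass
class Proposal:
    """Hypothetical stand-in for the Proposal component of an Observation."""
    proposal_id: str


@dataclass
class DataID:
    """Identification metadata; observation_ids illustrates point (b)."""
    observation_ids: List[str] = field(default_factory=list)


@dataclass
class Provenance:
    """One-way associations from a Dataset back to what produced it."""
    obs_configs: List[ObsConfig] = field(default_factory=list)   # point (a)
    proposals: List[Proposal] = field(default_factory=list)      # point (a)
    progenitors: List["Dataset"] = field(default_factory=list)   # point (c)


@dataclass
class Dataset:
    """A Dataset can exist on its own; it holds no reference to an Observation."""
    data_product_type: str
    data_id: DataID = field(default_factory=DataID)
    provenance: Provenance = field(default_factory=Provenance)


@dataclass
class Observation:
    """Observation *has* 0..* Datasets (the relation is one-directional)."""
    obs_id: str
    config: ObsConfig
    proposal: Optional[Proposal] = None
    datasets: List[Dataset] = field(default_factory=list)


def combine(sources: List[Observation], product_type: str) -> Dataset:
    """Build a Dataset derived from the Datasets of several Observations."""
    return Dataset(
        data_product_type=product_type,
        data_id=DataID(observation_ids=[o.obs_id for o in sources]),
        provenance=Provenance(
            obs_configs=[o.config for o in sources],
            proposals=[o.proposal for o in sources if o.proposal is not None],
            progenitors=[d for o in sources for d in o.datasets],
        ),
    )


# Example: a stacked image built from two observations.
a = Observation("obs-A", ObsConfig("VLA-A"), Proposal("P1"), [Dataset("image")])
b = Observation("obs-B", ObsConfig("VLA-B"), None, [Dataset("image")])
stacked = combine([a, b], "image")
print(stacked.data_id.observation_ids)  # prints ['obs-A', 'obs-B']
```

The point of the layout is that the Dataset only ever points at components
(configuration, proposal, progenitor Datasets), never at a whole
Observation, so a Dataset produced by a simulation or analysis step would
simply have empty lists.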

On Fri, Nov 15, 2013 at 9:59 AM, Arnold Rots <arots at cfa.harvard.edu> wrote:

> If multiple observations have to be taken care of through provenance,
> then why should a single observation not be handled the same way?
> Don't get me wrong: I think neither should be handled through provenance.
>
> Examples are: VLA multi-configuration images; stacked images;
> multi-observation event files.
>
> It is much clearer and more intuitive if we just simply allow a Dataset
> to be associated with multiple Observations.
> Actually, I think this is absolutely a requirement.
>
>   - Arnold
>
>
> ------------------------------------------------------------------------
> Arnold H. Rots                          Chandra X-ray Science Center
> Smithsonian Astrophysical Observatory   tel:  +1 617 496 7701
> 60 Garden Street, MS 67                 fax:  +1 617 495 7356
> Cambridge, MA 02138                     arots at cfa.harvard.edu
> USA                                     http://hea-www.harvard.edu/~arots/
> ------------------------------------------------------------------------
>
>
>
> On Thu, Nov 14, 2013 at 6:29 PM, Douglas Tody <dtody at nrao.edu> wrote:
>
>> On Thu, 14 Nov 2013, Arnold Rots wrote:
>>
>>> From this description I am beginning to suspect that a Dataset can be
>>> derived from (associated with) no more than one Observation.
>>> That seems utterly wrong; multiple Observations can be combined into a
>>> single Dataset.
>>> Or did I misunderstand?
>>>
>>
>> Multiple Observations can be and often are combined to produce a new
>> Dataset, however describing that history would likely be the
>> responsibility of the Provenance model.  At the level of Observation it
>> would probably be a new "Observation" (or at least Dataset).  Depends
>> upon how strict we are with the concept of Observation.  The
>> CreationType and calibration level say something about it being a
>> synthesized/derived data product.
>>
>>
>>> I think it is OK to require that a Dataset is associated with at least
>>> one Observation,
>>> provided that a model or simulation can be described as an Observation.
>>>
>>
>> In practice that is what we are doing, to keep things simple; DataSource
>> can be something like "theory".
>>
>>         - Doug
>>
>>
>>> Cheers,
>>>
>>>  - Arnold
>>>
>>>
>>>
>>>
>>> On Thu, Nov 14, 2013 at 12:08 PM, CresitelloDittmar, Mark <
>>> mdittmar at cfa.harvard.edu> wrote:
>>>
>>>> All,
>>>>   This thread is for discussion on the relation between Observation and
>>>> Dataset.
>>>>
>>>> ref: ObsCoreDM - http://www.ivoa.net/documents/ObsCore/20111028/index.html
>>>> ref: diagram illustrating relation of Image/Spectral Observation to
>>>> ObsCoreDM (draft)
>>>> http://www.ivoa.net/pipermail/dm/attachments/20131113/c9ef7581/attachment-0001.png
>>>>
>>>> motivation
>>>>   It is clear that there is a relationship between "Observation" and a
>>>> more generic "Dataset".  This "Dataset" would contain elements such as
>>>> the dataProductType, and dataProductSubtype, presumably others.  This
>>>> object has not been formally defined.
>>>>
>>>>   In ObsCore, there is an implied relationship for Observation as an
>>>> Extension of Dataset in the location of these attributes.  So, I have
>>>> always interpreted that Observation "is" a Dataset.  This is reflected
>>>> in my choice of the name "ObservationDataset" in the left hand package
>>>> of my diagram.  It implies that it is a Dataset extended for
>>>> Observation purposes.
>>>>
>>>>   Recent discussion brings this relationship into question, with
>>>> assertions that an Observation can be associated with 0 or more
>>>> Datasets.
>>>>
>>>>   This has real ramifications for the Image and Spectral models.
>>>>
>>>> Seed:
>>>>
>>>> If the relation is Observation "has" 0..* Dataset, then all the diagrams
>>>> to date are wrong.
>>>> It feels like this would be a fundamental change to all these models.
>>>>
>>>>   - there would need to be a bi-directional relation between Observation
>>>> and Dataset
>>>>        (observation has 0..* Dataset; Dataset associated with 1
>>>> Observation)
>>>>     Hmm.. since there can be Datasets not associated with Observations,
>>>>     this would need to be a specialization of Dataset..
>>>>     (ObservationDataset.. but not the one in my diag.)
>>>>
>>>>   - the Char associated with Observation would characterize the total
>>>> space of all included Datasets.  (0..1) relation to Observation.  If no
>>>> Datasets, no Char
>>>>
>>>>   - each Dataset would require its own Characterisation, specific to
>>>> its space.
>>>>     (so there is another attribute for Dataset).
>>>>
>>>>   - we would need to specify which of the elements are associated to the
>>>> Dataset, and which to the Observation.  e.g. DataModel => Dataset;
>>>> Target => Observation
>>>>
>>>> Thoughts?
>>>> Mark
>>>>
>>>>
>>>>
>>>>
>>>
>