Time Series Cube DM - IVOA Note

Thu Mar 16 17:19:36 CET 2017

Jiri,

Thanks for the detailed response.. comments/responses below.

On Mon, Mar 13, 2017 at 6:14 AM, Jiří Nádvorník <nadvornik.ji at gmail.com>
wrote:

> Hi all,
>
>
> @Mark Cresitello-Dittmar
>
>>  + Figure 2 and 3
>>       These nearly match cube model Section 4.2..
>>       Time Series cube 'imports' DatasetDM ~= basically identifies it as a
>> SparseCubeDataset
>>       referencing one (only 1?) TimeSeriesCube which is an extension of
>> the SparseCube data product.
>>
>
> We are indeed using Sparse Cube Dataset as a Time Series Cube is a subtype
> of Sparse Cube. The Dataset is a collection of such entities though and
> that means we can store 0..n Time Series Cube entities in the Sparse Cube
> Dataset we are using.
>

>
>>    + Section 3.1
>>       "because we use them as black boxes, it does not matter if these
>> data models change.."
>>       I'm not sure this is true.. or if it is, it is due to glossing over
>> some details.
>>         o it looks like you're saying that a Time Series instance has
>> ObsDataset metadata as
>>            described in 'the Dataset model'.. presumably any version
>> thereof
>>         o but you show the model import.. the vo-dml model import
>> includes the URL to a specific version
>>         o and Figure 4 shows the cube relation of SparseCube with the
>> SparseCubeDataset which
>>            extends the Dataset:ObsDataset object..
>>       It seems important for interoperability to have the explicit
>> relations so that V2.0 of models
>>       can go through a vetting process before being acceptable/useable in
>> subsequent models
>>       which use the V1.0.
>>
>
> Right, this was not spelled out entirely correctly, it seems. The idea is that
> if the Dataset or VO-DML model changes, we don't mind, because we are not
> extending these, we are not building anything on top of them that would
> break when these models change. Still, we are dependent on them as the Time
> Series Cube DM won't work without them, it's a dependency on their
> existence, not on their form.
>
OK

> Model import details:
>
>    - Dataset DM - If it changes, we don't mind. We need to store a
>    collection of our cubes, but we leave the specification of how to do so
>    completely to the Dataset DM.
>    - VO-DML - We need it for annotating parts of serialization against
>    the entities mentioned below. Anyway, we are dependent on existence of such
>    mapping, not its syntax, as we are only adopting it and not trying to build
>    something on top of it.
>       - Parts of the model (entities defined by the model)
>       - The Data Model itself
>
Not sure I understand the vo-dml statement.  Surely, if the vo-dml mapping
syntax changes, you would need to change your annotation.  But yes.. you
are using the syntax, not building on it.
OK

> I feel this is a newer concept to the IVOA community and we need to make
> sure we understand it all in the same way. If I am dependent on an external
> model, that means the serialization of my data will change if that model
> changes. That doesn't mean, however, that I need to "embed" it into my data
> model, my data model is not changing if the one I am dependent on changes.
>

Unless it changes at the interfaces..  If I am using element X from model
M, and that element is changed, (as an extreme.. removed), your model may
need to change to accommodate it.
Since that level of change in model M is not backward compatible, it would
be a major version change in M.  Minor version changes in M would not
require any update to your model.. True.
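
To make the major/minor distinction concrete, here is a minimal sketch.  It
assumes model versions are written as "major.minor" strings; the function
name is hypothetical and not taken from any IVOA specification.

```python
def needs_model_update(used_version: str, new_version: str) -> bool:
    """Return True when the imported model moved to a different major version.

    Assumption: versions look like "1.0" or "2.3" (major.minor), and only a
    major version change is allowed to break backward compatibility.
    """
    used_major = int(used_version.split(".")[0])
    new_major = int(new_version.split(".")[0])
    return new_major != used_major


# A minor bump in model M should leave the dependent model untouched,
# while a major bump means the dependent model has to be re-checked.
print(needs_model_update("1.0", "1.1"))
print(needs_model_update("1.0", "2.0"))
```

Under this reading, a dependent model only needs to pin the major version of
each model it imports.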

>
>>    + Section 3.1.3
>>       "We are not importing the whole VO-DML"
>>       I'm not sure what this means.. are you saying that this is not
>> attempting to be fully vo-dml compliant?
>>
>
> This means we *use* parts important for us, without trying to build on
> top of them. Effectively we are importing only parts of those models where
> we don't mind the syntax, only the semantics of what we are importing. We
> can discuss this on examples if needed.
>
Yeah.. maybe that would be better.  I think the vocabulary here needs some
adjusting.  There is nothing to import w.r.t. vo-dml annotation.  The spec
will provide the syntax and you apply that in your serialization.

>>    + Section 3.2.1: Cube DM
>>       "however, it is not describing Image Cube DM (as could be
>> erroneously understood from the title)"
>>       What?  pixelated image data is covered by NDImage (section 6 of the
>> cube model)
>>
>
> That is maybe caused by my confusion. Is the data model defined in Figure 6
> in IVOA N-Dimensional Cube Model
> <https://volute.g-vo.org/svn/trunk/projects/dm/CubeDM-1.0/doc/WD-CubeDM-1.0-20170203.pdf> the
> same one as described in this document IVOA Image Data Model
> <http://wiki.ivoa.net/internal/IVOA/ImageDM/WD-ImageDM-20130812.pdf> ?
>
>
>>    + Section 3.2.3: Image DM
>>       see above..  the cube model covers the full scope of Doug's Image
>> Cube model (2013)
>>
>
> Same question -  is the Figure 6 in N-Dimensional Cube Model describing
> the same data model as Figure 2 in IVOA Image Data Model?
>
>
Yes.. the N-Dimensional Cube model is the continuation of that effort..
when I took over that project it was folded into the effort of extracting
DatasetMetadata from these Product models and forming the family of Cube,
Dataset, STC( coordsys, coords, trans ); and using vo-dml compliant
modeling.   In that process, since the word 'Image' invokes the 'pixelated
image type', and the model covers a broader scope than that, it was renamed
Cube.  The Cube model history includes the ImageDM iterations.

>
>>    + Section 4/Figure 4
>>       your description of the TimeSeriesCube aligns pretty well with the
>> SparseCube as it is..
>>       I'm not sure it is necessary to 'override' the content (btw you
>> could just extend "PointDataProduct")
>>       o the cube model SparseCube has a collection of NDPoints (ie rows),
>> which contains a collection
>>          of DataAxis, each referring to an instance of a measurement type
>> coordinate (value + errors)
>>          * your representation and mine simply reverse the row/column
>> order.
>>          * in cube, the Observable is any instance of Coordinate which
>> follows the pattern described
>>            in STC2-coords model.  The modeling of that instance/domain
>> does NOT have to be in
>>            any particular model.. so the Axis Domain DMs scenario you
>> show works fine.
>>          * But.. for interoperability sake, we do require them to use the
>> same pattern (by linking
>>            Observable to the abstract coords:DerivedCoordinate which is
>> bound to a pattern)
>>
>
> Yes, it is pretty similar - we used it as an inspiration and originally
> believed that we would just build upon it without changing it. Problems
> and difficulties are listed below.
>
>    1. Axis Domain models - physical meaning of the data in the axis of
>    the data cube should not be part of the cube model. The Frame mappings and
>    CoordSys entities taken from STC are also such domain models IMHO.
>
The cube model does not have any description of domain content.. (other
than Pixel).
The Mappings and CoordSys objects are containers for pan-domain information
defined in the respective models (coordsys, trans).

>
>    1. Row/column inversion - this is a bigger difference than it seems at
>    first glance. Logically we are storing the same information but I don't
>    want to explicitly store a reference to an axis in every single point of
>    the data cube. By this the Sparse Cube Model is IMHO saying that the data
>    can be unordered and so we need to store the reference for every point?
>    Doing it the other way around, the axis can just specify where I can find
>    the axis coordinates in the data element of the cube, no matter how we
>    serialize it.
>
I think the annotation should end up pretty similar, except that Row-based
will sit in a grouping for the NDPoint object, while column based will not.
The axis reference is stored in the template next to the COLUMN spec.  Each
iteration of the Template (table row) fills in the appropriate elements of
the template.. in my case the values and errors of the DataAxis observables
in an NDPoint.. in the column-based approach each iteration will fill in
the next element of the observables.. outside of an NDPoint.
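
A rough sketch of the two layouts may help here.  It is only an
illustration: the Measure class, the axis names "time" and "flux", and the
helper functions are hypothetical and not part of the cube model.  It shows
that the row-based (NDPoint) and column-based forms carry the same
information and can be converted into each other.

```python
from dataclasses import dataclass


@dataclass
class Measure:
    """Hypothetical stand-in for a measurement-type coordinate (value + error)."""
    value: float
    error: float


# Row-based: each entry plays the role of one NDPoint, holding one
# measurement per data axis.
rows = [
    {"time": Measure(1.0, 0.1), "flux": Measure(10.0, 0.5)},
    {"time": Measure(2.0, 0.1), "flux": Measure(12.0, 0.6)},
]


def to_columns(points):
    """Regroup row-based points into one list of measurements per axis."""
    columns = {}
    for point in points:
        for axis, measure in point.items():
            columns.setdefault(axis, []).append(measure)
    return columns


def to_rows(columns):
    """Regroup per-axis columns back into row-based points."""
    names = list(columns)
    length = len(columns[names[0]]) if names else 0
    return [{name: columns[name][i] for name in names} for i in range(length)]


columns = to_columns(rows)
# The round trip loses nothing: the two layouts are equivalent.
print(to_rows(columns) == rows)
```

In both layouts the axis is named once (as a dictionary key here, next to the
COLUMN spec in the annotation), not once per stored value.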

>    1. What is the Observable entity's purpose in the diagram? Please
>    explain..
>
Figure 2, 6, 7. Observable entity is a collection of Coordinates
associated with the data product.  This is something to make sure is
explained well.
The premise is that a DataProduct should OWN all of its coordinates/data.
The vo-dml rules for composition state that a class/object may not be in
more than one composition relation.
  + so one does not make composition relations across models.
  + it defines an aggregation pattern to use  (as shown for DataProduct ->
Observable -> DerivedObject ) for these situations

Since there are multiple types of Data Axis types, I modeled it this way..
where the DataProduct owns ALL its data (Observables), and the data axis
types (DataAxis, DependentAxis) are organizational objects which refer to
the instances of the same axis.
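
As a minimal sketch of that ownership pattern (class names follow the text,
but the attributes and the add_axis and owners_of helpers are hypothetical):
the DataProduct holds the Observables by composition, and each DataAxis only
keeps a reference to one of them.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Observable:
    """A coordinate instance owned by exactly one DataProduct."""
    name: str
    values: list


@dataclass(eq=False)
class DataAxis:
    """Organizational object: refers to an Observable, does not own it."""
    observable: Observable


@dataclass(eq=False)
class DataProduct:
    observables: list = field(default_factory=list)  # composition (owned)
    axes: list = field(default_factory=list)

    def add_axis(self, observable: Observable) -> DataAxis:
        # Take ownership once, then hand out a referring DataAxis.
        if not any(observable is owned for owned in self.observables):
            self.observables.append(observable)
        axis = DataAxis(observable)
        self.axes.append(axis)
        return axis


def owners_of(observable, products):
    """List the products that own the observable; the rule allows at most one."""
    return [p for p in products
            if any(observable is owned for owned in p.observables)]


product = DataProduct()
flux = Observable("flux", [10.0, 12.0])
axis = product.add_axis(flux)
print(len(owners_of(flux, [product])), axis.observable is flux)
```

A validator could use a check like owners_of to flag any object that ends up
in more than one composition relation.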

This could be organized differently.. having the Observables owned by the
DataAxis (which is directly or indirectly owned by the DataProduct), and
extend that for various types of axis.. adding constraints as needed.  The
sub-classes of DataAxis would then each be in composition with some
container (NDPoint, Voxel).  I don't know if this would be
better/worse/indifferent.

     <DataAxis> O---------- Cell  (constrain mult=1)
         +              |__ DependentAxis (constrain mult=1)
         |              |__ Column ??
         V  *
     Observable
         |
         V
   coords:DerivedCoordinate

I want to note one distinction.  The DataAxis here is NOT the same as a
coordinate space axis.
If I have a 3D Cartesian space, with coordinate axes x,y,z.. there is 1
DataAxis referring to a Position3D in that space.
I think this maps well to statements below.. where the 'Observable' maybe
need not be a DerivedCoordinate, but could allow for complex objects like
"Spectrum".

>>    + Section 4.2.2
>>        cube model does not have dependency on specific axis domains any
>> more (since ~Nov)
>
>
> What about those FrameMappings and CoordSys entities? These are storing
> physical meaning of the data, not just describing the structure of the data
> (the data cube) itself.
>
>>
>>
Yes, they are containers for storing the physical meaning of the data.  The
content of a Cube or pixelated image is not devoid of physical meaning: if
a sparse cube has data axes, each of these represents some physical quantity.
These elements allow you to describe that content without imposing
restrictions about specific domains.

 <snip>

So, I see we have 2 points of discussion for the cube model itself
  1) relation between Dataset and DataProduct
      Currently modeled according to Section 3.. extend Dataset, add
reference to DataProduct == MyDataset

      Alternates include:
        a) loose coupling
            verbal statement that MyDataset includes an instance of Dataset
+ instance of MyDataProduct
        b) referenced coupling
            MyDataset == reference to Dataset + reference to MyDataProduct
            (allows validators to know what is expected, but allows
flexibility w.r.t. Dataset flavor )

     I personally think a) is too loose, but b) might be a good way to go..

  2) relation of DataProduct, DataAxis and the 'Observable'
      I haven't explored the alternative shown above very much, but I like
the possibility
      In the current relation, the DataProduct-Observable relation is a
potential source of confusion to readers.
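
For option b) under point 1, a small sketch of what 'referenced coupling'
could look like.  This is only an illustration under assumptions: the
attribute names (version, title, n_points, dataset, product) and the
validate helper are hypothetical, not drawn from the Dataset or Cube models.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    """Stand-in for Dataset metadata; any flavor/version could be referenced."""
    version: str
    title: str


@dataclass
class MyDataProduct:
    n_points: int


@dataclass
class MyDataset:
    """Referenced coupling: holds references instead of extending Dataset."""
    dataset: Dataset
    product: MyDataProduct


def validate(candidate) -> list:
    """Return a list of problems; an empty list means both references exist."""
    problems = []
    if not isinstance(getattr(candidate, "dataset", None), Dataset):
        problems.append("missing Dataset reference")
    if not isinstance(getattr(candidate, "product", None), MyDataProduct):
        problems.append("missing MyDataProduct reference")
    return problems


good = MyDataset(Dataset("1.0", "example light curve"), MyDataProduct(3))
print(validate(good))
```

The point of the explicit references is that a validator knows which two
parts to expect, while the Dataset side can still vary.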

Mark