[timeDomain: model for Time series] discussion on Timeseries Data model Note / ND-point .... or not ??? !!!

Tue Jul 18 09:01:37 CEST 2017

Hi,

Inline.

Cheers,

Jiri

From: François Bonnarel [mailto:francois.bonnarel at astro.unistra.fr] 
Sent: Monday, July 17, 2017 9:57 AM
To: Jiří Nádvorník <nadvornik.ji at gmail.com>; mireille.louys at unistra.fr; dm at ivoa.net; voevent at ivoa.net; dal at ivoa.net
Subject: Re: [timeDomain: model for Time series] discussion on Timeseries Data model Note / ND-point .... or not ??? !!!

Thanks, Jiri

Again, let's clarify what we think. See my replies in the text.

Le 13/07/2017 à 16:03, Jiří Nádvorník a écrit :

Hi all,

Thank you for the nice summary, François.

My comments inline.

Cheers,

Jiri 

From: dm-bounces at ivoa.net On Behalf Of François Bonnarel
Sent: Wednesday, July 12, 2017 6:11 PM
To: mireille.louys at unistra.fr; dm at ivoa.net; voevent at ivoa.net; dal at ivoa.net
Subject: [timeDomain: model for Time series] discussion on Timeseries Data model Note / ND-point .... or not ??? !!!

Dear all,

This is a follow-up to these two emails:

http://mail.ivoa.net/pipermail/dm/2017-July/005583.html

and

http://mail.ivoa.net/pipermail/dm/2017-July/005594.html

which discuss chapter 4 of the TimeSeries Data Model IVOA Note, within the scope of the TimeDomain effort summarized by Ada here:

http://mail.ivoa.net/pipermail/dm/2017-July/005581.html

Issue under discussion: do we need ND-points or not?

Reference: Laurent's diagram in email (http://mail.ivoa.net/pipermail/dm/2017-July/005581.html) and the CubeDM draft from http://volute.g-vo.org/svn/trunk/projects/dm/CubeDM-1.0/doc/WD-CubeDM-1.0-20170203.pdf (latest version)

In figure 4 of Jiri's note, SparseCube is not related to the ND-point class, whereas it is in the IVOA sparse cube data model project.

[[Jiri Nadvornik]] The problem we had with the ND-point is that it holds complete metadata for each individual point of the time series, so the statistical distribution would also need to go there (although it is not really related to a point, but rather to an axis). And the metadata about spectral points or photometry points can be kept in the Spectral DM or Photometry DM metadata, so in the end we realized that the ND-point class was empty.

Again, I'm trying to start from what is in the CubeDM draft. In my understanding, ND-point doesn't seem to contain any axis metadata and only gathers a set of individual values contained in the DataAxis instances. If we have a "tabular vision" of our TimeSeries, ND-point models the content of a row. In an ND-point instance we will have the Time DataAxis value and any other corresponding DataAxis values: flux, magnitude, position + velocity, etc.

[[Jiri Nadvornik]] I have cut out the part of the TimeSeriesCube DM that deals with this: see the attached "TimeSeriesCube DM detail". To put the "tabular vision" into that context: the columnRef represents table columns (1 column is 1 axis), and the TimeSeriesCube class collects these axes while knowing whether each is dependent on or independent of the time axis. So while we do not keep relationships between individual axes, we do keep the information about their dependency.
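
As a rough illustration of this column-oriented view, here is a minimal Python sketch. The class layout, the `dependent` flag and the sample values are assumptions made for illustration only; they are not taken from the note's UML.

```python
from dataclasses import dataclass, field


@dataclass
class ColumnRef:
    """Hypothetical stand-in: one table column representing one axis."""
    name: str
    values: list
    dependent: bool  # True if this axis depends on the time axis


@dataclass
class TimeSeriesCube:
    """Collects axes and records only their dependency on time,
    not any relationship between individual axes."""
    axes: list = field(default_factory=list)

    def independent_axes(self):
        return [a.name for a in self.axes if not a.dependent]

    def dependent_axes(self):
        return [a.name for a in self.axes if a.dependent]


# Illustrative values only.
cube = TimeSeriesCube([
    ColumnRef("time", [57950.1, 57950.2, 57950.3], dependent=False),
    ColumnRef("flux", [1.20, 1.35, 1.10], dependent=True),
])
print(cube.independent_axes())
print(cube.dependent_axes())
```

In this sketch nothing forces the columns to have the same length, which matches the point that the cube knows about dependency but not about row-wise pairing.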

The ND-point class gathers several DataAxis (or Observable) containers for measurements on a given data Axis to represent a "point" or "event" in the data space.

[[Jiri Nadvornik]] My understanding here was that one ND-point gathers 1 *point* from each DataAxis, not the whole DataAxis. Can somebody please verify one or the other?

Yes, an ND-point gathers 1 value from each DataAxis.
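
To make the "1 value from each DataAxis" reading concrete, here is a small Python sketch. The axis names, the values and the helper functions `nd_point` and `nd_points` are hypothetical; the sketch assumes every axis has the same number of samples.

```python
# Hypothetical DataAxis contents: one list of values per axis.
data_axes = {
    "time": [57950.1, 57950.2, 57950.3],
    "flux": [1.20, 1.35, 1.10],
    "magnitude": [14.8, 14.7, 14.9],
}


def nd_point(axes, i):
    """Return the i-th ND-point: one value taken from each DataAxis."""
    return {name: values[i] for name, values in axes.items()}


def nd_points(axes):
    """Build all ND-points; requires every axis to have the same length."""
    lengths = {len(values) for values in axes.values()}
    if len(lengths) != 1:
        raise ValueError("axes must have equal length to form ND-points")
    return [nd_point(axes, i) for i in range(lengths.pop())]


points = nd_points(data_axes)
print(points[0])
```

Each element of `points` corresponds to one row of the table; no axis metadata travels with it.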

In Jiri's figure, a set of "CubeAxis" is directly related to "SparseCube". This doesn't explicitly imply a relationship between these "CubeAxis" instances, and doesn't even imply that each "CubeAxis" will have the same number of instances.

[[Jiri Nadvornik]] Correct. We did not consider relationships between individual axes. While the photometry axis will usually have the same amount of points as the time axis, we will have only several bands on the spectral axis. Can the spatial axis also have fewer elements than the time axis (I don't have different spatial coordinates for every point of the light curve)?

Yes, you may not have them for coordinates. But at least one DataAxis other than time should be sampled and dependent on time. Actually, ND-point is modeling this dependence.

[[Jiri Nadvornik]] If the ND-point holds just points and doesn't know about axes and their metadata, it does not know about any dependency. That should be for the Cube to decide, because it knows the context. As you described it, the ND-point seems to me to be just a data structure that I use for the serialization inside a table (the row, putting it somewhere below the ColumnRef in the attached image), but the question is whether this should still be part of the data model or whether it imposes too much restriction.

E.g., the spectral axis can have rather complex points (min, max, midpoint wavelength, band_name to start with). So should all of these be part of the ND-point, or just a reference (band_name), with the rest of the information stored elsewhere in the serialization (so as not to store duplicate data in the ND-points)? If we say that the ND-point is mandatory, we kill these better data structures right away.
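
The two options can be compared with a short Python sketch. Everything here is hypothetical: the band table, its numbers (not real filter definitions) and the `expand` helper only illustrate storing a band_name reference per row versus inlining the full band description.

```python
# Hypothetical band metadata, stored once and keyed by band_name.
# The numbers are illustrative, not real filter definitions.
bands = {
    "V": {"min": 500.0, "max": 600.0, "mid": 550.0},
    "R": {"min": 600.0, "max": 700.0, "mid": 650.0},
}

# Rows that carry only a reference (band_name) to the band metadata.
rows = [
    {"time": 57950.1, "flux": 1.20, "band_name": "V"},
    {"time": 57950.2, "flux": 0.95, "band_name": "R"},
    {"time": 57950.3, "flux": 1.35, "band_name": "V"},
]


def expand(row, band_table):
    """Resolve the band reference into a fully inlined point."""
    full = dict(row)
    full["band"] = band_table[row["band_name"]]
    return full


expanded = [expand(r, bands) for r in rows]
print(expanded[0]["band"]["mid"])
```

With the reference form, the band description exists once no matter how many rows use it; a serialization that writes out the inlined form repeats min, max and midpoint in every row.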

So I am wondering whether we should reintroduce the ND-point feature between "SparseCube" and "CubeAxis" or "DataAxis". What do you think?

[[Jiri Nadvornik]] My suggestion would be to have the Cube DM hold metadata for axis statistical distribution

ObsDataSet and SparseCubeDataset already have a lot of general metadata (for example, the characterisation is part of ObsDataSet, and the coordinate Mapping and System are in SparseCubeDataSet). If we need per-axis statistics, they should be added somewhere there, I guess.

[[Jiri Nadvornik]] Well, modelling a generic statistical distribution won’t be an easy task and the resulting data model might become pretty complex. Why not start it as a separate effort so we don’t need to wait and synchronize it with such a huge model as ObsDataSet?

Cheers
François

(see Quantity class in TimeSeriesCube UML - https://volute.g-vo.org/viewvc/volute/trunk/projects/time-domain/time-series/time-series-cube/ivoa-note-1.0/ )  and the metadata about the measurements in what we already have – Photometry DM, spatial and time in STC, Spectral DM… but both types of metadata separately for each axis, not for one ND-Point class.

François (after discussions with Laurent, Mireille and Ada on this topic)

Le 12/07/2017 à 10:57, Mireille Louys a écrit :

Dear DM and Time Domain followers, 

I am trying, together with my CDS colleagues, to recap the various DMs available in the IVOA and understand the possible links between the future Time Series Model (as sketched in Jiri's Note) and existing DMs like ND-Cube and STC 2.

Here is a graph proposed by Laurent Michel to clarify the links in 3 main parts:

*	DataSetMetadata DM, which has the main ObsDataset class,
*	ND-CubeDM, which defines a SparseCubedataset,
*	TimeSerieCubeDM, which highlights the special properties of a Cube depending on a Time axis.

I think it is essential to highlight the inheritance path between these 3 DM building blocks:
a TimeSeriesCube <is a> NDCubeDM::SparseCubeDataset
a NDCubeDM::SparseCubeDataset <is a> DatasetMetadataDM::ObsDataset
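
A minimal Python sketch of this inheritance path follows. The class bodies are placeholders and the constructor signatures are assumptions; only the <is a> chain and the dataproduct_type attribute come from the discussion.

```python
class ObsDataset:
    """Placeholder for the ObsDataset class of the dataset metadata DM."""

    def __init__(self, dataproduct_type):
        self.dataproduct_type = dataproduct_type


class SparseCubeDataset(ObsDataset):
    """Placeholder for NDCubeDM::SparseCubeDataset <is a> ObsDataset."""


class TimeSeriesCube(SparseCubeDataset):
    """Placeholder for TimeSeriesCube <is a> SparseCubeDataset."""

    def __init__(self):
        # Assumption: a time series cube always advertises this type.
        super().__init__("timeseries")


ts = TimeSeriesCube()
print(isinstance(ts, SparseCubeDataset), isinstance(ts, ObsDataset))
print(ts.dataproduct_type)
```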

ObsDataset has a dataproduct_type attribute, which makes it possible to discover all data products of type 'timeseries'.
This provides the container object for time-dependent data.

If we need to select timeseries data products according to some properties extracted from their data, we can:
 - reuse what ObsCore DM provides to describe general axis properties:
target_name, s_region, s_resol, t_min, t_max, t_resol, em_min, em_max, em_resol, etc. are the basic properties for discovery

 - provide a richer description of the TimeAxis and ObservableAxis.
For that, extracting a statistical profile from the data contained in the Cube could do the job.
This means accessing and analysing the Data part in ND-Cube, i.e. the ND-Points gathered in a SparseCube object.
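
As a sketch of what such a statistical profile could look like for the time axis, the Python below derives a few summary values from a list of ND-points. The sample points, the `time_axis_profile` helper and the `n_points` and `median_step` fields are assumptions made for illustration; only t_min and t_max reuse ObsCore names.

```python
from statistics import median

# Hypothetical ND-points, each holding one value per axis.
nd_points = [
    {"time": 10.0, "flux": 1.2},
    {"time": 11.0, "flux": 1.4},
    {"time": 13.0, "flux": 1.1},
    {"time": 14.0, "flux": 1.3},
]


def time_axis_profile(points):
    """Summarise the time sampling; assumes at least two points."""
    times = sorted(p["time"] for p in points)
    # Intervals between consecutive time stamps.
    steps = [later - earlier for earlier, later in zip(times, times[1:])]
    return {
        "t_min": times[0],
        "t_max": times[-1],
        "n_points": len(times),
        "median_step": median(steps),
    }


profile = time_axis_profile(nd_points)
print(profile)
```

The same pattern could be applied per observable axis, which would keep the statistics attached to an axis rather than to individual points.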

I guess more properties can be exposed to qualify the axes present in the Timeseries dataset, but for the moment I see some overlap of notions between
CharacterisationDM::ObservableAxis, STC2.0::CoordMeasurement (??) and TimeSerieCubeDM::CubeAxis.

It would be great if we could sort this out,
but for now I would appreciate your feedback on the attached diagram, in order to proceed on the data model structure.

Cheers, Mireille ( after discussions together with Laurent, François, Ada) 

-- 
--
Mireille Louys
CDS                                            Laboratoire Icube 
Observatoire de Strasbourg     Telecom Physique Strasbourg
11 rue de l'Université         300, Bd Sebastien Brandt CS 10413             
F- 67000-STRASBOURG                    F-67412 ILLKIRCH Cedex
tel: +33 3 68 85 24 34

-------------- next part --------------
A non-text attachment was scrubbed...
Name: TimeSeriesCube DM detail.png
Type: image/png
Size: 48848 bytes
Desc: not available
URL: <http://mail.ivoa.net/pipermail/voevent/attachments/20170718/a6b1382e/attachment-0001.png>