Time Series Cube DM - IVOA Note

François Bonnarel francois.bonnarel at astro.unistra.fr
Mon Mar 6 18:03:49 CET 2017


Dear Petr, dear all,

      If what we want is basically a scalar observable dependent on Time 
and maybe some other physical axis, then the table serialization is indeed 
what we need.

We are facing what the VizieR group had to develop a couple of years ago 
to provide SEDs built from VizieR content (see the attached VOTable).
We can indeed imagine that an NDPoint with 2 or more axes (time + flux, 
or any other observable + ...) is a good representation of each table row.
Appropriate utypes from the ND Cube data model, the Photometry data model, 
etc. should be sufficient to define the role of each FIELD in the table.
Each NDPoint could also have additional "attributes" (VOTable FIELDs) to 
link to progenitor datasets or metadata. We just have to agree on the 
right attribute names for these linking features. I don't agree with the 
idea of seeing those as some special kind of "axes".
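
For illustration only, a minimal sketch (Python with astropy; the column
names, UCDs and URLs are invented placeholders, not agreed annotations) of
such a table: two measurement FIELDs forming the NDPoint, plus one linking
"attribute" per row:

    from astropy.io.votable.tree import VOTableFile, Resource, Table, Field

    votable = VOTableFile()
    resource = Resource()
    votable.resources.append(resource)
    table = Table(votable)
    resource.tables.append(table)

    table.fields.extend([
        Field(votable, name="obs_time", datatype="double", unit="d",
              ucd="time.epoch"),                # time axis
        Field(votable, name="flux", datatype="double", unit="mJy",
              ucd="phot.flux.density"),         # observable axis
        Field(votable, name="progenitor", datatype="char", arraysize="*",
              ucd="meta.ref.url"),              # linking "attribute"
    ])
    table.create_arrays(2)
    table.array[0] = (57754.10, 1.2, "http://example.org/img/1.fits")
    table.array[1] = (57754.20, 1.3, "http://example.org/img/2.fits")
    votable.to_xml("ndpoint_timeseries.xml")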

       To be clear: I like the prototype. But I think it should be 
done with a little bit less complexity (and I'm quite confident it could be).
Cheers
François
On 03/03/2017 at 02:07, Petr Skoda wrote:
>
>
> Hi all,
>
> Jiri is leaving for a holiday and he could not watch the discussion as 
> he was quite busy, and I have just arrived after a month of travelling...
>
> I am not sure if Jiri will come to Shanghai; I will try to.
>
> I would just like to explain some issues without going into details ...
>
> What is described is implemented in DaCHS, and we have adapted SPLAT-VO 
> to work with it - so it shows light curves from our OSPS survey from 
> the Danish 1.54 m telescope in Chile. (I have shown it several times 
> already at ADASS, Interops and ASTERICS .., but with forced SSAP.) Now it 
> works the same way on the client side, using the new data model and an 
> obscore query (also a new window in SPLAT-VO) .... there are a lot of 
> issues to solve, but basically it works.
>
> The advantage of representing everything as a table is the possibility 
> to send a light curve to TOPCAT and work with individual points (every 
> corresponding "column" may be activated - so we get e.g. the original 
> image, or its cutout, from which the particular stellar aperture was 
> integrated to give the point on the light curve). If you send this to 
> Aladin, it starts to download the image .... thanks to Pierre's 
> modification from the end of 2015.
>
> The main goal was to allow associating multiple links and multiple 
> pieces of metadata with every point. LSST plans to add to every point 
> the whole probability distribution function or a complex statistical 
> description.
> This is possible in our model.
>
> One important issue is the definition of time series.
>
> I had explicitly stated that a time series is anything which has 
> at least one time-dependent axis - in other words, this axis is a 
> FUNCTION of time f(t) ....
> The main idea is to have the possibility to mark as dataproduct type 
> TIMESERIES the Fourier spectrum, power spectrum, periodogram etc ...
> and to link them to the time series. But "function" also means that 
> time may be implied or even eliminated! A very important case is the 
> time axis replaced by the circular phase (folded with a given period).
> Or you may have (for machine learning) on the x-axis the histogram of 
> various time differences between individual points.
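
As an aside, the phase folding mentioned above is a one-liner; a minimal
sketch (plain numpy; the epoch t0 and the period are assumed known from a
period analysis) of replacing the time axis by circular phase:

    import numpy as np

    def fold(t, period, t0=0.0):
        """Map times t to circular phase in [0, 1), folded with the given period."""
        return np.mod((np.asarray(t, dtype=float) - t0) / period, 1.0)

    # e.g. fold the HJDs of a light curve with a 0.5-day period
    phase = fold([2457754.12, 2457754.40, 2457754.71], period=0.5, t0=2457754.0)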
>
> I would say that 90% of future usage for light curves will be 
> connected with period analysis or some advanced statistical analysis 
> (e.g. wavelet transforms, or even machine learning products such as 
> Gaussian mixture models - or associated multi-D errors).
>
> We have followed all available science use cases as collected by the CSP 
> (namely by Enrique, as cited) and tried to find some new ones not yet 
> mentioned.
>
> But our imagination was limited by the primary goal: to describe some 
> kind of linear structure (in machine learning terms, a 1D feature 
> vector) marking a single point with a value dependent on (a function of) 
> time, and, with every point, associated metadata or products of further 
> processing or analysis, or links to previous states of pre-processing, 
> up to the original data. In principle, the whole provenance of a single 
> point may be associated here.
>
> But this was the boundary of our mental concept.
>
> The idea was to give the community a simple idea of how to express the 
> wealth of transients, light curves and period analysis results, and how 
> to catalogue them.
>
> Our intention was not to describe the multi-D+1 datacube as a time axis 
> linked to multi-D datacubes. This would bring all the problems we had 
> seen with SIAP2 etc ...
>
> We also explicitly state that the physical domain of every axis is not 
> the subject of the proposal, and that the particular semantics attached 
> to a given domain are the task of other models.
>
> We do not solve this and we do not care .... The client will interpret 
> just what it understands - extending the knowledge about particular 
> contents may be done simply by adding a module implementing another model.
>
> Example (somewhat artificial, however...):
>
> The photometric filter will be described in the majority of input time 
> series by name - and it is the task of a filter profile service to find 
> the particular transmission curve using metadata referring to the 
> photometric system (or instrument).
>
> IMHO all users will appreciate it if the client labels multiple light 
> curves by the filter names and not by complex vectors .....
>
> If some advanced client knows the protocol, it may open a picture of 
> the transmissivity, but IMHO it is better to use SAMP, sending the 
> light curve to another client which will extract the links to the 
> filters and display them..
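
For reference, a minimal sketch of such a SAMP send (using astropy's SAMP
client; the table URL and name are placeholders, and a SAMP hub such as
TOPCAT must already be running):

    from astropy.samp import SAMPIntegratedClient

    client = SAMPIntegratedClient()
    client.connect()  # requires a running SAMP hub (e.g. TOPCAT)
    # Broadcast a light curve as a VOTable; each receiving client decides
    # what to do with the extra columns (filter links, progenitors, ...).
    client.notify_all({
        "samp.mtype": "table.load.votable",
        "samp.params": {
            "url": "http://example.org/lightcurve.xml",  # placeholder
            "name": "OSPS light curve",
        },
    })
    client.disconnect()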
>
>
> On Thu, 2 Mar 2017, François Bonnarel wrote:
>
>> Dear all,
>>
>>
>> Mireille Louys, Laurent Michel and I   discussed the TimeSeries Cube 
>> data model here in Strasbourg.
>>
>> Before going to serialization, we tried to go back to the basic concepts 
>> needed to represent TimeSeries, and to match them to the Cube data 
>> model as Jiri did (although we apparently end up differing).
>>
>>
>> In our approach, we focus on the time axis, considering it as 
>> generally irregularly sampled, in other words "sparse".
>>
>>
>> For each time sample we have a (set of) measurements, which may be 
>> one single flux (in the case of light curves) or whatever scalar 
>> value, but can also be an observational dataset spanning other data 
>> axes (spectrum, image, radio cube, velocity map....). Actually, for 
>> each time sample we have an ND cube (of whatever dimension, excluding 
>> time). And if a single data point, or single value (flux), can be 
>> seen as a degenerate case of an ND cube, then everything is a set of 
>> NDCubes for different time samples !!!
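
To make this picture concrete, a minimal sketch (plain Python/numpy; all
values and shapes are invented) of the "sparse time axis indexing NDCubes"
view, with the scalar flux as the degenerate 0-d case:

    import numpy as np

    # Each sample of the irregularly sampled ("sparse") time axis
    # indexes an ND cube of arbitrary dimension.
    time_series = [
        (57754.10, np.float64(1.2)),           # light-curve point: 0-d "cube"
        (57754.20, np.zeros(2048)),            # spectrum: 1-d cube
        (57754.35, np.zeros((512, 512))),      # image: 2-d cube
        (57754.50, np.zeros((64, 512, 512))),  # velocity map / radio cube: 3-d
    ]
    for t, cube in time_series:
        print(t, np.ndim(cube), np.shape(cube))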
>>
>>
>>     This concept allows us to describe light curves and time sequences 
>> of spectra, of 2D images, of (hyper)cubes.
>
> I am afraid that describing e.g. radio maps at multiple frequencies 
> repeated multiple times (at irregular intervals) is physically 
> feasible, but this would bring our model to the position of the 
> ALL-INCLUDING, all-VO-describing model of the Universe (and life etc ;-)
>
> Which is beyond my imagination (and implementability).
>
> I did not want at the beginning to immerse this model into the data 
> cube, but it was tempting (and Jiri convinced me that it can work after 
> he modified DaCHS, in collaboration with Markus, who is also guilty, as 
> he was the first to mention the Data Cube model at our hackathon in 
> Garching during the SCIOPS 2015 workshop).
>
>
>
>>
>>
>> By doing this we are not fully consistent with the ND Cube data model: 
>> we have something like a mixture between SparseCube and NDImage: the 
>> Time axis is sparse and each sample on the Time axis indexes an ND 
>> Cube. Could it be a third specialisation of a generic NDCube?
>
>
>>>
>>>     >   2) Interoperability
>>>
>>>     Interoperability is actually what this is about.  If we build
>>>     Megamodels doing everything, we either can't evolve the model or
>>>     will break all kinds of clients needlessly all the time -- 
>>>     typically, whatever annotation they expect *would* be there, but
>>>     because its position in the embedding DM changed, they can't find
>>>     it any more.
>
>>>     Client authors will, by the way, quickly figure this out and start
>>>     hacking around it in weird ways, further harming interoperability;
>>>     we've seen it with VOTable, which is what led us to the
>>>     recommendations in the XML versioning note.
>>>
>>>     If we keep individual DMs small and as independent as humanly
>>>     possible, then even if one has to be incompatibly changed, most
>>>     other functionality will just keep working and code won't have to
>>>     be touched (phewy!).
>
>
> This was our initial idea !!! Mainly with SPLAT-VO in mind (yes, 
> SPLAT-VO now understands time series).
>
>>>
>>>     I'd argue that by pulling all the various aspects into one
>>>     structure, we're following the God object anti-pattern
>>>     (https://en.wikipedia.org/wiki/God_object).
>
> Nice !!! The definition is exactly what most VO standards are about:
>
> "that knows too much or does too much"
>
> "its role in the program becomes God-like (all-knowing and 
> all-encompassing) "
>
>
>
>
>>>
>>>     I have to admit that I find the current artefacts for STC on
>>>     volute somewhat hard to figure out. But from what I can see, I'd be
>>>     unsure how that binding would help me as a client; that may, of
>>>     course, be because I've not quite understood the pattern.
>
> As I understand it, the coordinate system - or better, the space-time 
> coordinate system - is the most difficult and controversial part of 
> every VO DM.
>
> My naive view is that :
>
> The STC is required to be able to compare the position and time of 
> occurrence of some transient (e.g. a supernova) observed from a 
> satellite with the same place observed by a ground-based telescope 
> (e.g. for VOEvent). Then it is crucial to be able to convert all times 
> and coordinate systems into one unified system, as I will query 
> different databases, each with its own metadata for coordsys and units.
>
> But in the case of publishing time series, the main goal is to study 
> the temporal behaviour of some variable in the same coordinate and time 
> system. In fact the system is not important - it will only be mentioned 
> in the axis label (e.g. by name - HJD (see below ....) or on-board 
> satellite time ....) or in the legend (when comparing two stars - 
> names in the legend...)
>
> I suppose the full processing and transformation of the coordinate 
> system will be done during the data preparation phase, before 
> publishing ....
> A number of important time series are light curves folded with a given 
> period.
> This folding period is a label of the particular curve ....
>
> In all cases, what is presented is an already homogenized dataset which 
> would be printed in a publication.
>
> The issue with HJD (for Arnold..): As said, we are describing our 
> implementation for the DK154 survey. And here HJD is required by the 
> users, as it is the habit in the variable star community. The processing 
> pipeline outputs it, so it is here.
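
For completeness, a sketch of how such an HJD could be computed with
astropy (the target and the approximate La Silla coordinates are
illustrative, not the pipeline's actual code):

    import astropy.units as u
    from astropy.time import Time
    from astropy.coordinates import SkyCoord, EarthLocation

    # Illustrative values only: a target and an approximate observatory site.
    target = SkyCoord(ra=150.0 * u.deg, dec=-30.0 * u.deg)
    lasilla = EarthLocation.from_geodetic(lon=-70.74 * u.deg,
                                          lat=-29.26 * u.deg,
                                          height=2340 * u.m)

    t = Time(2457754.5, format="jd", scale="utc", location=lasilla)
    ltt = t.light_travel_time(target, kind="heliocentric")
    hjd = (t.utc + ltt).jd  # heliocentric Julian date (UTC-based)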
>
>
>
>>>     from.  What information, in addition to what you get from STC or
>>>     comparable annotation, does your code require, and is there 
>>> really no
>>>     other way to communicate it without having to have a hard link
>>>     between NDCube and STC (or any other "physical" DM, really)?
>
> Exactly - the STC is not the main visualizable part of the time series. 
> But it may be used when "clicking" on a particular point.
>
>
> I hope I have revealed the motivations of our effort and explained why 
> the current version is not suitable for expressing a whole ALMA 
> observation run ;-) as François is already thinking of ....
>
>
> But of course, any help is welcome !
>
> *************************************************************************
> *  Petr Skoda                         Phone : +420-323-649201, ext. 361 *
> *  Stellar Department                         +420-323-620361           *
> *  Astronomical Institute CAS         Fax   : +420-323-620250           *
> *  251 65 Ondrejov                    e-mail: skoda at sunstel.asu.cas.cz  *
> *  Czech Republic                             skoda at asu.cas.cz          *
> *************************************************************************

-------------- next part --------------
A non-text attachment was scrubbed...
Name: vizier_sed.xml
Type: text/xml
Size: 44940 bytes
Desc: not available
URL: <http://mail.ivoa.net/pipermail/dm/attachments/20170306/e71091e3/attachment-0001.xml>

