Time Series Cube DM - IVOA Note
Petr Skoda
skoda at sunstel.asu.cas.cz
Fri Mar 3 02:07:48 CET 2017
Hi all,
Jiri is leaving for a holiday and could not follow the discussion, as he
was quite busy, and I have just arrived after a month of travelling...
I am not sure whether Jiri will come to Shanghai; I will try to.
I would just like to explain some issues without going into details ...
What is described is implemented in DaCHS, and we have adapted SPLAT-VO to
work with it, so it shows light curves from our OSPS survey from the 1.5 m
Danish telescope in Chile. (I have already shown it several times at
ADASS, Interops and ASTERICS ..., but with forced SSAP.) Now it works the
same way on the client side using the new data model and an ObsCore query
(also a new window in SPLAT-VO) .... There are a lot of issues to solve, but
basically it works.
The advantage of representing everything as a table is the possibility to
send a light curve to TOPCAT and work with individual points (every
corresponding "column" may be activated), so we get e.g. the original image
or its cutout from which the particular stellar aperture was integrated to
give the point on the light curve. If you send this to Aladin, it starts to
download the image .... thanks to Pierre's modification from the end of 2015.
The main goal was to allow multiple links and multiple metadata to be
associated with every point. LSST plans to add to every point the whole
probability distribution function or a complex statistical description.
This is possible in our model.
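
As a rough illustration of the idea of attaching several links and extra
metadata to every point, here is a minimal sketch. The class name, the field
names, the link roles and the URLs are all hypothetical; they are not taken
from the note, from DaCHS or from SPLAT-VO.

```python
from dataclasses import dataclass, field


@dataclass
class LightCurvePoint:
    """One row of the time-series table (hypothetical layout)."""
    time: float                                    # time stamp of the measurement
    flux: float                                    # measured value at that time
    links: dict = field(default_factory=dict)      # role -> URL of a related product
    metadata: dict = field(default_factory=dict)   # further per-point description


def links_with_role(points, role):
    """Return (time, URL) pairs for every point that has a link with this role."""
    return [(p.time, p.links[role]) for p in points if role in p.links]


# Made-up example values and URLs, for illustration only.
points = [
    LightCurvePoint(2457000.5, 12.31,
                    links={"image": "http://example.org/img/1.fits",
                           "cutout": "http://example.org/cut/1.fits"}),
    LightCurvePoint(2457001.5, 12.28,
                    links={"image": "http://example.org/img/2.fits"},
                    metadata={"flux_pdf": [0.1, 0.8, 0.1]}),
]

cutouts = links_with_role(points, "cutout")
```

A client that only understands time and flux can ignore the other two fields,
while a more capable one can follow the links it recognizes.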
One important issue is the definition of time series.
I had explicitly stated that the time series is everything which has at
least one axis time-dependent - in other words this axis is a FUNCTION of
time f(t) ....
The main idea is to make it possible to mark the Fourier spectrum, power
spectrum, periodogram etc ... as dataproduct type TIMESERIES
and to link them to the time series. But the function also means that
time may be implied or even eliminated! A very important case is a time
axis replaced by the circular phase (folded with a given period).
Or you may have (for machine learning) on the x-axis the histogram of various
time differences between individual points.
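
To make these two cases concrete, here is a minimal sketch of folding time
stamps into phase and of building a histogram of time differences. The
function names and the example values are hypothetical, and taking "time
differences" to mean all pairwise differences is an assumption of this
sketch.

```python
from collections import Counter
from itertools import combinations


def fold(times, period, epoch=0.0):
    """Replace each time stamp by its phase (fraction of a period since epoch)."""
    return [((t - epoch) / period) % 1.0 for t in times]


def time_difference_histogram(times, bin_width):
    """Count pairwise time differences |t_j - t_i| in bins of width bin_width.

    Returns a dict mapping bin index k to the number of pairs whose
    difference falls in [k * bin_width, (k + 1) * bin_width).
    """
    counts = Counter()
    for t_i, t_j in combinations(times, 2):
        counts[int(abs(t_j - t_i) // bin_width)] += 1
    return dict(sorted(counts.items()))


# Made-up, irregularly sampled time stamps.
times = [0.0, 0.5, 2.25, 4.0]
phases = fold(times, period=2.0)
histogram = time_difference_histogram(times, bin_width=1.0)
```

In both cases the result no longer has time itself on the axis, but it is
still derived from the time stamps of the original series.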
I would say that 90% of future usage of light curves will be connected
with period analysis or some advanced statistical analysis (e.g. wavelet
transforms, or even machine learning products such as Gaussian mixture
models, or associated multi-D errors).
We have followed all available science use cases as collected by CSP
(namely Enrique, as cited) and tried to find some new ones not yet mentioned.
But our imagination was limited by the primary goal of describing some kind
of linear structure (in machine learning terms, a 1D feature vector) marking
a single point with a value dependent on (a function of) time, with
every point having associated metadata or products of further processing or
analysis, or links to previous states of pre-processing up to the original
data. In principle the whole provenance of a single point may be associated
here.
But this was the boundary of our mental concept.
The idea was to give the community a simple way to express the wealth of
transients, light curves and period analysis results and to catalogue them.
Our intention was not to describe the multi-D+1 datacube as a time axis
linked to multi-D datacubes. This would bring all the problems we had seen
with SIAP2 etc ...
We also explicitly state that the physical domain of every axis is not
the subject of the proposal, and the particular semantics attached to a given
domain are the task of other models.
We do not solve this and we do not care .... The client will interpret
just what it understands; extending the knowledge about particular
contents may be done just by adding some module implementing another model.
Example (somewhat artificial, however...):
The photometric filter will be described in the majority of input time
series by name, and it is a task for a filter profile service to find the
particular transmission curve using metadata referring to the photometric
system (or instrument).
IMHO all users will appreciate it if the client labels multiple light
curves by the filter names and not by complex vectors .....
If some advanced client knows the protocol, it may open the picture of the
transmissivity, but better IMHO will be to use SAMP and send the light
curve to another client which will extract the links to the filters and
display them.
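
The labelling by filter name mentioned above can be sketched as follows.
This is only an illustration: the row layout (time, magnitude, filter name),
the function name and the values are hypothetical and do not come from any
existing client.

```python
from collections import defaultdict


def group_by_filter(rows):
    """Split (time, magnitude, filter_name) rows into one light curve per filter."""
    curves = defaultdict(list)
    for time, magnitude, filter_name in rows:
        curves[filter_name].append((time, magnitude))
    return dict(curves)


# Made-up measurements in two filters, identified by name only.
rows = [
    (2457000.5, 12.31, "V"),
    (2457000.6, 12.90, "B"),
    (2457001.5, 12.28, "V"),
]

curves = group_by_filter(rows)
legend = sorted(curves)   # labels a client could show in the plot legend
```

The client needs nothing beyond the name to do this; looking up the actual
transmission curve is left to whatever understands the filter metadata.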
On Thu, 2 Mar 2017, François Bonnarel wrote:
> Dear all,
>
>
> Mireille Louys, Laurent Michel and I discussed the TimeSeries Cube data
> model here in Strasbourg.
>
> Before going to serialization we try to go back to the basic concepts needed
> to represent TimeSeries and try to match them to Cube Data model as Jiri did
> (although we apparently differ eventually)
>
>
> In our approach, we focus on the time axis considering it as generally
> irregularly sampled, in other words "sparsed".
>
>
> For each time sample we have a (set of) measurements, which may be one
> single flux (in the case of light curves) or whatever scalar value, but can
> also be an observations dataset spanned on other data axes (spectrum, image,
> radio cube, velocity map....) Actually for each time sample we have an ND
> cube (of whatever dimension excluding time). And if a single data point , or
> single value (flux) can be seen as a degenerate case of an ND cube then
> everything is a set of NDCubes for different time samples !!!
>
>
> This concept allows to describe Light curves, time-sequences of spectra,
> of 2D-images, of (hyper)cubes.
I am afraid that describing e.g. radio maps at multiple frequencies
repeated multiple times (at irregular intervals) is physically feasible,
but this would bring our model to the position of the ALL-INCLUDING
all-VO-describing model of the Universe (and life etc ;-)
which is beyond my imagination (and implementability).
I did not want at the beginning to immerse this model into the data cube, but
it was tempting (and Jiri convinced me that it can work after he modified
DaCHS in collaboration with Markus, who is also guilty as he was the first
to mention the Data Cube model at our hackathon in Garching during the
SCIOPS 2015 workshop).
>
>
> By doing this we are not fully consistent with ND cube data model : we have
> something like a mixture between SparseCube and NDImage : the Time axis is
> sparsed and each sample on the Time Axis indexes an ND Cube . It Could be a
> third specialisation of a generic NDCube ?
>>
>> > 2) Interoperability
>>
>> Interoperability is actually what this is about. If we build
>> Megamodels doing everything, we either can't evolve the model or will
>> break all kinds of clients needlessly all the time -- typically,
>> whatever annotation they expect *would* be there, but because their
>> position in the embedding DM changed, they can't find it any more.
>> Client authors will, by the way, quickly figure this out and start
>> hacking around it in weird ways, further harming interoperability;
>> we've seen it with VOTable, which is what led us to the
>> recommendations in the XML versioning note.
>>
>> Keeping individual DMs small and as independent as humanly possible,
>> even if one has to be incompatibly changed, most other functionality
>> will just keep working and code won't have to be touched (phewy!).
This was our initial idea !!! With mainly SPLAT-VO in mind (yes SPLAT-VO
now understands time series)
>>
>> I'd argue by pulling all the various aspects into one structure,
>> we're following the God object anti-pattern
>> (https://en.wikipedia.org/wiki/God_object
Nice !!! The definition is exactly what most VO standards are about:
"that knows too much or does too much"
"its role in the program becomes God-like (all-knowing and
all-encompassing) "
>>
>> I have to admit that I find the current artefacts for current STC on
>> volute somewhat hard to figure out. But from what I can see I'd be
>> unsure how that binding would help me as a client; that may, of
>> course, be because I've not quite understood the pattern.
As I understand it, the coordinate system, or better the space-time coordinate
system, is the most difficult and controversial part of every VO DM.
My naive view is this:
The STC is required to be able to compare the position and time of
occurrence of some transient (e.g. a supernova) observed from a
satellite with the same place observed by a ground-based telescope (e.g.
for VOEvent). Then it is crucial to be able to convert all times and
coordinate systems into one unified system, as I will query different
databases, each with its own metadata for coordsys and units.
But in the case of publishing time series the main goal is to study the
temporal behaviour of some variable in the same coordinate and time
system. In fact the system is not important: it will only be mentioned
in the axis label (e.g. by name - HJD (see below ....) or satellite on-board
time ....) or in the legend (when comparing two stars - names in the legend...).
I suppose the full processing and transformation of the coordinate system
will be done during the data preparation phase before publishing ....
A number of important time series are light curves folded with a given period.
This is a label of the particular curve ....
In all cases what is presented is an already homogenized dataset which would
be printed in a publication.
The issue with HJD (for Arnold..): as said, we are describing our
implementation for the DK154 survey. Here the HJD is required by users,
as it is the habit in the variable star community. The processing pipeline
outputs it, so it is here.
>> from. What information, in addition to what you get from STC or
>> comparable annotation, does your code require, and is there really no
>> other way to communicate it without having to have a hard link
>> between NDCube and STC (or any other "physical" DM, really)?
Exactly - the STC is not the main thing to visualize in a time series. But it
may be used when "clicking" on a particular point.
I hope I have revealed the motivations of our effort and explained why the
current version is not suitable for expressing the whole ALMA observation
run ;-) as Francois is already thinking of ....
But of course, any help is welcome!
*************************************************************************
* Petr Skoda Phone : +420-323-649201, ext. 361 *
* Stellar Department +420-323-620361 *
* Astronomical Institute CAS Fax : +420-323-620250 *
* 251 65 Ondrejov e-mail: skoda at sunstel.asu.cas.cz *
* Czech Republic skoda at asu.cas.cz *
*************************************************************************