Time Domain DAL/DM/TDIG
Petr Skoda
skoda at sunstel.asu.cas.cz
Tue Jul 11 18:22:12 CEST 2017
Hi all,
I could not resist commenting on this interesting discussion, especially
after spending two days at the EWASS exoplanet session, giving a talk on
local radio about exoplanets and preparing a foreword for a book about
exoplanets ;-) I had to study the current state of the art.
But first a comment on KDD involvement:
I am sure that Kai will not comment on metadata. As I know from many of
our discussions, he is not interested AT ALL in any metadata about the
object. The only important item is some ID of the given feature vector.
The re-assignment of an interesting feature vector back to physical reality
is not in the domain of KDD, but a logical continuation of scientific work.
The only real interest from the KDD point of view is a statistical
characterization (namely uncertainties) of the input data -- and Kai has
started to think in probability distribution functions (PDFs) - so the task
of the KDD IG is to think about how to replace a catalogue number (+ stat
error) with a full PDF. It is also a requirement of LSST.
Our TS data model is prepared for it (some LSST people have seen it and
agree - of course it will still be discussed).
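
To illustrate why a full PDF can carry more than a catalogue number (+ stat
error), here is a minimal Python sketch. It is not taken from the TS data
model: the function names, the grid and the two-peaked example distribution
are all assumptions made up for illustration.

```python
import math

def gaussian(x, mean, sigma):
    """Normal probability density at x."""
    z = (x - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def tabulate_pdf(pdf, lo, hi, n):
    """Sample pdf on n evenly spaced points and normalise it to unit area."""
    step = (hi - lo) / (n - 1)
    xs = [lo + i * step for i in range(n)]
    ps = [pdf(x) for x in xs]
    # Trapezoid rule for the area under the sampled curve
    area = sum(0.5 * (ps[i] + ps[i + 1]) * step for i in range(n - 1))
    return xs, [p / area for p in ps]

def mean_and_sigma(xs, ps):
    """Collapse a tabulated PDF back to the usual 'value + stat error'."""
    step = xs[1] - xs[0]
    weights = [step] * len(xs)
    weights[0] = weights[-1] = 0.5 * step  # trapezoid rule end points
    mean = sum(x * p * w for x, p, w in zip(xs, ps, weights))
    var = sum((x - mean) ** 2 * p * w for x, p, w in zip(xs, ps, weights))
    return mean, math.sqrt(var)

def two_solutions(x):
    """Hypothetical parameter with two equally likely explanations."""
    return 0.5 * gaussian(x, 2.0, 0.3) + 0.5 * gaussian(x, 6.0, 0.3)

xs, ps = tabulate_pdf(two_solutions, -2.0, 10.0, 1201)
mean, sigma = mean_and_sigma(xs, ps)
density_at_mean = two_solutions(mean)   # density at the "catalogue value"
density_at_peak = two_solutions(2.0)    # density at one of the real peaks
print(f"value = {mean:.2f} +/- {sigma:.2f}")
```

In this made-up case the single "value" falls between the two peaks, where
the density is practically zero, so the number + error summary points at a
solution that the full PDF excludes.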
But the most crucial part of the exoplanet business is that no data are
certain (with few exceptions). So far most data have been obtained from
Kepler (about 4500 exoplanets) and a lot of data is still not processed. The
transit method relies on Bayesian analysis of extremely noisy time series,
which almost always yields multiple explanations. Many discoveries are
accelerated by the aim to be first to announce an exoEarth, which IMHO makes
some data about exoplanets more speculative than other fundamental
observations of the universe ...
So neither planetary characteristics (rocky, gaseous, mass, diameter,
orbital and rotation period, chemical composition ...) nor the existence of
multiplanetary extrasolar systems can be taken for sure and catalogued as
one commonly accepted truth.
Moreover, the number of confirmed exoplanets (about 3500) is still very low
and can fit in an Excel spreadsheet. Such tables will surely be updated
often.
I consider it nonsense to speculate about a metadata structure for
hierarchical storage of planetary system parameters associated with every
star's light curve!
In SSAP there is a class and subclass parameter - but for many use cases
it is not filled. And even here (e.g. LAMOST) it is dangerous to claim that
the object is a QSO or a STAR for sure, and the spectral classification -
as M star, B star etc. - is definitely always wrong for a considerable part
of the objects. But it is helpful to have such a rough classification
originating from pipelines available in the object catalogue ....
So I would suggest concentrating ONLY on observable characteristics of a
light curve (but also spectra).
Coordinates are good; however, for a lot of objects (namely exoplanets)
they may not be known - they are kept secret and strongly protected until
the authors are sure. So you will hardly find the HD number of the star -
like HD20794 - or another publicly known catalogue designation (Gliese 667,
581); people always talk about Corot 7, Kepler 11, 22, 62 (and planets like
62e, 62f), Trappist 1g etc ...
In addition, most of the coordinates of exoplanets used to be erased from
FITS headers, as well as the exact time of observation (MACHO) (or it was
e.g. rounded to the full minute or even hour). In spectra from OHP Sophie
the time information was intentionally modified in spectra of
exoplanet-hosting stars.
In addition, the exact position is also difficult to state for multiple
stars on the slit of a spectrograph or a double blob on a CCD frame - so for
the most interesting objects the coordinate is not sufficient to identify
the object (even for binary stars - not only exoplanets).
Concerning characteristics of activity - it's another problem, but a key
problem for the future ... most stars have stronger activity, either time
variability or RV variability, than the expected ExoEarth signature.
So I can imagine the planet hunter wants to query the database to select
calm, low-activity G2 class main sequence stars (or e.g. M red dwarfs) to
have new candidates to investigate ...
But the activity index is shaky, e.g. how to describe solar activity?
What about rapid flare outbursts observed (but only with difficulty) on some
late-type stars... What about Be stars and QSO outbursts?
If we return only to observable data:
We may think about amplitude, periodicity, min/max value etc ...
So again statistical parameters. But the period - typically many periods
are associated with one star....
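
As a rough sketch of what such purely observable statistical parameters
could look like, here is a small Python example. The function name and the
choice of peak-to-peak amplitude are my assumptions, not anything agreed in
the data model.

```python
import statistics

def describe_light_curve(values):
    """Return simple observable statistics of a list of measurements."""
    if not values:
        raise ValueError("empty light curve")
    lo, hi = min(values), max(values)
    return {
        "min": lo,
        "max": hi,
        "amplitude": hi - lo,  # peak-to-peak; an assumed definition
        "mean": statistics.fmean(values),
        "median": statistics.median(values),
        # Sample standard deviation needs at least two points
        "stdev": statistics.stdev(values) if len(values) > 1 else 0.0,
    }

# Made-up magnitudes, only to show the call
stats = describe_light_curve([10.0, 12.0, 11.0, 9.0])
```

None of these numbers needs a classification or a physical model of the
object, only the measured values themselves.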
There is one issue I have been thinking about for many years already (I
think since the 2007 IVOA in Cambridge, when I saw the first idea of time
series in the VO, perhaps from Roy Williams):
In fact the usage of time series is twofold: something (magnitude, RV,
flux ...) depending on time, which is a product of pipeline processing, and
a periodogram, filtered by various methods, from which the periods are
derived. And for a lot of research goals it is important to fold the light
curve with given periods - it's the daily bread of variable star
researchers ....
So the important question is the storage of an arbitrary number of periods
in a light curve, and the client which will do the folding ...
Then the data are represented as dependent not on time but on circular
phase (typically extended from 0 to 2, but also in different combinations).
So important metadata for the light curve might also be the method used for
the period estimate ...
(similar to the method of planet discovery...)
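
The folding a client would have to do can be sketched in a few lines of
Python. This is only an illustration: the PeriodEntry record, the method
labels and the sample data are hypothetical, and the epoch of phase zero is
simply assumed to be 0.

```python
from dataclasses import dataclass

@dataclass
class PeriodEntry:
    """One of possibly many periods attached to a light curve (hypothetical)."""
    period: float  # in the same time unit as the light curve
    method: str    # free-text label of the period estimation method

def fold(times, values, period, epoch=0.0):
    """Return (phase, value) pairs sorted by phase, with phase in [0, 1)."""
    if period <= 0:
        raise ValueError("period must be positive")
    folded = []
    for t, v in zip(times, values):
        phase = ((t - epoch) / period) % 1.0
        if phase >= 1.0:  # float rounding can give 1.0 for tiny negative input
            phase = 0.0
        folded.append((phase, v))
    return sorted(folded)

def extend_to_two_cycles(folded):
    """Repeat the folded curve so that the phase runs from 0 to 2."""
    return folded + [(phase + 1.0, v) for phase, v in folded]

# Made-up light curve with two stored periods
times = [0.0, 0.5, 1.25, 2.75]
mags = [10.0, 10.4, 10.1, 10.3]
periods = [
    PeriodEntry(1.0, "hypothetical periodogram A"),
    PeriodEntry(0.5, "hypothetical method B"),
]
folded_by_period = {p.period: fold(times, mags, p.period) for p in periods}
two_cycles = extend_to_two_cycles(folded_by_period[1.0])
```

The light curve itself stays untouched; each stored period (with its
method) just gives a different phase view of the same data.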
But in general I think that the light curve should contain just the data
obtained and the rest should be in some catalogue, and combining TAP and a
sparse cube client should allow most use cases ....
(as shown by Laurent)
Petr
I fully agree with Matthew
On Mon, 10 Jul 2017, Matthew Graham wrote:
>
> Object classification is the endpoint of a significant scientific
> workflow and will not necessarily be available for most time series.
> So these have to be optional and not really in a minimal list.
> I think we need to be quite careful here about feature creep from
> specific scientific subdomains because that makes things easier for
> them.