[VEP-0001] DataLink semantics vocabulary enhancement proposal
Petr Skoda
skoda at sunstel.asu.cas.cz
Wed Oct 23 14:01:02 CEST 2019
I will now be quite provocative concerning the omnipotence of proper
semantics ;-)
All we are trying to do is to put intelligence into the VO tools: they
should be clever enough to prevent the user from doing wrong things (e.g.
sending a catalogue table to SPLAT). So the tool decides for him not to
allow this.
But it is always useful if you can switch off the (artificial)
intelligence and use a manual override (see autopilots in airplanes,
safety systems in cars, ABS ...). This is not currently the case with VO
clients: the decision algorithms are hidden, hard-coded deep in the tools
(without direct contact with Pierre, Mark and Margarida I would not have
been able to understand why certain tables did not do what I expected
after being received or sent by SAMP; the answer was always "I am looking
for a certain column name, ucd, utype ... in the table to decide what to
do").
But what if the user wants to display a catalogue of spectra, i.e. a
catalogue of objects where one column is the accref to the spectra? This
can easily be loaded instead of the internally generated results of an SSA
query (the difference is that the table may cover the whole sky, so the
spatial restriction imposed by the SSA query is not introduced, or the
spectra may be the result of machine learning giving their IDs).
Most of the problems I have had in my attempts to introduce real
interoperability (namely in experiments with interlinking time series,
spectra and images using SAMP) were due to the censorship that clients
apply to particular votables... So it required introducing hacks, e.g. to
allow SPLAT to accept votables from outside and plot them, or to force
Aladin to plot the coordinates in a votable received by SAMP (down to 2014).
In my extensive work with SAMP I have never used the Broadcast function,
and I doubt that there is a really working real use case for it (I mean
one useful for science analysis). The wish to have it working is the main
driver behind Markus's use case 2) of Datalink semantics - the SAMP case.
In practice the SAMP hub decides for you whether to send a particular
votable to a given client, so the sending application must use a
particular message (e.g. load spectrum). The destination client declares
that it can accept spectra.
A perfect idea. But in practice the scientist will use SAMP to transfer
data from one particular application to another particular target one. So
he builds a pipeline, chaining the given applications by SAMP.
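
To make the difference between broadcast and a targeted send concrete,
here is a toy model of subscription-based routing. It is only a sketch and
not a real hub: the class ToyHub and the client names are hypothetical,
wildcard subscriptions are ignored, and the subscriptions shown are
illustrative, not what any real tool declares. The MType strings
themselves are standard SAMP MTypes.

```python
class ToyHub:
    """Toy stand-in for a SAMP hub: it only tracks who subscribed to what."""

    def __init__(self):
        self.subscriptions = {}  # client name -> set of MTypes it accepts

    def register(self, client, mtypes):
        self.subscriptions[client] = set(mtypes)

    def broadcast(self, mtype):
        # The hub, not the user, picks the recipients from the subscriptions.
        return sorted(c for c, subs in self.subscriptions.items() if mtype in subs)

    def send(self, client, mtype):
        # Targeted send: the user names the recipient, but the message is
        # still refused if that client did not subscribe to the MType.
        if mtype not in self.subscriptions.get(client, set()):
            raise ValueError(f"{client} is not subscribed to {mtype}")
        return client


hub = ToyHub()
# Hypothetical clients and subscriptions, for illustration only.
hub.register("table_viewer", ["table.load.votable"])
hub.register("spectrum_viewer", ["spectrum.load.ssa-generic", "table.load.votable"])
hub.register("image_viewer", ["image.load.fits"])

spectrum_targets = hub.broadcast("spectrum.load.ssa-generic")
table_targets = hub.broadcast("table.load.votable")
```

In this model a pipeline is just a sequence of send() calls to named
clients; broadcast() is the case where the subscriptions alone decide who
receives the data.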
I would guess that I am not a typical SAMP user, trying to do what I was
showing in Paris (a spectrum together with an image of the object, plus
custom code to display a local picture of the spectrum, somehow
processed).
But even here I could not use the broadcast, as certain directions and
certain tables did not get through. I had to use the functionality of
TOPCAT (activation actions), which in fact transforms the data for
particular clients (including the browser).
In the SPLAT case it turned out to be easier to let the user decide how
to deal with a table received by SAMP (it depends on the use case): there
are switches in the menu. So now you can construct a table with certain
values in TOPCAT and send this table to SPLAT, where you say: interpret
the received table as a spectrum (or time series).
If the received table is wrong (not containing a spectrum), it simply
displays nonsense or nothing and shows a bug.
So it is the user's responsibility to decide how to interpret the tables
received by SAMP, and how to build the pipeline.
He has the right to do the wrong setup, and then he gets bugs or nonsense
out ...
The whole effort behind the semantics of the datalink end is the desire
to have the target identified before the target votable is opened.
Wouldn't it be better to have a clear dataproduct type written in the
VOTABLE itself, so that once the metadata of the votable is read, the
client can decide whether it knows how to interpret the content?
It would require some more communication, but isn't the votable designed
with a serialization that allows reading just the "preamble" while the
contents are still flowing ....?
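
As a rough sketch of what reading just the "preamble" could look like, the
following uses Python's standard incremental XML parser and stops as soon
as the DATA element starts. The INFO element named "dataproduct_type" is
an assumption made for this illustration only; VOTable does not currently
define such a declaration, and the function name is hypothetical too.

```python
import io
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def peek_product_type(stream):
    """Return the declared product type, or None if none appears before DATA.

    Assumption: the publisher writes <INFO name="dataproduct_type" value="..."/>
    somewhere ahead of the table data. This is not part of the VOTable standard.
    """
    for event, elem in ET.iterparse(stream, events=("start", "end")):
        name = local_name(elem.tag)
        if event == "start" and name == "DATA":
            return None  # metadata is over, stop before parsing any rows
        if event == "end" and name == "INFO" and elem.get("name") == "dataproduct_type":
            return elem.get("value")
    return None


SAMPLE = b"""<VOTABLE xmlns="http://www.ivoa.net/xml/VOTable/v1.3" version="1.3">
  <RESOURCE>
    <INFO name="dataproduct_type" value="spectrum"/>
    <TABLE>
      <FIELD name="wave" datatype="double"/>
      <FIELD name="flux" datatype="double"/>
      <DATA><TABLEDATA><TR><TD>6562.8</TD><TD>1.0</TD></TR></TABLEDATA></DATA>
    </TABLE>
  </RESOURCE>
</VOTABLE>"""

declared = peek_product_type(io.BytesIO(SAMPLE))
```

The parser reads the stream in chunks, so a little more than the preamble
may be fetched, but the rows themselves are never parsed once the function
has returned.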
Everyone publishing a data table in the VO knows well what its contents
are, so why not describe individual tables semantically there .... (I am
an image, I am a spectrum ....)
So if I want to attach a link to other datasets, I do not describe their
nature. Only when the link is used is the header of the table loaded, and
the client says: I will not deal with this (in the ideal case it will show
a window saying: Sorry, the table you want to load is an IMAGE - I do not
handle images ...)
The client can then say the same for any arbitrary type: what you want to
download is XXXX (e.g. a power spectrum) - I am not able to handle it.
But the user may switch on a button which will try to display it anyway.
Then a new specialized application (a period analyzer) can handle power
spectra, so it will not complain and will display the power spectrum
properly (but it decides only after reading the preamble of the votable at
the end of the link).
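
The behaviour described here (refuse by default, but let the user force
it) fits in a few lines. This is only a sketch of the idea; the function
decide() and its arguments are hypothetical and do not correspond to any
existing client.

```python
def decide(declared_type, supported_types, force=False):
    """Return (load, message) for a table whose preamble declares its type.

    declared_type   -- what the table says it is (e.g. "image"), or None
    supported_types -- set of types this client knows how to display
    force           -- the user's manual override switch
    """
    if declared_type in supported_types:
        return True, f"Loading {declared_type}."
    if force:
        # The user takes the responsibility: try anyway, maybe get nonsense.
        return True, f"Table declares {declared_type!r}; trying to display it anyway."
    return False, (
        f"Sorry, the table you want to load is {declared_type!r} - "
        "I do not handle it."
    )


spectrum_tool = {"spectrum", "timeseries"}
refused = decide("image", spectrum_tool)
forced = decide("image", spectrum_tool, force=True)
accepted = decide("spectrum", spectrum_tool)
```

A period analyzer would simply list "power spectrum" among its supported
types and so never reach the refusal branch for such tables.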
Of course this is not an optimal solution, but I just want to show the
practical side of working with the VO .... to let the user bear the
responsibility for what happens. He may find a nice tricky usage, doing
something inconceivable during the design phase.
---------------------------------
Concerning the timeseries-of-someproduct
There are two sorts of time series, as said:
Either it has its own data model and is properly described by a
dataproduct type (e.g. a lightcurve in a 2-column table, or wrapped by the
spectral data model), so it is loaded as a whole
and the client needs to understand its content,
Or it is a simple set of other products (images, spectra) which have
some associated variable (called time), which may be anything with the
important property that it can be ordered in an increasing or decreasing
sequence.
Then the client should allow the user to select by which variable (and
in which direction) the series will be shown.
So instead of saying "plot me stacked spectra in increasing order of the
variable HJD or JD (or ISO timestamp)" you can say "in increasing order of
circular phase" (usually implied). For example look at
https://wiki.ivoa.net/internal/IVOA/InterOpOct2008DAL/stelSSAcutout.pdf
slide 3: the left image is a spectral series folded by circular phase
(corresponding to the given period), from 0 to 2 (just a cutout of a
certain spectral line). This is common in asteroseismology.
The right image is ordered by the difference in time from some reference
date.
In the case of images you can say: make an animation, with frames ordered
by that variable.
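
As a small worked example of ordering the same set of products by
different variables: the circular phase of a time t for period P and
reference epoch t0 is ((t - t0) / P) mod 1. The product list, the period
and the epoch below are made-up numbers, and the function names are
hypothetical.

```python
def fold_phase(t, t0, period):
    """Circular phase in [0, 1) of time t for the given period and epoch t0."""
    return ((t - t0) / period) % 1.0


def order_series(products, variable, descending=False):
    """Order a set of products by any associated variable, in either direction."""
    return sorted(products, key=lambda p: p[variable], reverse=descending)


# Made-up series of spectra, each with an associated time (JD minus an offset).
spectra = [
    {"id": "s1", "jd": 100.5},
    {"id": "s2", "jd": 101.5},
    {"id": "s3", "jd": 103.0},
    {"id": "s4", "jd": 104.25},
]

T0, PERIOD = 100.0, 2.0  # assumed reference epoch and period
for s in spectra:
    s["phase"] = fold_phase(s["jd"], T0, PERIOD)

by_time = [s["id"] for s in order_series(spectra, "jd")]
by_phase = [s["id"] for s in order_series(spectra, "phase")]
by_time_reversed = [s["id"] for s in order_series(spectra, "jd", descending=True)]
```

The same order_series() call serves for stacked spectra and for the frame
order of an image animation; only the chosen variable changes.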
How to work with time series of datacubes I have no idea (except if you
apply some slicing/cutouts and get a series of spectra or images ...)
Sorry for the detour from the practical discussion of how to name the
proper target links in the DL table ....
Petr
*************************************************************************
* Petr Skoda Phone : +420-323-649201, ext. 361 *
* Stellar Department +420-323-620361 *
* Astronomical Institute CAS Fax : +420-323-620250 *
* 251 65 Ondrejov e-mail: skoda at sunstel.asu.cas.cz *
* Czech Republic skoda at asu.cas.cz *
*************************************************************************