<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body>
<p>Hi all, <br>
</p>
<p>I have reread the various inputs in this thread.<br>
I would like to summarize them and propose a way to harmonize the
various existing sets of terms into the new vocabulary while
keeping backward compatibility.</p>
<p>Sorry if this is a bit long. Just to recap, <br>
I understand there are several use cases:<br>
</p>
<p><b>1- Data discovery</b><br>
The ObsCore specification defines terms for the type of data
products resulting<br>
from one or multiple observations.<br>
Its main focus is data discovery across multiple archives and data
centers.<br>
It is mainly used with the TAP protocol.<br>
The terms defined in ObsCore 1.1 are:<br>
image, cube, spectrum, sed, timeseries, visibility, event and
measurements.<br>
<br>
The term 'catalog' was discarded because ObsCore does not target
the discovery of all-sky catalogs.<br>
Catalog services like VizieR, ESO, MAST, etc. serve source
catalogs as tables<br>
with a large variety of features that ObsCore does not describe in
its metadata profile.<br>
The term 'measurements' was selected to represent any measurements
derived<br>
from an observed image or cube by some processing, typically a
list of extracted sources.<br>
In a sense it corresponds to a source catalog restricted to the
field of view<br>
of a given observation.<br>
<br>
Planetary data has a wider set of data products to discover and,
as far as I understand,<br>
can gather different types of products within a 'granule'.<br>
EPN-Core has therefore defined a longer list of terms, together
with a concatenation mechanism.<br>
<br>
<b>2- </b><b>Specifying dataproduct types managed by services</b><br>
Registry entries for DAL services need to expose the type of data
products<br>
used or generated by a service.<br>
One example of a service described in the Registry is SPLAT, a VO
tool which can<br>
visualize curves as one or several functions of a 1D vector
holding<br>
time or spectral coordinates: frequency, wavelength, energy.<br>
For this, the ObsCore dataproduct_type labels can be reused: <br>
sed, spectrum, timeseries.<br>
<br>
A registry entry (service) dealing with catalogs (all sky)<br>
needs the "catalog" term.<br>
<br>
A service generating the HiPS data structure for an all-sky
catalog<br>
should mention the output's dataproduct_type: catalog, <br>
and the data structure: HiPS.<br>
<br>
<b>3- Designate the type of dataset pointed to via a datalink</b><br>
This is the nature of the data associated with a datalink entry:<br>
what is at the end of the link.<br>
For instance:<br>
MUSE IFU datacube --&gt; datalink / derived --&gt; source list,
dataproduct_type=measurements<br>
MUSE IFU datacube --&gt; datalink / derived --&gt; detection
probability map, dataproduct_type=image<br>
<br>
<b>4- Use the dataproduct type for building and sending a SAMP
message to an</b><br>
<b>appropriate VO tool</b><br>
S. Erard's message mentions this for EPNTAP.<br>
When Aladin overplots sources from a TAP query on top of an
image, the query<br>
response comes back as a table in which some columns are datalink
items.<br>
Here too the dataproduct type is needed to select the appropriate
API and<br>
send a SAMP message with the access URL for visualisation or
further analysis.<br>
<br>
To fulfill the four requirements above and take existing usage
into account,<br>
I propose that we define:<br>
- <i>'source list'</i> for sources extracted from one
or a small set of<br>
observations, like multiband images restricted to a region, a
radio cube, an<br>
event list, etc.<br>
- <i>'catalog'</i> for all-sky or multi-region source lists.<br>
This is already the term used at CADC, as Pat mentioned.<br>
We can derive these terms from 'measurements' in ObsCore for
compatibility.</p>
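<p>To make use case 4 concrete with the proposed terms, here is a
minimal sketch of how a client might pick a SAMP message from the
dataproduct type. The MTypes <i>image.load.fits</i> and
<i>table.load.votable</i> are standard SAMP ones, but the mapping
table, the function name and the assumption that images come as FITS
and tables as VOTable are mine, for illustration only.</p>
<pre>
```python
# Hypothetical mapping from dataproduct type to a SAMP MType.
# Assumption: images are served as FITS and tabular products as VOTable.
SAMP_MTYPE_BY_PRODUCT_TYPE = {
    "image": "image.load.fits",
    "measurements": "table.load.votable",
    "source list": "table.load.votable",  # proposed term
    "catalog": "table.load.votable",      # proposed term
}


def build_samp_message(dataproduct_type, access_url, name=None):
    """Return a SAMP message map, or None if no MType is known for this type."""
    mtype = SAMP_MTYPE_BY_PRODUCT_TYPE.get(dataproduct_type)
    if mtype is None:
        return None
    params = {"url": access_url}
    if name is not None:
        params["name"] = name
    return {"samp.mtype": mtype, "samp.params": params}
```
</pre>
<p>A client would then hand the returned map to whatever SAMP
library it uses; types without an entry (e.g. visibility) simply
yield no message in this sketch.</p>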
<p>The various vocabularies can then be organised as follows:<br>
- <i>IVOA dataproduct_type Vocabulary</i> extends the <i>ObsCore
dataproduct_type</i> definitions.<br>
- <i>EPNCore</i> reuses <i>IVOA dataproduct_type Vocabulary</i> concepts
with its own labels<br>
and its own concatenation rules. <br>
</p>
<p>I think it would help if each term in the IVOA dataproduct_type
Vocabulary had an extra attribute stating<br>
whether it is an original ObsCore term, a term derived from one, or
a new one.
<br>
This would make dependencies explicit when we consider adding new
terms later.</p>
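<p>As a rough sketch of what such an attribute could look like,
assuming a hypothetical <i>origin</i> field with the values
"obscore", "derived" and "new" and a <i>parent</i> link to the
broader term (neither exists in the current vocabulary), a client
could fall back to the ObsCore 1.1 term like this:</p>
<pre>
```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Term:
    label: str
    origin: str                   # hypothetical: "obscore", "derived" or "new"
    parent: Optional[str] = None  # label of the broader term, if any


# ObsCore 1.1 terms plus the two terms proposed in this message.
VOCABULARY = {t.label: t for t in [
    Term("image", "obscore"),
    Term("cube", "obscore"),
    Term("spectrum", "obscore"),
    Term("sed", "obscore"),
    Term("timeseries", "obscore"),
    Term("visibility", "obscore"),
    Term("event", "obscore"),
    Term("measurements", "obscore"),
    Term("source list", "derived", parent="measurements"),
    Term("catalog", "derived", parent="measurements"),
]}


def obscore_equivalent(label):
    """Walk up the parent links until an original ObsCore term is found."""
    term = VOCABULARY.get(label)
    while term is not None and term.origin != "obscore":
        term = VOCABULARY.get(term.parent) if term.parent else None
    return term.label if term is not None else None
```
</pre>
<p>With this layout an ObsCore 1.1 service that only knows
'measurements' can still be matched against a query for 'source
list', which is the backward compatibility I have in mind.</p>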
<p>Thanks for reading, and for any further comments. <br>
</p>
<p>Cheers, Mireille.<br>
</p>
<div class="moz-cite-prefix">On 18/03/2020 at 08:42, Markus
Demleitner wrote:<br>
</div>
<blockquote type="cite"
cite="mid:20200318074229.wmqqx7qxixdpgxax@victor">
<pre class="moz-quote-pre" wrap="">Dear François,
On Tue, Mar 17, 2020 at 09:41:42PM +0100, François Bonnarel wrote:
</pre>
<blockquote type="cite">
<pre class="moz-quote-pre" wrap=""> ObsCore/ObsTAP is for discovery of datasets which have time, spatial,
spectral and polarisation axes. Selection on the ObsCore parameters is not
sufficient for catalogs with plenty of other parameters which are directly
queried via TAP (or even in the registry). So we had to find another word
for these specific tables extracted from the datasets in order to not let
believe that ObsCore is a discovery model for any kind of catalog. Hence
"measurements"
</pre>
</blockquote>
<pre class="moz-quote-pre" wrap="">That reconfirms that the actual question is: What kind of catalog (or
whatever) would you include into the concept "Measurement", and what
would be out?
</pre>
<blockquote type="cite">
<pre class="moz-quote-pre" wrap=""> So I think we should keep "measurements" but not with the negative
definition "Generic tabular data not fitting any of the other terms.
Because
of its lack of specificity, this term should generally be avoided, and
new, more precise terms should be introduced instead" any of the others will
fit I think and yes we have to keep ascendant compatibility with obsCore.
</pre>
</blockquote>
<pre class="moz-quote-pre" wrap="">We'll still need to have a definition; while the term is just a label
and doesn't really matter, the definition delineates the concept, and
while it can later be adjusted to better fit the actual use, it needs
to say what entity is and, in particular, is not part of the concept.
Hence "Generic tabular data" is not a good definition, in particular
because at least spectra, time series, and events arguably are
tabular data and thus ought to be children of measurements. But
that, I'm sure, is not useful for what Measurements was introduced
for.
That's why I'm proposing this hedging language. Once we see what
people actually want to use this for (and why they want it), I'd
expect we can come up with a useful concept to slap the term
"Measurements" on.
On the other hand, if someone proposes a description that
* says something like "it's tabular data",
* keeps spectra, time series, and events out of the concept
* and still has a plausible, useful extension (i.e., there are
actually datasets that people will want to look for that are part
of the concept)
(are we agreed on something like these criteria?) I'll happily put it
in, of course.
</pre>
<blockquote type="cite">
<pre class="moz-quote-pre" wrap=""> We may imagine have "spectrochronogram" and "sed" as children
elements of spectrum. "timecube" or "spectralcube" as children from cubes.
This will be clear in the vocabulary page but ObsCore will manage
that at the same level in dataproduct_type (exactly like we already have sed
and spectrum in parallel today)
Thoughts ?
</pre>
</blockquote>
<pre class="moz-quote-pre" wrap="">The nice thing is that, if Vocabularies 2 pans out the way I hope it
will, we don't have to think about that now. As clients come along
that will want this kind of distinction, we can figure out what
structure best covers their needs.
Meanwhile, the vocabulary is clear that anything with 3 or more axes
is a cube, and clients looking for data with special axes ("time
cube", "spectral cube") can use the *xel columns. Whether other uses
will make further distinctions in the vocabulary desirable we can, I
claim, confidently leave to the future.
The problem at this point is non-image 2D data. If that's itching
someone *now*, let's have a separate thread.
</pre>
<blockquote type="cite">
<pre class="moz-quote-pre" wrap="">C) Miscellaneous.
If this vocabulary is to be used in various contexts (and indeed it
is) why do we link it to SimpleDALRegExt ? Vocabularies 2.0 is proposing to
manage vocabularies as endorsed notes. Why don't we do it this way and refer
it from SimpleDAL, ObsCore, DataLink, etc ...
</pre>
</blockquote>
<pre class="moz-quote-pre" wrap="">The reason I'd like to link vocabularies to a concrete REC is that
there should be some sort of RFC for them. It's conceivable to have
this RFC as part of an EN, and that might be what it takes of,
perhaps, the UAT or the object types.
But an extra document always is a liability (who maintains it? who
should read it? what, indeed, would it say in this case?). If we
can avoid it, we should.
As to citing a vocabulary later, I'm sure you only should say "The
vocabulary \url{<a class="moz-txt-link-freetext" href="http://www.ivoa.net/rdf/product-type">http://www.ivoa.net/rdf/product-type</a>}" or, if you're
against URLs in running text "The IVOA dataproduct type
vocabulary\footnote{\url{<a class="moz-txt-link-freetext" href="http://www.ivoa.net/rdf/product-type">http://www.ivoa.net/rdf/product-type</a>}}".
Which is a good point -- I'll add that to the Voc2 WD.
-- Markus
</pre>
</blockquote>
<pre class="moz-signature" cols="72">--
--
Mireille Louys, MCF (Associate Professor)
CDS                                IPSEO, Images, Laboratoire Icube
Observatoire de Strasbourg        Telecom Physique Strasbourg
11 rue de l'Université                300, Bd Sebastien Brandt CS 10413
F- 67000-STRASBOURG                F-67412 ILLKIRCH Cedex
Tel: +33 3 68 85 24 34</pre>
</body>
</html>