<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<p>Dear colleagues,</p>
<p>Well, I am a little concerned about being listed as an "author" of a
proposal I do not fully agree with.</p>
<p>Let me try to explain why and propose a compromise.</p>
<p>* I fully agree with the definition, rationale, etc. It is real
progress to distinguish use cases where the "linked" item
has the same progenitor as the main item from use cases where
the linked item is simply a counterpart. <br>
</p>
<p>* The Gaia use case is one among several where providers
wanted to associate a record in a source catalog with a "dataset"
or "dataproduct".</p>
<p>* "sibling", however, seems to say much more than having been
prepared from the same original data (for a non-native English
reader, all the definitions, citations, etc. you can find speak
about "brother or sister"). To my eyes it means that they are also
of the same "type" or "species". I am reluctant to say that
a TimeSeries is the "sister" of a source record in a catalog. The
same would hold for an image or a spectrum. On the other hand,
record to record or image to image would be fine for sibling.</p>
<p>* Looking for a solution, I turned to the terms the
provenance data model proposes for these situations. I was looking
only at the names of relationships, not at the names of rich content
classes. I attach here a simple diagram view of the model. If the
main item is seen as an entity, the relationship to its
progenitor can be expressed in two ways: either through the activity
which generated it and used the progenitor entities, or directly (a
shortcut bypassing the activity) through a WasDerivedFrom relationship. The
good thing is that we already have "derived" as a
semantics term in DataLink, and we have the reverse, "progenitor".
Hence my proposal of "coderived", which has the advantage of
letting us ignore which activity generated the two related
"products".</p>
<p>* The granularity (i.e. how many steps we may have between
progenitor and products) is arbitrary and left to the choice of the
provider. <br>
</p>
<p>* Under these conditions, I think that sibling could be seen as a
child term of coderived. "sibling" would be a "coderived"
dataproduct of the same type as the main item. That way,
if you want to associate a spectrum with a record, it will
simply be "coderived", while if you want to associate an image with
another image produced in the same pipeline with different
parameters, it could be "sibling".</p>
<p>Cheers</p>
<p>François</p>
<p><img src="cid:part1.2DEBF4E9.C88024FF@astro.unistra.fr" alt=""></p>
<p><br>
</p>
<div class="moz-cite-prefix">On 22/05/2020 at 17:17, Markus
Demleitner wrote:<br>
</div>
<blockquote type="cite"
cite="mid:20200522151746.3ibst3x4awk6b3bv@victor">
<pre class="moz-quote-pre" wrap="">Dear TCG,
After a fairly long review, here's VEP-003 (#sibling in datalink) for
your review. According to the Vocabularies in the VO 2 WD, it is up
to the TCG to approve the new term -- or to send it back for further
discussion. It would be wonderful if we could pass a decision either
way at the next meeting, so, without further ado:
Vocabulary: <a class="moz-txt-link-freetext" href="http://ivoa.net/rdf/datalink/core">http://ivoa.net/rdf/datalink/core</a>
Author: François Bonnarel, Markus Demleitner, <a class="moz-txt-link-abbreviated" href="mailto:msdemlei@ari.uni-heidelberg.de">msdemlei@ari.uni-heidelberg.de</a>
Date: 2019-12-06
Supercedes: VEP-001
New Term: sibling
Action: Addition
Label: Sibling Data
Description: Data products derived from the same progenitor as #this.
This could be a lightcurve for an object catalog derived from repeated
observations, the dataset processed using a different pipeline, or the
like.
Used-in:
<a class="moz-txt-link-freetext" href="http://dc.g-vo.org/gaia/q2/tsdl/dlmeta?ID=ivo://org.gavo.dc/~?gaia/q2/199286482883072/BP">http://dc.g-vo.org/gaia/q2/tsdl/dlmeta?ID=ivo://org.gavo.dc/~?gaia/q2/199286482883072/BP</a>
This is GAVO's rendition of the Gaia DR2 epoch photometry, where
users retrieve a time series in a specific band; the time series
in the other bands are the siblings of that.
Rationale:
It is fairly common in complex pipelines that multiple data products
result from a single observation. Often, this is true even in a
single pipeline step, and hence the data products are not in a
progenitor-derivation relationship. Still, researchers will want to
know about these data products; for instance, while exploring a source
in Gaia, a quick way to access epoch photometry or the RP/BP spectra
is obviously valuable; such artefacts are not really progenitors of
the catalog entry, though. In such cases, #sibling (or perhaps one of
its future child terms) should be used.
Clients should offer #sibling links in a context of scientific
exploitation of the dataset (as opposed to, say, debugging).
Discussion:
In the discussion, the need for the concept as such ("other
things that were produced from the observations that led up to #this")
was not disputed, though the discussion was somewhat delayed by
an investigation of possible shortcomings in the datalink data model
(<a class="moz-txt-link-freetext" href="http://mail.ivoa.net/pipermail/dal/2019-December/008248.html">http://mail.ivoa.net/pipermail/dal/2019-December/008248.html</a>) and
whether additional cases should or should not be included in it
(<a class="moz-txt-link-freetext" href="http://mail.ivoa.net/pipermail/dal/2020-February/008262.html">http://mail.ivoa.net/pipermail/dal/2020-February/008262.html</a>).
However, the main points of contention were the choice of the term and
label ("sibling"). Objections included that astronomers might not
understand the provenance-inspired nomenclature, that a very rough
view of provenance must be adopted to actually talk about siblings
(because, really, #this and the #sibling items just share common
ancestors, not (necessarily) the parents), or that it is confusing to
define, say, a spectrum to be the sibling of a catalogue row
(<a class="moz-txt-link-freetext" href="http://mail.ivoa.net/pipermail/semantics/2020-May/002700.html">http://mail.ivoa.net/pipermail/semantics/2020-May/002700.html</a>).
Possible alternatives investigated include #see-also (which was
rejected as being too general), #co-generated (which was disliked
because the implication that the two artefacts were built at the same
time by the same processing step is even stronger than with #sibling),
and #coderived (which found wide acceptance but was strongly rejected by one
party arguing it would strongly distort the meaning of "derived").
In the end, #sibling was accepted as being acceptable and in use after
a splinter discussion during the May 2020 Virtual Interop.
Thanks,
Markus
</pre>
</blockquote>
</body>
</html>