<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;" class="">Hi Markus,<br class=""><div><br class=""><blockquote type="cite" class=""><div class="">On 2021-03 -26, at 15:30, Markus Demleitner <<a href="mailto:msdemlei@ari.uni-heidelberg.de" class="">msdemlei@ari.uni-heidelberg.de</a>> wrote:</div><br class="Apple-interchange-newline"><div class=""><div class="">Hi Paul,<br class=""><br class="">On Fri, Mar 26, 2021 at 02:21:07PM +0000, Paul Harrison wrote:<br class=""><blockquote type="cite" class="">here are my thoughts on this, and derive from my definition of of the terms.<br class=""><br class="">* Fundamentally a “calibrator” is not a progenitor, but it is a<br class="">modifier of the progenitor(s) to create the progeny<br class=""></blockquote><br class="">Hm -- that's certainly not what our current definition says:<br class=""><br class=""> data resources that were used to create this dataset (e.g. input<br class=""> raw data)<br class=""><br class=""></div></div></blockquote>….<br class=""><blockquote type="cite" class=""><div class=""><div class=""><br class="">But in general: When considering VEPs (or assigning terms), don't<br class="">look at the term literals ("what's behind the hash"). They, in the<br class="">end, talk to the computer.<br class=""></div></div></blockquote><div><br class=""></div>Ok, but I think the slight problem is that progenitor and calibrator are already fairly precise</div><div>technical terms in english, and that’s going to drive an individual's ’natural’ interpretation quite strongly.</div><div><br class=""></div><div>I think that the sentiment that I expressed was really the same as Mireille </div><div><span style="white-space: pre-wrap; background-color: rgb(255, 255, 255);" class=""><br class=""></span></div><div><span style="background-color: rgb(255, 255, 255);" class=""><span style="white-space: pre-wrap;" class="">—8<-----</span></span></div><div><span style="white-space: pre-wrap; background-color: rgb(255, 255, 255);" class="">One more reason : </span></div><div><pre style="background-color: rgb(255, 255, 255); word-wrap: break-word; white-space: pre-wrap;" class="">#progenitor should be reserved to designate the data in transformation through various steps within a pipeline.
this applies to the data stream...
calibration, configuration, parameter sets have a distinct nature with respect to the data processing.
The two categories should not be mixed, in my view. </pre><div class=""><span style="white-space: pre-wrap; background-color: rgb(255, 255, 255);" class="">—8<</span><span style="white-space: pre-wrap;" class="">——---</span></div><div class=""><br class=""></div><div class="">However, from now on let’s just consider them “classifier terms” and not worry about their exact definition in natural english for the rest of the email.</div><div class=""><br class=""></div></div><div><blockquote type="cite" class=""><div class=""><div class="">So, you're basically proposing in addition to the four options at the<br class="">foot of<br class=""><br class=""> <a href="http://mail.ivoa.net/pipermail/semantics/2021-March/002778.html" class="">http://mail.ivoa.net/pipermail/semantics/2021-March/002778.html</a><br class=""><br class="">to have <br class=""><br class="">(e) declare that #progenitor and #calibration are disjunct by<br class="">changing #progenitor's definition to "it's only... hm... 'science<br class="">data'", and any sort of calibration data is outside of the provenance<br class="">chain.<br class=""><br class=""></div></div></blockquote><div><br class=""></div><div>No - I would go for a modification of your option b) and add another child of #progenitor, perhaps #antecedent - though in natural english</div><div>I think that they are virtually exact synonyms - that expresses that the file is a direct “less processed data” #progenitor in the sense of my distinction that #calibration is a modifier of rather than a “direct ancestor”, so that #calibration and #antecedent are disjunct.</div><div><br class=""></div><div>I think that is a fairly ‘backwards compatible’ change. </div><div><br class=""></div><blockquote type="cite" class=""><div class=""><div class=""><blockquote type="cite" class="">* the distinction between raw and calibrated data is made with<br class="">another piece of metadata.<br class=""></blockquote><br class="">I, frankly, don't think that distinction is even possible, as one<br class="">analysis' calibrated data is the next analysis' raw data (if you, for<br class="">the sake of this argument, identify "raw" with "earlier in the<br class="">processing chain" and "calibrated" with "later in the processing<br class="">chain").<br class=""></div></div></blockquote><div><br class=""></div>well agreed, but I think that the part of VEP6 wording that prompted me into this assertion is the suggestion</div><div><br class=""></div><div>-8<---</div><div><pre style="word-wrap: break-word; white-space: pre-wrap;" class=""> This VEP tries to make it clear that the
"has been used" interpretation is for #progenitor, wheras #calibration
is for "can be used".</pre><div class="">-8<—</div><div class=""><br class=""></div><div class="">I think it is better that the link tagging reflects the “kind” of the data resource rather than its state, and it is the mixing of the</div><div class="">two that is causing problems. </div><div class=""><br class=""></div></div><div><br class=""><blockquote type="cite" class=""><div class=""><div class=""><br class=""><blockquote type="cite" class="">I think that this is sufficient, because, as has been pointed out<br class="">earlier in this thread, in order to really redo calibration with<br class="">most modern instrumentation would require more detailed provenance<br class="">anyway.<br class=""></blockquote><br class="">But that is not what the use case is here. The use case is to<br class="">filter out rows you're not interested in in various situations (I've<br class="">called the situations "use" vs. "debug", and I'm still convinced<br class="">these *are* different situations).<br class=""><br class=""></div></div></blockquote><br class=""></div><div>I still think that the basic classification that you want in this use case is between the #antecedents and the</div><div>#calibration - the data provider could be offering alternative #calibration that was not actually used to create #this and whether it was actually used needs some orthogonal metadata.</div><div><br class=""></div><div>Of course it is possible to have #alternative-calibration as a tag, but then you run into unwanted repetition of child categories #alternative-flat etc. which makes it an ugly way to go.</div><div><br class=""></div><div>I will write this up into an alternative VEP</div><div><br class=""></div><div>Cheers,</div><div><span class="Apple-tab-span" style="white-space:pre">        </span>Paul.</div><div><br class=""></div><br class=""></body></html>