-Moving forwards <--Re: about #calibration (VEP-006) : ----> IMPORTANT for DataLink EXTENDED USAGE

BONNAREL FRANCOIS francois.bonnarel at astro.unistra.fr
Thu Oct 14 10:47:48 CEST 2021


Hi Pat, all
On 13/10/2021 at 20:26, Patrick Dowler wrote:

>
> I agree with the intent of VEP-006 that #calibration and its child 
> properties should cover the "use this raw data" use case
> and that the current ambiguity will become an issue if not resolved.

As you know, I disagree that the previous definition was ambiguous (the 
tense was OK). But anyway, as far as we know, there was no working 
implementation yet in the VO landscape.

So, as I said yesterday during the TCG, I do not oppose the adoption of 
VEP-006 to tackle this "applicable" use case.

>
> I understand the "already applied" use case and that some providers 
> will want to provide links with the product. If that use case is 
> "debugging" or "QA" by a human, then the existing #auxiliary and a 
> decent description should suffice to support that. Providers can start 
> there and see what shortcomings they run into... I completely agree 
> with Markus that when someone actually tries to do this that we'll 
> find out whether the use case really is QA or something else and we 
> can't solve it until we find out what something else is. So, is
>
> #auxiliary description="flat field used to calibrate this image"
>
> sufficient to prototype the other use case(s)?
>
I don't think so. With my provenance co-author hat on, and given my radio 
astronomy interests and examples, I don't think it's a good idea to mix all 
those things into #auxiliary. As a very first argument, the semantics terms 
and their hierarchy make it possible to select links, and we would lose 
that by doing so.
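
The selection argument above can be illustrated with a minimal Python 
sketch. It is only an illustration under stated assumptions: the BROADER 
table is written by hand for this example rather than read from the 
datalink vocabulary, and the helper names (is_a, select) and the sample 
links are hypothetical.

```python
# Hand-written "narrower term -> broader term" table (an assumption for this
# sketch; a real client would read the hierarchy from the vocabulary itself).
BROADER = {
    "#flat": "#calibration",
    "#dark": "#calibration",
    "#bias": "#calibration",
}


def is_a(term, ancestor):
    """Return True if term equals ancestor or is narrower than it."""
    while term is not None:
        if term == ancestor:
            return True
        term = BROADER.get(term)
    return False


def select(links, ancestor):
    """Keep the links whose semantics fall under the given concept."""
    return [link for link in links if is_a(link["semantics"], ancestor)]


# Hypothetical rows of a DataLink response, reduced to two columns.
links = [
    {"semantics": "#flat", "description": "flat field applicable to this image"},
    {"semantics": "#auxiliary", "description": "flat field used to calibrate this image"},
    {"semantics": "#preview", "description": "quick-look JPEG"},
]

calibration_links = select(links, "#calibration")
auxiliary_links = select(links, "#auxiliary")
```

In this sketch the link tagged #flat is found through its parent 
#calibration, while the flat field filed under #auxiliary can only be 
recognised by a human reading its description.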

As I said yesterday, I will propose a new VEP for 
calibration-already-applied, and this will be an opportunity to discuss 
these things further. And I'm sure an implementation will come soon. We 
can create a directory in the dal-use-cases git repository to push these 
use cases.

By the way, the #progenitor definition fix has to be discussed further; 
this is VEP-009.

Cheers

François


> --
> Patrick Dowler
> Canadian Astronomy Data Centre
> Victoria, BC, Canada
>
>
> On Wed, 13 Oct 2021 at 08:48, BONNAREL FRANCOIS 
> <francois.bonnarel at astro.unistra.fr 
> <mailto:francois.bonnarel at astro.unistra.fr>> wrote:
>
>     Hi all, Markus
>     As I said, I am ready to let VEP-006 go because it's trying to solve a
>     real use case, provided nobody in the TCG is reluctant.
>     But I am confident that the other use case (applied) will come up very
>     soon, if it's not already somewhere in some DataLink response and not
>     noticed by our "radars"
>     (there is no possibility to check which terms are used in DataLink
>     services or capabilities at the registry level).
>     So I will have warned you: by not trying to find a consistent and global
>     solution now we may encounter problems in the near future.
>     However, I cannot keep silent when I read some inconsistencies in the
>     rationale. See below.
>     On 11/10/2021 at 14:19, Markus Demleitner wrote:
>     > François,
>     >
>     > On Mon, Oct 11, 2021 at 09:58:42AM +0200, BONNAREL FRANCOIS wrote:
>     >>> On Fri, Oct 08, 2021 at 07:06:31PM +0200, BONNAREL FRANCOIS wrote:
>     >>>> On 07/10/2021 at 15:24, Markus Demleitner wrote:
>     >>>>> Based on this, could you then explain as clearly and concisely as you
>     >>>>> can why VEP-006 impedes that use case?
>     >>>> A user discovers a calibrated image (HST, ESO, etc.). With DataLink
>     >>>> (#this or #preview) she has a look at the image and wants to see what the
>     >>>> uncalibrated data and the flat field looked like, to understand some of the
>     >>>> features. DataLink provides a link to the #progenitor and also (by some
>     >>>> record whose semantics can no longer be #calibration or #flat) to
>     >>>> the flat field, etc., used to calibrate this progenitor.
>     >>> ...but for this use case there is no need to distinguish between what
>     >>> you call a progenitor (i.e., non-calibration part of provenance) and
>     >>> calibration files applied.  Right?
>     >> Of course it's needed to make this distinction. Even to obtain the right
>     >> caption for the display.
>     > A datalink client will obviously take the caption from datalink's
>     > description field, no?  I frankly cannot see what role the semantics
>     > field could have in this.
>     >
>     > What else are you thinking of?  As I said, it helps to use the "A
>     > user wants... the computer does... using..." template when stating
>     > such things so other people can follow.
>     >
>     >> Not to speak about possible reprocessing
>     > I think we all agree that datalink metadata is *far* too weak to
>     > support this; I suspect even full provenance will not usually let a
>     > computer work out a reprocessing chain by itself.  You know, workflow
>     > engines are the complex beasts they are for a reason. So, datalink
>     > may help selecting artefacts mentioned in a provenance instance, but
>     > for that, a "Part-of-Provenance" concept is enough. Agreed?
>
>     About the two answers above
>
>     If the "description" display is enough for "applied", why is it not the
>     case for "applicable" (VEP-006 definition for #calibration)?
>
>     What more should the client do in the case of applicable than for applied?
>
>     Anyway, in general you have no idea what kind of process you will really
>     apply with these calibration data.
>
>     You write that reprocessing is too complex for datalink in the case of
>     "already applied", but I imagine it's exactly the same for applicable.
>
>     So if descriptions are enough, why don't we follow Laurent, Ada and many
>     others before them, relax the definition, and allow both use cases?
>
>     It is not my preferred solution, as you know, but it would be more
>     consistent with what you are writing there.
>
>     Cheers
>
>     François
>
>     >>> Plus: A client can already do that, no?  If you think not: What do
>     >>> you see missing?
>     >> What would be the semantics term able to drive that? Progenitor alone is
>     >> not: this, at least, has been discussed extensively (see references below)
>     > I do not see why it would not be.  "A user wants to debug a data
>     > product.  The computer takes all #progenitor links and displays them
>     > together with their descriptions, offering to download them for
>     > inspection or possible use with a full description of the provenance."
>     >
>     >>>> Client software is intended to display all these images (science and
>     >>>> calibration) together for checking and comparison. Moreover an advanced
>     >>>> version could propose some kind of reprocessing of progenitor.
>     >>> Not that that has any relationship to VEP-006 at all, but we have
>     >>> provenance for a detailed description of how the various pieces of
>     >>> the provenance chain play together; we certainly do not want to
>     >>> re-model that in the datalink vocabulary.  It's been complicated
>     >>> enough to do that modelling once.
>     >> Of course it is a very interesting use case of DataLink to provide a link
>     >> towards a full (or last step) IVOA provenance record.
>     > Yes.  But that doesn't mean we have to re-build provenance in
>     > datalink.  On the contrary: we can have a nice, clean separation of
>     > concerns, where datalink says how to get things and provenance says
>     > how they fit together.
>     >
>     >> What #calibration-applied provides is a kind of "poor-lady" provenance
>     >> which only links used datasets without any insight into the activity and
>     >> agents involved
>     >>
>     >> DataLink in itself has a poor but efficient way to characterize the relationship
>     >> between #this item and the target of the link
>     > Yes -- it's enough to filter links, which is what we want in datalink
>     > semantics.  And VEP-006 plus the current state does exactly this.
>     >
>     >>> Second, the current #progenitor is clear that if there were any
>     >>> "Calibration applied" links, they would be covered by its concept; see
>     >>> its description: "data resources that were used to create this
>     >>> dataset (e.g. input raw data)".  You may not like the concept or its
>     >>> label, but we have VEP-009 to discuss that.
>     >> Let's go back to VEP-009
>     > Sure, but can we *please* do that outside of the VEP-006 discussion?
>     > I'm very sure we're not yet proficient enough in this kind of
>     > discussion that we can have multiple of them at the same time.  And
>     > I think you still have not argued why VEP-006 and VEP-009 could not
>     > be treated separately, i.e., where my elaboration of how we still have
>     > all reasonable options even after accepting VEP-006 goes wrong.
>     >
>     >> Some references
>     >>
>     >> Paul Harrison May the 5th
>     >>
>     >> Mireille , March the 23rd
>     >>
>     >> Stephane Erard March the 25th
>     >>
>     > All these persons have been at meetings in the meantime, and (at
>     > least that's what I took away from these meetings) they were
>     > satisfied that their concerns were taken into account in the current
>     > form of VEP-006.
>     >
>     > Paul, Mireille, and Stéphane: If I'm misrepresenting you, please
>     > correct me.
>     >
>     >> Not to speak about the solution Pat proposed to me in a private email (see my
>     >> email last Monday for details). I have some concerns about it, but this is
>     >> the part I fully agree with
>     >>
>     >> Recursive usage of DataLink to provide both science data and
>     >> calibration-used data
>     >>
>     >> #progenitor link followed by #this link to get science data
>     >>
>     >> #progenitor link followed by #calibration to get calibration data associated
>     >> to these raw science data
>     > While I don't believe this belongs into a discussion of VEP-006, this
>     > is one reason why I'm rather skeptical of your VEP-009: With current
>     > #progenitor, the link from the reduced to the raw datalink document
>     > would reasonably be #progenitor.  With VEP-009, that is quite
>     > certainly no longer the case (unless your definition of "science
>     > data" took a surprising turn later).  But let's discuss that with
>     > VEP-009.
>     >
>     >> The consequence of this is that #progenitor itself are science data
>     > No, in that case it would be a mix of "science data" and all kinds of
>     > other things that went into the reduction.  Which, mind you, is fine,
>     > and I think it's the way such things should be done. It's just not
>     > within the concept your description in VEP-009 seems to try to
>     > define.
>     >
>     >>> If you disagree on this assessment: How would VEP-006 influence this
>     >>> deliberation?
>     >> VEP-006 is not proposing new terms; it's changing the definition of old terms
>     >> in a sense that calibration-applied is now forbidden.
>     > Well, the concept "pre-VEP-006-#calibration minus
>     > post-VEP-006-#calibration" apparently is not populated in the current
>     > VO, and as I argued two mails back, when members of that
>     > concept come around, all the options are still around whether or not
>     > we accept VEP-006; so, again, I don't see why we can't at least get
>     > VEP-006 off the table.
>     >
>     >
>     >>> If not, François, can you at least agree to: "I think VEP-006 is
>     >>> wrong, but I'll not veto it"?
>     >> Exactly this: if nobody is interested, I give up. But I think we will encounter
>     >> consistency issues in the near future if we don't discuss the consequences of
>     >> this major change of definition for #calibration.
>     > Can you speculate what consistency issues you expect to see?  Because
>     > you see, the only reason VEP-006 is there is that without it there's
>     > the problem that current #progenitor and #calibration have a nonempty
>     > intersection and a nonempty difference, which is really bad in a
>     > formal vocabulary of this sort.
>     >
>     > Now, since the only point of the VEP is to fix an inconsistency, it
>     > would defeat the purpose if new ones came up.
>     >
>     >
>     > Anyway, now that Françoise has chimed in -- what do we do?
>     >
>     > For me, it would still be helpful to see what problem you, François,
>     > are trying to solve or you, Françoise, see as well.  And please try
>     > to be as concrete as possible and to limit things to VEP-006.  And if
>     > you feel that is *really* impossible, at least make a strong and
>     > reproducible case for why we have to solve the question of the
>     > "Part-of-Provenance" concept together with it.
>     >
>     >                -- Markus
>
>
