VEP-009 [was: Re: Vocabulary construction principles]

Markus Demleitner msdemlei at ari.uni-heidelberg.de
Mon Oct 25 14:48:20 CEST 2021


Hi François,

On Thu, Oct 21, 2021 at 11:38:35AM +0200, BONNAREL FRANCOIS wrote:
> Le 18/10/2021 à 15:13, Markus Demleitner a écrit :
> > So we need to figure out where that reluctance comes from.  Preparing
> > for the VEP-009 discussion (but let's have VEP-007 before that), it
> > would already be useful if you could state what exactly it is you
> > don't like about #progenitor: Is it with the whole concept
> > "Part-of-Provenance" (and its pragmatics "show when debugging"), is
> > it the label "Progenitor" that itches you, or is it the identifier
> > #progenitor?
> 
> It would be strange to have an identifier different from the Label in that
> context.

Why?  Labels and identifiers are distinct for various good reasons,
and we already have some examples where they are different either by
design (the label didn't work as identifier even with reasonable
mogrification) or by history (as now with VEP-006, where the label
originally chosen just didn't work well).  Why should datalink
(assuming this is what you mean by "context") have particular trouble
with that?

Granted, datalink clients don't yet do the translation to proper
labels by and large.  But this was really hard before VocInVO 2,
and it has become really easy with desise.  I'm therefore confident
that the unfortunate habit of hurling identifiers at humans will be a
lot less common in the future.
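To illustrate how little code the translation takes, here is a minimal
sketch in Python.  It assumes a desise document is a JSON object with a
"terms" mapping from identifier to a record carrying a "label"; the
excerpt below is hand-written for illustration (its description string
is a placeholder, not the vocabulary's actual text), and label_for is a
hypothetical helper, not part of any existing library.

```python
# Hand-written excerpt imitating a desise vocabulary (assumed layout:
# a "terms" mapping whose entries carry a "label").  The description
# is a placeholder, not the real vocabulary text.
SAMPLE_DESISE = {
    "uri": "http://www.ivoa.net/rdf/datalink/core",
    "terms": {
        "progenitor": {
            "label": "Progenitor",
            "description": "placeholder description",
        },
    },
}


def label_for(semantics, vocabulary):
    """Return a human-readable label for a datalink semantics value.

    Accepts "#progenitor", "progenitor", or a full URI ending in
    "#progenitor".  Falls back to the raw value for unknown terms.
    """
    # Keep only what follows the last "#", i.e. the term identifier.
    term = semantics.rsplit("#", 1)[-1]
    entry = vocabulary["terms"].get(term)
    if entry is None:
        # Unknown term: showing the identifier beats showing nothing.
        return semantics
    return entry.get("label", term)


print(label_for("#progenitor", SAMPLE_DESISE))
```

A real client would load the vocabulary's desise JSON once and cache it
rather than embedding an excerpt like this.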

> The basic "pragmatics" of distinguishing progenitors (science data) from
> calibration is to allow all VO clients to sort out these things in different
> directions automatically.
> 
> Then each client (or human user) can do various things.

Hm... could you be a bit more specific on "different directions" and
"various things"?  I'm really sure we'll only find a good solution if
we have a clear idea of what we want to accomplish.  At least I don't
have that for VEP-009.

> By the way in the full IVOA provenance model the activity producing let's
> say, the exposed dataset is "using" other entities. This "usage"
> relationship has a "type" (see page 26 of the spec)
> 
> It clearly distinguishes the "Main" type from the "Calibration" type.
> 
> In dataLink we don't have activities yet and we simply bypass that activity
> to link an exposed dataset to "entities" used by an "unknown activity" to
> produce that.

Why would we even want activities in datalink when there's the
Provenance DM?  Me, I'd much rather we had a clear separation of
concerns here (as in: Anything in datalink's #progenitor is
Provenance's concern) -- and perhaps build a ProvDM-lite if we find
the full ProvDM to be a bit too much for the casual publisher.



Anyway, what I take away from this is that you basically agree a
concept "Part-of-Provenance" should be in Datalink.

Once we're there, there are basically two ways forward:

(a) we change #progenitor's definition to not be Part-of-Provenance
any more and create a new concept #part-of-provenance plus perhaps
#calibration-applicable

or

(b) we change #progenitor's label to "Part of Provenance" (perhaps)
and then wait to see whether there are actual use cases for distinguishing
different sorts of provenance items in datalink.

I'd say (b) we could do immediately.  For (a) -- and I'm guessing
you, François, will want that -- we'll clearly need to amend VEP-009
and add the new #part-of-provenance.  Also, the proposed definition for
#progenitor-new should give some testable criteria for whether or not
something is #progenitor-new; please review my original response,
http://mail.ivoa.net/pipermail/semantics/2021-July/002829.html, for
examples where that's not at all clear from the current definition
(GW simulations, flats entering into a superflat).

My suspicion is that all this will be a lot simpler once we actually
have some link that falls into current #progenitor but not into
your proposed #progenitor-new -- and having such a thing would also
mollify my concerns about not having an actual used-in for what
current VEP-009 has as a rationale.

Given we have two new VEPs in the pipeline that add terms that are
actually used already: François, would you be ok with hibernating
VEP-009 until there are links where it would be relevant and
discussing VEP-010 and VEP-011 first?

           -- Markus

