about #calibration (VEP-006) : ----> IMPORTANT for DataLink EXTENDED USAGE
BONNAREL FRANCOIS
francois.bonnarel at astro.unistra.fr
Mon Oct 4 16:24:21 CEST 2021
Markus, all,
I am widening the audience of this discussion to the DAL mailing list
because I think it is really important for extending the usage of DataLink.
Few people have really discussed this VEP, and I don't think all
those who did supported Markus's point of view. So DataLink
implementors and users really have to participate. There is surely a way
to change the "semantics" vocabulary that will encompass all
points of view. To achieve that we have to widen the perspective.
The "semantics" FIELD in the DataLink response qualifies the
relationship between the item identified by the ID value (however
this item was discovered) and the target of the link.
It is intended to help (DataLink client) software take appropriate actions.
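As a minimal sketch of what "helping client software take actions" could mean, a client can map the value of the "semantics" FIELD to a handler. The row layout, the URLs and the action texts below are assumptions made for illustration only; only the term names (#this, #progenitor, #calibration) come from the vocabulary under discussion.

```python
# Hypothetical sketch: choose a client action from the "semantics" value
# of one DataLink row. The action texts are illustrative, not standardized.
ACTIONS = {
    "#this": "download the dataset itself",
    "#progenitor": "offer the data this dataset was derived from",
    "#calibration": "offer calibration data related to the dataset",
}


def action_for(row):
    """Return the action a client might take for one DataLink row (a dict)."""
    return ACTIONS.get(row["semantics"], "show as a generic link")


# Invented example rows; real rows come from a DataLink response table.
rows = [
    {"ID": "ivo://example/obs1",
     "access_url": "http://example.org/obs1.fits",
     "semantics": "#this"},
    {"ID": "ivo://example/obs1",
     "access_url": "http://example.org/flat.fits",
     "semantics": "#calibration"},
]
for row in rows:
    print(row["semantics"], "->", action_for(row))
```

Note that in this sketch the client cannot tell from "#calibration" alone whether the calibration data was applied or is applicable, which is the question raised in this thread.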
General statement : I think IVOA vocabularies are not lists of isolated
terms.
And this is not only the case for UCDs, with their standardized rules for
writing and combination, but also for simple IVOA lists of terms.
And within simpler vocabularies like the DataLink "semantics" one, it
holds not only inside a tree but also between trees. Terms have
to be consistent, and changes should not break anything without proposing
a solution.
Vocabularies behave like "systems" with internal relationships and
interactions. Similarities and differences between use cases have to be
taken into account.
That's why I am paying attention to semantics, and especially to the
DataLink vocabularies, also because I think this has an impact on radio
astronomy services (to which I pay particular attention).
I will answer some of Markus's points below and after that I will extend
the discussion.
Despite what you wrote to me in a very long and very harsh private email,
I don't want to block anything, Markus. I just want to widen the
discussion to everyone who should have a look.
On 17/09/2021 at 11:29, Markus Demleitner wrote:
> François,
>
> On Thu, Sep 16, 2021 at 12:07:29PM +0200, BONNAREL FRANCOIS wrote:
>> On 15/09/2021 at 16:50, Markus Demleitner wrote:
>> In VEP-006 the new definition moves from "use case A" to "use case B"
>> (calibration stuff we want to apply to #this) and leaves "use case A" orphaned !!
> Perhaps, but that's easily solved once it actually turns up: We'll
> just add another term (in case there are good reasons that
> re-labeling #progenitor as "Part-of-Provenance" won't work, that is:
> otherwise we don't even need a new term).
>
> But for now, nobody wants to publish such datalinks, and so there's
> really no reason to delay VEP-006 because of this concern, and anyway
> it's largely unrelated.
#calibration has existed since 2014 and apparently nobody tried, or
managed, to implement it before you implemented it in GAVO one year ago,
with the meaning "calibration applicable".
Where I differ from your point of view is that we have to leave open the
use case where #calibration is "already applied".
I think a great many services publish only calibrated data in their ObsTAP,
SIA, or other DAL services; that's the most usual use case. Giving access
to progenitor and calibration stuff is interesting for at least two
reasons : quality checking, and reprocessing with same material/other
software.
In that case we are not facing calibration applicable, but calibration
applied.
The fact that nobody has used it YET doesn't mean we don't have to take it
into consideration with the same attention we pay to "applicable".
Objectively the two use cases exist, and I know several groups in radio
astronomy who are considering giving access to progenitors and related
material.
>
>> So my proposal is to modify VEP-006 and tackle both use cases. Can we combine
>> terms in the semantics field ?
>>
>> Can we have a single #calibration branch for calibration stuff and combine
>> it with a relationship term like "#applied", #applicable ?
> As explained several times before: No. We simply cannot have a
> concept that is partly Part-of-Provenance (a concept I insist is
> useful) and partly not: That would simply break the semantics of
> rdfs:subPropertyOf.
>
> And as usual there's nothing as practical as a good theory, as what
> you'd do then...
Yes, this would probably be the consequence if we wanted to add a head
term above calibration. But up to now calibration was itself a head term,
whatever meaning we gave to it.
>> Instead of having #calibration_applicable and #calibration_applied (and
>> children) as terms to check in the vocabulary list for the client, we would
>> have #calibration;#applied and #calibration;#applicable. And there the client
>> has to check a combination of two terms available in the vocabulary list.
> ...will immediately blow the tree structure that's really the only
> actual application for the semantics field that we have at this
> point (at least as far as I can see).
>
> Very concretely: Where would your #applicable sit in the trees I'm
> showing in the datalinks at
> <http://dc.g-vo.org/static/datalinks.shtml>?
Well, UCDs work like this and they still build trees. The main term
would still be calibration and would be the only one used for tree building.
But as you will see below, I now propose another solution.
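To make the combination idea concrete, here is a small sketch of how a client could split a combined value and still build the tree from the main term alone. The ';' separator mimics the UCD convention; the relation terms "#applied" and "#applicable" are only the proposed ones and do not exist in the vocabulary today, and the PARENT table is a tiny hand-written excerpt, not the real vocabulary file.

```python
# Hypothetical sketch of the UCD-like combination "#calibration;#applied".
# PARENT maps child -> parent for a small excerpt of the tree.
PARENT = {"#bias": "#calibration", "#dark": "#calibration", "#flat": "#calibration"}
# Proposed relation terms (an assumption of this sketch, not existing terms).
RELATIONS = {"#applied", "#applicable"}


def parse_semantics(value):
    """Split a combined value into (main term, relation or None)."""
    parts = [p.strip() for p in value.split(";")]
    main = parts[0]
    relation = parts[1] if len(parts) > 1 else None
    if relation is not None and relation not in RELATIONS:
        raise ValueError(f"unknown relation term: {relation}")
    return main, relation


def ancestors(term):
    """Walk up the tree using the main term only."""
    chain = [term]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


main, relation = parse_semantics("#flat;#applied")
print(ancestors(main), relation)
```

The tree is built from the first term only, so the relation term never needs a place in the tree; whether that is acceptable for rdfs:subPropertyOf is exactly the objection quoted above.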
>
> You'd be breaking the main use case to annotate links that in 10
> years of datalink nobody has found reason to create -- that's
> definitely not a good deal.
Well, 2014-2021 is 7 years, not ten.
And from 2014 to 2020 the other kind of link (calibration applicable)
was apparently not implemented either.
And I don't know of other implementations.
This doesn't prove these TWO use cases will not be important in the
future, just that people have priorities and cannot build everything at
the same time.
>
>
>> Is that something that developers of clients could admit ?
> That's, by the way, mainly a DAL (and perhaps Apps) thing, and it
> hasn't found traction there either when it was proposed. For good
> reasons: As I'm arguing above, it's totally unclear what the
> semantics of a semantics column used in such a way would be. How
> would clients use such annotation?
>
>
> Sigh... I know I'm sounding like a broken record, but: Let's solve
> the problems we actually have *now*. Try to build some grand
> description of the works, try to solve many problems at once, and
> we'll never get anywhere.
That's not MANY problems. #calibration had and still has two possible
meanings for astronomers and hence two possible behaviors for client
software.
How do we solve that ?
>
> So:
>
> (a) What problem we *actually have* with #calibration and children (in
> *existing* datalink documents) is *not* solved with VEP-006?
>
> (b) Is there some *computer* operation that was previously possible
> that is made impossible by VEP-006?
>
> If the answers to (a) and (b) are None (or "None, but I have all these
> other ideas that we could also discuss at the same time"), then let's
> please just move on.
This is the point where I strongly disagree. Let's look at the
vocabulary with a wider perspective.
>
> You know, this is really a minor change for a term we likely
> wouldn't even have if we hadn't just taken semantics out of thin air
> when we started datalink (and instead had added them as we went).
>
> And we've been discussing it now for more than a year (date on
> VEP-006: 2020-09-09). Granted, there have been two improvements in
> the meantime, so it wasn't all wasted time.
>
> But going back to deeply disrupting proposals one year into the
> process, proposals on top that were discussed and rejected multiple
> times in the past, obviously don't solve the problem we're trying to
> solve and on top attempt to solve a (different) problem we don't even
> have at this time is, excuse me for being blunt, frustrating.
Apart from me, at least two or three people objected to VEP-006 as it is
stated now.
>
> Come on, François, give your heart a shove and just make your peace
> with VEP-006. It's sane, doesn't damage anything, and solves an actual
> problem we've inherited from the olden days.
>
> And if people really start pushing out datalinks that are
> "Calibration applied", let's quarrel on whether or not we need to fix
> anything around #progenitor. Then. Not now. VEP-006 simply is
> totally unconnected with that discussion.
>
>
> Thanks,
>
> Markus
>
> PS: Incidentally, please edit the subject lines to at least not quote
> the wrong VEP (as here, where François had VEP-007). This will help
> later when people browse the mailing list archives.
What kind of solutions can we find for the two use cases issue ?
Note that there are new things here (marked with ++ in front).
1 ) the duplicated tree solution :
calibration (as redefined in the current VEP-006) with children
dark, flat, bias, etc. Note the evolution I make here by giving up the
"-applicable" suffix !!!
and calibration-applied (as defined before) with children
dark-applied, flat-applied, bias-applied, etc.
These are really two parallel sequences in which we have to
duplicate every sub-term of calibration (for example "photstandard" may
come). It's not very elegant and suggests there is some kind of
combination active behind the scenes.
2 ) the UCD-like combination solution : as explained in my previous email.
Although it could work, I admit it is a major evolution of
DataLink which may have other consequences to be considered carefully.
++ 3 ) the "relaxed" or fuzzy solution.
Some people suggested (in private, or on the list but a long
time ago) that #calibration and children should be valid for both use
cases (calibration material applied or applicable).
The argument for that is that DataLink should not care about the
calibration status of datasets; this is not what it was intended to do.
I'm personally reluctant, because client software would have
to use some other information to know what to do with this #calibration
material. In the most general case we don't know where the client can
pick up this information.
++ 4 ) the "two columns" solution (also after some private discussion):
calibration (intended as applicable) and
calibration-applied are the two relationship terms. So they are the only
semantics terms.
#bias, #dark, #flat (and other future calibration data)
are rather terms giving the intrinsic type of these calibration data.
They are actually special types of observations done
in a specific way.
So they could be described as such using the new
"content_qualifier" column currently discussed for DataLink 1.1.
So to fully describe the link we need the "semantics"
FIELD AND the "content_qualifier" one.
++ 5 ) In a private discussion Pat suggested that we adopt the
definition of VEP-006 and describe the two possible use cases in a
usage pattern document.
Calibration (applicable) would be straightforward, and
calibration applied would be rendered by a recursive usage of DataLink.
The #progenitor link in the DataLink response of the calibrated data
itself links to .... a DataLink document .....
....which further links to the progenitor itself
(#this) and to the #calibration data attached to it (#calibration
applicable to the progenitor !!)
To me, this seems to work when we are sure that the only
#calibration data attached to #progenitor are those that have already
been applied (and are still applicable),
but it may be ambiguous in some use cases.
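The recursive pattern of solution 5 can be sketched as follows. This is only an illustration under stated assumptions: DOCS stands in for HTTP access and maps a links URL to the rows of its DataLink response, all URLs are invented, and the function name is hypothetical. The content type string is the one DataLink uses to mark a link whose target is itself a DataLink document.

```python
# Hypothetical sketch of solution 5: "calibration applied" rendered by a
# recursive use of DataLink. DOCS replaces real HTTP requests.
DATALINK_TYPE = "application/x-votable+xml;content=datalink"

DOCS = {
    "http://example.org/links?ID=calibrated": [
        {"semantics": "#this",
         "access_url": "http://example.org/calibrated.fits",
         "content_type": "application/fits"},
        {"semantics": "#progenitor",
         "access_url": "http://example.org/links?ID=raw",
         "content_type": DATALINK_TYPE},
    ],
    "http://example.org/links?ID=raw": [
        {"semantics": "#this",
         "access_url": "http://example.org/raw.fits",
         "content_type": "application/fits"},
        {"semantics": "#calibration",
         "access_url": "http://example.org/flat.fits",
         "content_type": "application/fits"},
    ],
}


def applied_calibrations(links_url):
    """Follow #progenitor links that point to DataLink documents and collect
    the #calibration links found there, read as 'applied' to the dataset."""
    found = []
    for row in DOCS[links_url]:
        if row["semantics"] == "#progenitor" and row["content_type"] == DATALINK_TYPE:
            for sub in DOCS[row["access_url"]]:
                if sub["semantics"] == "#calibration":
                    found.append(sub["access_url"])
    return found


print(applied_calibrations("http://example.org/links?ID=calibrated"))
```

The sketch simply assumes that every #calibration link attached to the progenitor was applied; nothing in the response itself says so, which is where the ambiguity comes from.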
Best regards
François