VEP-009
BONNAREL FRANCOIS
francois.bonnarel at astro.unistra.fr
Wed Jul 6 18:17:55 CEST 2022
Hi all,
This one is also still pending. Sorry for that.
I have left the whole March discussion below. See my new comments in the last
part!
On 21/03/2022 at 08:51, Markus Demleitner wrote:
> Hi,
>
> [limiting the distribution to semantics, as registry is unconcerned
> by this]
>
> On Fri, Mar 18, 2022 at 05:17:06PM +0100, BONNAREL FRANCOIS wrote:
>> That's a good summary of what we discussed last Monday.
>> My 2 additional cents. The use case I'm considering also comes from recent
>> discussions within ESCAPE about VO integration of gamma data.
>>
>> When we consider a so-called DL5 dataset (a gamma source spectrum, or a
>> gamma ray map) we know that this has been produced by complex processing of
>> event lists with the appropriate "Instrument Response function" (IRF).
>>
>> In a DataLink context could we use #progenitor for both?
>>
>> I think not, because the event list comes from the observation and will be
>> specific to these DL5 datasets.
>>
>> On the other hand, the IRF would be common to many sources or DL5
>> datasets, as far as I understand. So they need to be accessed differently.
> Hm -- why would a datalink client use different access methods
> depending on whether or not some artefact is shared between different
> observations? I mean, in all likelihood the access would be provided
> by a simple URL in either case, no? And even if different access
> modalities were necessary, how would that relate to the semantics
> column, which, at least so far, has nothing to say about access?
>
>> A smart client would have to treat these two "ancestors" of our DL5 dataset
>> in very different ways.
> Which different ways? For all I can see both links would simply be
> used when users try to figure out an oddity in the reduced data
> they're seeing (the "debug use case") -- in which case they'll need
> all progenitors. In general, for all I can see nobody has yet
> brought forward a (datalink) use case where a machine could work out
> that someone needs one thing but not the other.
>
>> My suggestion for the IRF semantic term is to simply use a new term #irf
> Frankly: I shudder to think how many terms we'd end up with if we go
> to that level of detail. But the first question, as usual, is: Why
> would *a machine* need to tell IRFs apart from, say, the background
> simulations in neutrino observations?
>
>> About the "sorting out" issue of #progenitor, #calibration-applied and #irf
>> for display purposes ...
>>
>> Having different semantics terms for those different concepts would allow
>> the client to display them in different sections of the tool.
> Sure -- but why would it want to? For all I can see, displaying "all
> items I have that help you debug the data set" in one place is what
> a client actually *should* do, so I'd say making it disperse these
> items has the smell of a bug.
>
>> Descriptions (even if they are well filled in) would never allow such
>> things to be separated automatically.
> True. This statement perhaps is an opportunity to put my request for
> a proper case for separating "science" and "calibration" data in
> another form.
>
> You see, if you want to automate something, you have to give an
> algorithm for how to get from a source state (in this case: a
> datalink VOTable) to a desired target state (in this case: a
> presentation of the links more easily digestible for a science user).
>
> Our vocabularies let us give such an algorithm -- "anything that's
> derived from this dataset goes in bin 1, anything it's derived from
> goes to bin 2, and stuff you need to make sense of it goes to bin 3.
> Bin 1 has label suchandsuch... organise the bins in a tree..." -- in
> a declarative form. But this only works if the data providers, when
> annotating their datalink tables, basically put themselves in the
> shoes of a machine doing this classification and assign the semantics
> based on what they find the algorithm's result should be.
Exactly what I have in mind (except that I would still split bin 2 into bin
2.1 = "observed response" and bin 2.2 = "response function").
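
The binning recipe described above, with bin 2 split in two, could be sketched roughly as follows. This is only an illustration under stated assumptions: the mapping of terms to bins, the bin labels, the helper name bin_links and the example URLs are all hypothetical, and #irf is the term proposed in this thread, not a term that exists in the vocabulary.

```python
from collections import defaultdict

# Hypothetical mapping from semantics terms to display bins.
# "#irf" is only the term proposed in this thread; the choice of
# "#documentation" for bin 3 is likewise an assumption.
BIN_FOR_TERM = {
    "#derivation": "1: derived from this dataset",
    "#progenitor": "2.1: observed response",
    "#irf": "2.2: response function",
    "#documentation": "3: needed to make sense of it",
}


def bin_links(links):
    """Group (access_url, semantics) pairs by display bin."""
    bins = defaultdict(list)
    for access_url, semantics in links:
        # Terms without a bin are kept visible instead of being dropped.
        label = BIN_FOR_TERM.get(semantics, "unclassified")
        bins[label].append(access_url)
    return dict(bins)


if __name__ == "__main__":
    # Made-up rows standing in for a datalink table.
    example_links = [
        ("https://example.org/events.fits", "#progenitor"),
        ("https://example.org/irf.fits", "#irf"),
        ("https://example.org/lightcurve.fits", "#derivation"),
        ("https://example.org/readme.txt", "#documentation"),
    ]
    for label, urls in sorted(bin_links(example_links).items()):
        print(label, urls)
```

With a single #progenitor term, the event list and the IRF in this example would both land in the same bin; the split into 2.1 and 2.2 is only possible when the semantics column carries two distinct terms.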
> Hence, we define the algorithm *for them*.
>
> What this means is that when you give or change a concept, you should
> be able to give something like an if clause that conceptually can be
> executed by a sufficiently sophisticated machine that would tell it
> which bin to put the link in. Since in the end the recipe is being
> executed by a human, a certain amount of handwaving is permissible,
> but I'm sure you cannot assume "science data" as such has an
> interoperable meaning, and hence you just have to explain what that
> is and how data providers can tell whether something is that or
> rather "non-science data".
Well, "science data" translates into source photon properties once the
instrument response function is taken into account.
If you don't like "science data" because you think the IRFs are also
"science data" in some sense, can we speak of "observed response" or
"sky-generated response"?
>
> Again, I think the best way to come up with this if statement is to
> figure out exactly *why* you would want to put different sorts of
> progenitors into different bins.
Well, see above; I think I have answered that several times.
Cheers
François
>
> -- Markus