[Heig] Post running meeting thoughts
BONNAREL FRANCOIS gmail
francois.bonnarel at gmail.com
Fri Mar 20 00:12:21 CET 2026
Dear Ian,
Sorry, but I think there is a misunderstanding of my intentions in this
discussion. I will start by answering your conclusion before replying to
some details.
> Sorry, but we need to take full advantage of the flexibility
> provided by the ObsCore Recommendation as written to serve our science
> users our science data, based on our familiarity with how our science
> users want to access and use those data. If the IVOA is unwilling to
> support the needs of high-energy astrophysics, or at least of this
> very large HEA data provider, then I want to hear that stated directly
> and clearly by the IVOA Exec.
Two things:
First: I never, never claimed that data which your group, or Chandra,
considers important to expose should not be exposed in the VO. I thought
I had been clear on that several times. I only challenge the idea that
ObsCore is the right way to expose everything. I also proposed several
other fully VO-consistent solutions (two DataLink solutions,
multiple tables with joins) in my post 4 weeks ago on PR #35
(https://github.com/ivoa/HighEnergyObsCoreExt/pull/35). It would be good
to have your opinion about those.
Second: I am interested in this project and insist on giving some
advice because I think my experience in building several VO protocols
can be useful, as can my knowledge of how several major data
providers such as CADC, GAVO, VizieR and others have implemented them. I
hope the Exec will not consider that I am doing any harm by discussing the
best way to expose data.
A few more details below.
Le 02/03/2026 à 23:01, Dr. Ian N. Evans a écrit :
> Dear Francois,
>
> You are of course entitled to your opinion. However, what you are
> arguing is in my view inconsistent with the wording of the ObsCore
> Recommendation, Version 1.1.
>
> As stated in section “1. Introduction”, of the Recommendation “... The
> ability to pose a single scientific query to multiple archives
> simultaneously is a fundamental use case for the Virtual Observatory.
> Providing a simple standard protocol such as the one described in
> this document increases the chances that a majority of the data
> providers in astronomy will be able to implement the protocol, thus
> allowing data discovery for almost all archived astronomical
> observations.” That is exactly what we are proposing here, with the
> scientific data products required for high-energy astrophysics.
---> I think many of the products you store in your archive (as
presented in the document) are fundamentally not so far from "flat
fields", "dark images" or "psf" in optical astronomy, not to mention the
various auxiliary data in radio interferometry. I don't know of any
major service that exposes such response functions in the same service as
the main sky data. However, most of them provide ways to retrieve such
response functions or auxiliary data from the fundamental sky data. Even the
HESS prototype does not expose response functions as independent
products in ObsCore. Why would Chandra be so different that you need to
use the ivoa.ObsCore table to expose them instead of other related tables?
>
> Further, under section “2. Use cases”, the Recommendation states
> “Support any type of science data products (image, cube, spectrum,
> time series, instrumental data, etc.).” All of our data products
> satisfy this definition (and in fact instrument responses are a
> perfect example of “instrumental data”).
---> Our understanding of "instrumental data" was sky data at a very raw
level. And again, there is absolutely no practice in the VO of exposing
response functions in ObsCore.
>
> But you say “By science data we mean data where we can detect some
> information of interest coming from the sky.” Sorry, but YOU don’t
> get to tell US what constitutes OUR “science data”. Section “3.3.3.
> Observation and Observation Dataset” of the Recommendation states
> “exactly what comprises an “observation” is not well defined within
> astronomy and is left up to the data provider to define for their
> data.” for a reason. Science data products vary dramatically from
> waveband to waveband, and even within a waveband from instrument to
> instrument depending on the physical mechanism used by the detector.
> We consider instrument responses to be “science data” and very much
> part of the “observation dataset”.
---> OK, if you don't agree with my definition of science data, which was
based on what I have seen in ObsTAP services so far, let's talk about
"sky data" instead. Again, I am not arrogant enough to force you to
expose or not expose such and such data, but I think I can give advice on
where to put them, based on the practice that all the services have
followed so far. What they have done is consistent with the sentence in
the same section 3.3.3 which reads "ObsTAP only directly supports the
description of science data products, i.e., data products which contain
science data having some physical (spatial, spectral, temporal) coverage."
>
> Further down, section 3.3.3. Observation and Observation Dataset” of
> the Recommendation states “Two different approaches can be followed
> for exposing the instrumental data from an observation. One can either
> expose the individual science data products resulting from the
> observation, all sharing the same obs_id, or one can “package” the
> data products and expose the package as a single complex instrumental
> data product. ... Which approach is best depends upon the anticipated
> scientific usage and is up to the data provider to determine.” Again
> this is sensibly up to the data provider because the data provider is
> the one with the understanding of how the provider’s science users
> access and use their data.
---> The same section reads immediately after: "If the data products
comprising an observation are exposed individually then attributes such
as the calibration level can vary for different data products, e.g., the
raw instrumental data as observed might be level 1, a standard pipeline
data product might be level 2, and a custom user-processed data product
subsequently published back to the archive might be level 3. All such
data products would share the same obs_id." Clearly this sentence
shows that the intention was to build a protocol to expose
"sky data" at whatever calibration level, and not response functions:
I don't understand how we could define a calibration level for a
response function!
>
> You further posit that “If we don't do this and extend the domain of
> ObsCore too much we force it to become something else and to lose
> universality.” On what basis do you make that assumption? Certainly
> for Chandra data for example, our instrument responses all map to a
> specific spatial, spectral, and temporal coverage region on the sky.
> The use cases in Appendix A of the HEA ObsCore Extension almost all
> comprise queries that are based on sky geometry, spectral, or temporal
> coverage, with a few others based on obs_id.
---> I think there are two situations:
- Either response functions are estimated by exposing the instrument to
an experimental flux in order to provide a generic response function
valid for a set of observations (as we can do for a dark, a flat or a
spectral response). In that case the "ObsCore characterisation" of this
response function, considered as a data product in the ivoa.obscore
table, would simply be borrowed from the sky data we relate to this
response function. So mapping those to a specific spatial, spectral, and
temporal coverage region on the sky is wrong, because the actual
characterisation of the response dataset in its own domain is probably
very different. The Appendix A examples don't contradict this
interpretation of how the ObsCore parameters are filled. This is why I
think we are losing universality in doing this: it would be a very
divergent way of interpreting the ObsCore characterisation parameters.
- Or response functions are directly estimated from the sky data
themselves via an analysis (I guess this is how psf are generally
obtained, by estimating the profile of a point source in the data). Then
the response is actually part of the description of the sky data. It is
level 4 characterisation (1 -> location, 2 -> bounds, 3 -> support,
4 -> functional response), not a different product. But the ivoa.obscore
table doesn't provide level 4 characterisation. If desired, it has to be
provided by a link.
In both cases the binding of all response or auxiliary functions to the
main sky dataset can be done either by classical DataLink or by joining
two different tables (ivoa.obscore with any kind of description/access
table for response functions), as I have explained in my post in PR #35
(https://github.com/ivoa/HighEnergyObsCoreExt/pull/35).
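To make the "two tables with a join" option concrete, here is a minimal
sketch in Python using an in-memory SQLite database. It is only an
illustration under stated assumptions: the table name response_links, the
sample identifiers and URLs, and the content_qualifier values "arf" and
"psf" are all hypothetical, and the obscore table here is a reduced
stand-in for ivoa.obscore. On a real TAP service the same join would be
written in ADQL.

```python
import sqlite3

# Hypothetical stand-in for ivoa.obscore: only a few ObsCore columns are kept.
OBSCORE_ROWS = [
    ("ivo://example/obs?1", "obs-1", "event", 2, "https://example.org/evt/1"),
    ("ivo://example/obs?2", "obs-2", "image", 2, "https://example.org/img/2"),
]

# Hypothetical DataLink-like table: one row per product linked to a dataset.
RESPONSE_ROWS = [
    ("ivo://example/obs?1", "https://example.org/resp/1/arf", "#calibration", "arf"),
    ("ivo://example/obs?1", "https://example.org/resp/1/psf", "#calibration", "psf"),
    ("ivo://example/obs?1", "https://example.org/prev/1", "#preview", None),
    ("ivo://example/obs?2", "https://example.org/resp/2/psf", "#calibration", "psf"),
]


def build_demo_db():
    """Create both tables in memory and load the sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE obscore (
            obs_publisher_did TEXT PRIMARY KEY,
            obs_id TEXT,
            dataproduct_type TEXT,
            calib_level INTEGER,
            access_url TEXT
        );
        CREATE TABLE response_links (
            ID TEXT,
            access_url TEXT,
            semantics TEXT,
            content_qualifier TEXT
        );
        """
    )
    conn.executemany("INSERT INTO obscore VALUES (?, ?, ?, ?, ?)", OBSCORE_ROWS)
    conn.executemany("INSERT INTO response_links VALUES (?, ?, ?, ?)", RESPONSE_ROWS)
    return conn


def find_responses(conn, dataproduct_type):
    """Select sky data by type and return their calibration links in one query."""
    query = """
        SELECT o.obs_id, r.content_qualifier, r.access_url
        FROM obscore AS o
        JOIN response_links AS r ON r.ID = o.obs_publisher_did
        WHERE o.dataproduct_type = ?
          AND r.semantics = '#calibration'
        ORDER BY o.obs_id, r.content_qualifier
    """
    return conn.execute(query, (dataproduct_type,)).fetchall()


if __name__ == "__main__":
    for row in find_responses(build_demo_db(), "event"):
        print(row)
```

The sky data keep their own characterisation in the ObsCore table, and
the response products are selected through the link table, so nothing has
to be borrowed from the sky data to describe them.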
>
> You commented “When we designed ObsCore the intention was to design a
> data model and an associated tap table to expose science data.”, and
> that is great. However, I doubt very much that the design team
> included representation from the full range of wavebands or complete
> representation of different types of experiments, facilities, or
> missions, and as a result the inputs that went into building the
> standard (for example, what constitutes “science data”) would have
> been incomplete. You did an amazing job given the inputs that you
> had! But standards evolve with time as they become more complete, or
> they wither and die. ObsCore is currently evolving based on needs
> from radio, timing, and high-energy astrophysics, and this should be
> celebrated because it means that the standard is not withering and dying.
---> The radio extension was a way to provide a better description of
the radio sky data themselves, not to add calibration data. This is
already a great evolution, and the additional parameters in the HeIG
extension follow the same philosophy. As for ObsCore itself, both the
characterisation data model on which it is based and the ObsCore
specification were co-authored with people from almost all domains,
including high energy. The HESS prototype does not expose response
functions separately from the event list; they are reached through the
event-bundle product type. So I don't think high-energy use cases were
totally ignored in this work.
>
> Sorry, but we need to take full advantage of the flexibility
> provided by the ObsCore Recommendation as written to serve our science
> users our science data, based on our familiarity with how our science
> users want to access and use those data. If the IVOA is unwilling to
> support the needs of high-energy astrophysics, or at least of this
> very large HEA data provider, then I want to hear that stated directly
> and clearly by the IVOA Exec.
---> My final question to you: what is so wrong with combining ObsCore
with other suitable VO techniques to expose all kinds of data with more
flexibility?
I am ready to write a section showing how this combination can be done.
Best regards
François
>
> Thanks,
> —Ian
>
>> On Feb 24, 2026, at 08:43, BONNAREL FRANCOIS gmail via heig
>> <heig at ivoa.net> wrote:
>>
>> Dear Bruno, dear Ian, all
>>
>> We come back to this.
>>
>> There is no doubt for us that VO should provide ways to expose such
>> things as "background images" or in your case, Bruno, background rate.
>>
>> Our concern is about forcing ObsCore to be this way to expose such
>> datasets.
>>
>> When we designed ObsCore the intention was to design a data model
>> and an associated tap table to expose science data.
>>
>> By science data we mean data where we can detect some information of
>> interest coming from the sky.
>>
>> If we don't do this and extend the domain of ObsCore too much we
>> force it to become something else and to lose universality.
>>
>> So according to this general definition we don't think response
>> functions belong to the ObsCore domain. Advanced data products are
>> another issue; we won't discuss them today.
>>
>> Of course there are plenty of ways to expose those data and relate
>> them with science data. VO must for sure improve their description
>> and access modes
>>
>> DataLink is the minimal method to make response data accessible and
>> relate them to the relevant science data, but it may have the drawback
>> of being a "two-step" process. If direct access to response data is
>> required in a one-step process, we suggest exploring the solution of
>> defining the DataLink response table as a TAP table in order to allow
>> JOINs with the ObsCore science data table.
>>
>> But it is true that the description provided by DataLink is rather
>> poor.
>>
>> So, alternatively, when needed, different tables may be defined to
>> describe response function datasets and provide pointers to them if
>> necessary.
>>
>> A table with UCDs on most of the columns (existing UCDs or new ones
>> to define) would already provide a lot of interoperability between
>> services providing response data.
>>
>> Moreover, defining "response function data models" may provide
>> more flexible and accurate descriptions and access methods. Data models
>> may be embedded in VOTables and mapped to columns using utypes or
>> Mango+Mivot.
>>
>> We think some sections of the HeiG note should be revised in
>> these directions.
>> We are ready to help to do that.
>> François with Mireille
>>
>>
>>
>> Le 07/02/2026 à 19:15, Bruno Khelifi via heig a écrit :
>>>
>>> Hi all,
>>>
>>> About "Background images and pixel masks are not response-function
>>> data products", maybe this is the case for X-rays. I won't discuss it.
>>>
>>> As a reminder, the term `background` is very generic and can be used
>>> for everything. In gamma-ray astronomy, it comes from cosmic rays (it
>>> is not broken pixels, which are handled much earlier during the
>>> raw data processing). In the GeV, TeV and PeV ranges, the background
>>> rate is without any doubt an IRF!
>>> In contrast to X-rays, 3D analyses are routinely made. For that, the
>>> counts are compared with the predicted counts, that is, the sum of
>>> the ones associated with gamma rays and the ones associated with the
>>> background rate, which are events misclassified as gamma rays (see
>>> our notes). The estimation of the background rate cannot be done on
>>> the data, because there are gamma rays everywhere in the field of
>>> view for the Galactic plane (i.e. one cannot use 'OFF' regions). As a
>>> reminder, the Fermi bubbles or eROSITA bubbles extend to very high
>>> latitudes. Also, one cannot use simulations of cosmic rays to
>>> estimate the background, because the resources required would be much
>>> too high and also because the simulations reproduce reality badly
>>> (many studies made over decades show that). We use a complex
>>> pipeline that takes data as input, creates exclusion masks
>>> iteratively in 3D, generates rate templates in a hypercube (
>>> [X,Y] or theta, atmospheric quality observable, optical efficiency
>>> of our instruments, zenith angles, azimuth angles because of the
>>> geomagnetic effect on the extensive air showers, and reconstructed
>>> energy), curates the data to handle empty bins and low-statistics
>>> bins, and interpolates this hypercube template to compute the
>>> observation-wise background rate.
>>>
>>> For the neutrino telescopes, real data are also used. A specific
>>> pipeline is also used to compute the background rate.
>>>
>>> So, one should without any doubt keep the background rate as a data
>>> product!
>>>
>>> Best,
>>> Bruno
>>>
>>>
>>> Le 04/02/2026 à 20:38, Dr. Ian N. Evans via heig a écrit :
>>>> Dear Francois,
>>>>
>>>> I consider the arf, rmf, and psf to be response-function data
>>>> products. Background images and pixel masks are not
>>>> response-function data products - they are determined directly from
>>>> the observation event list similarly to a total counts image. Bad
>>>> pixel is a region data product, but it’s something of a gray area
>>>> since it’s a combination of known bad pixel regions plus bad pixel
>>>> regions derived directly from the observation event list.
>>>>
>>>> For the Chandra Source Catalog (CSC) prototype, at least initially
>>>> we plan to expose all of the data products directly to demonstrate
>>>> that the extension provides the flexibility that we need. However
>>>> in production, we likely would not expose all of the data products
>>>> individually but rather combine some of them with the event lists
>>>> as event bundles (at least for the individual observation
>>>> full-field data product set). We would want to expose the
>>>> individual observation event lists individually, but might choose
>>>> for example to construct an event bundle that exposes (at least)
>>>> the event list, bad pixel regions, aspect histogram, and possible
>>>> aspect solution as a bundle since there is very little use for the
>>>> latter 3 types of data product without the event list.
>>>>
>>>> While tying associated and derived data products to an event list
>>>> in an event bundle seems sensible for individual observations, our
>>>> experience is that this isn’t appropriate for the CSC advanced data
>>>> products. Since CSC 2.0 was released we have had millions of
>>>> catalog data product downloads and surveyed our user base as to
>>>> data product usage.
>>>>
>>>> The typical usage patterns for the CSC advanced data products are
>>>> different from the typical usage patterns for individual X-ray
>>>> observation data.
>>>>
>>>> For the latter the user typically downloads the event list and
>>>> ancillary data products (such as responses or other data products
>>>> that can be used to build responses) as a set, and then performs
>>>> data analysis steps directly on the event list using the ancillary
>>>> data products, often after applying spatial/spectral/temporal
>>>> filters to the data. Event bundles facilitate this usage.
>>>>
>>>> For the CSC advanced data products the usage patterns are quite
>>>> different. Many (most) of these advanced data products are derived
>>>> from multiple (in some cases hundreds) observations. Typically the
>>>> users aren’t interested in performing data analysis steps on the
>>>> event lists themselves, and often aren’t interested in knowing
>>>> which observation(s) they are derived from (at least not from the
>>>> perspective of having to perform a data query). They just want
>>>> (e.g.) all the spectra (or light curves, or photometry MPDFs, or
>>>> ...) in a certain region of the sky, or in a given time range, etc.
>>>> And given the data volume that’s all they want. Maybe they’ll
>>>> come back later and ask for a subset of additional data products
>>>> after they’ve performed some preliminary analyses on those data
>>>> products, but they don’t want those up front.
>>>>
>>>> Based on these usage patterns, I think we will likely want to
>>>> expose the remaining CSC data products individually.
>>>>
>>>> Thanks,
>>>> —Ian
>>>>
>>>>> On Jan 27, 2026, at 09:52, BONNAREL FRANCOIS gmail via heig
>>>>> <heig at ivoa.net> wrote:
>>>>>
>>>>> Dear all,
>>>>>
>>>>> After the meeting last week, I was still thinking about what the
>>>>> Chandra prototype could look like.
>>>>>
>>>>> For the Paris HESS prototype, I have had a clear idea for a couple
>>>>> of years now.
>>>>>
>>>>> Trying to understand what the CSC data products could be I came
>>>>> back to Ian's Malta interop presentation.
>>>>>
>>>>> I copy/paste here one of the slides where some of these products
>>>>> are described.
>>>>>
>>>>> Before trying to define dataproduct_type vocabulary terms for
>>>>> those products I am wondering if we really need to expose all this
>>>>> data directly in an ObsTAP service.
>>>>>
>>>>> For example background images, psf, pixel mask, bad pixel regions,
>>>>> ARF belong to the "response functions" category if I'm not mistaken.
>>>>>
>>>>> They probably are attached to a photon event list or an image or ....
>>>>>
>>>>> Including all this in the main ObsCore table will overload it very
>>>>> heterogeneously. Some of these response functions will be similar
>>>>> to what we get in other domains (psf) some will be very different
>>>>> and specific to Xray.
>>>>>
>>>>> I understood that the spatial, spectral, time characterization of
>>>>> these specific products could be borrowed from the observation
>>>>> they are associated with. That is OK, but is it useful?
>>>>>
>>>>> For accessing these response functions I can imagine 4 solutions,
>>>>> all of which have the advantage of letting the ObsTAP service stay
>>>>> focused on measurements obtained from the sky at whatever calib level.
>>>>>
>>>>> 1 ) the photon event list and response functions are gathered
>>>>> together in the same tar or archive file (or MEF) which is typed
>>>>> as an event-bundle. Direct access to this bundle from the ObsTAP
>>>>> access_url is then easy. It is the client's task to figure out what
>>>>> to do with the content of the bundle.
>>>>>
>>>>> 2 ) the various response material is kept as a set of
>>>>> individual products. All are associated with an event list or an
>>>>> image or a spectrum. In that case ObsTAP points to a DataLink
>>>>> response which lists all these different products. The semantics
>>>>> FIELD gives calibration or response function. The content_qualifier
>>>>> FIELD gives the precise nature of the product.
>>>>>
>>>>> 3 ) the DataLink response content may be organized as a TAP
>>>>> table. It is then possible to query the ObsTAP table and the
>>>>> DataLink-like table at the same time by a join on
>>>>> ObsCore/obs_publisher_did-DataLink/ID
>>>>>
>>>>> 4 ) if we need a more detailed description of the response
>>>>> products to help discover and select them, we could imagine
>>>>> creating a specific "response product" table following a specific
>>>>> data model as proposed by Mireille in her Gorlitz presentation.
>>>>> This would allow us to attach, e.g.:
>>>>>
>>>>> - a specific time range to a psf, or
>>>>>
>>>>> - a specific release date and description to an arf or a bad
>>>>> pixel map
>>>>>
>>>>> - ...
>>>>>
>>>>> A natural join on obs_publisher_did in both tables will allow
>>>>> querying those tables at the same time with selection criteria from
>>>>> both.
>>>>>
>>>>> Cheers
>>>>>
>>>>> François
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> <ulnvpjC4zXtPT0zn.png>
>>>>>
>>>>> --
>>>>> heig mailing list
>>>>> heig at ivoa.net
>>>>> http://mail.ivoa.net/mailman/listinfo/heig
>>>>
>>>> —
>>>> Dr. Ian Evans
>>>> *Astrophysicist*
>>>> *Chandra X-ray Center*
>>>> Center for Astrophysics | Harvard & Smithsonian
>>>> Office: (617) 496 7846 | Cell: (617) 699 5152
>>>> 60 Garden Street | MS 81 | Cambridge, MA 02138
>>>>
>>> --
>>>
>>> Bruno Khelifi
>>> Physicist at CNRS (laboratory APC, Paris)
>>> Phone: +33.1.57.27.61.58 - Fax: +33.1.57.27.60.71
>>> APC, IN2P3/CNRS - Universite de Paris Cite
>>>
>>
>