[Heig] Comments on HE ObsCore note

Dr. Ian N. Evans ievans at cfa.harvard.edu
Wed May 28 17:52:58 CEST 2025


Hi Mathieu,

> On May 26, 2025, at 10:45, Mathieu Servillat via heig <heig at ivoa.net> wrote:
> 
> Dear all,
> 
> There have been several additions to the HEIG ObsCore document (thanks Ian for a lot of input), and there are still some parts missing that were proposed after the Paris workshop, though these were discussed by email (datalink, vocabularies).

Agreed.  My last set of updates focused on adding a preliminary set of use cases and I didn’t try to update the rest of the document to conform to some of the discussion from the Paris workshop.


> 
> I think the document needs a refined structure (I'll post something similar as an issue on Github), e.g. :
> 
> --------
> 2 High Energy Astrophysics Data
>     2.1 Observation techniques and data specificities --> as is now, defining events, event-list and IRFs in particular
>     2.2 HE data access
>         --> this could be a summary of use cases, the orientations of the extension would be shown, e.g. the complex time intervals, the energy-dependent values, the event counting impact on data description, the complex instruments used (all those are answered in some way in sections 4, or 3)
>         --> I would separate the direct search of event-lists+IRFs, and the search via a catalog of sources (more advanced and complex)

I think adding to the text here is fine, and there are definitely different use cases for scientists who want to retrieve and perform analysis of a small number of observations (typically the observers) and those who want to do large scale analysis (i.e., typically of catalog data products).  I would be careful about how you “separate direct search of event-lists+IRFs, and search via a catalog of sources”, however, because meaningful pre-canned responses can’t be constructed for many HE missions and have to be constructed by the user based on other ancillary data products.  Those ancillary data products may in some cases look a lot like catalog data products.


> 
> 3 ObsCore Attribute Definitions for High Energy Astrophysics Data
>   --> this should not be too developed, i.e. is the keyword ok ? should it be adapted ? is the current REC blocking ?
>   --> for example, it would be said here that we need to define more dataproduct_types (particularly event-list), but those would be defined later in a Vocabulary section

Here I disagree that “this should not be too developed”.  As I see it, we are proposing an extension against the current IVOA Recommendation (i.e., REC-ObsCore-v1.1-20170509), which explicitly defines a set of dataproduct_type strings in the text of section 3.3.1 (and in terms of the TAP interface in section 4.1).  Neither of these sections refers to a “vocabulary”.  Since the IVOA uses a formal process for changes, from a process perspective the correct way to propose additions to ObsCore dataproduct_type is to do so explicitly as part of the discussion of dataproduct_type.

While I agree that adding definitions in the Vocabulary section would be beneficial, from a process perspective there is no formal link between ObsCore dataproduct_type and the data product vocabulary (or from ObsCore dataproduct_subtype for that matter).  In fact, we can’t differentiate between these or validate the applicability of an entry in the data product vocabulary to any specific use such as dataproduct_type.  Until these links are formally identified and established as part of the ObsCore Recommendation, simply adding definitions to the data product vocabulary is not equivalent to proposing changes to the ObsCore Recommendation.


> 
> 4 Extensions to ObsCore Specific to High Energy Astrophysics Data
>   --> the new attribute proposed
> 
> 5 DataLink for HE data
>   --> particularly suited to explore IRFs for an event-list (see Bruno's email and associated comments)
>   --> requires a product vocabulary (developed below), maybe other attributes
> 
> 6 Vocabularies
>   6.1 HE product types
>   6.2 instrument response functions (if not in product types)
>   6.3 UCD terms (for o_ucd)
> --------
> 
> On the content of the note, I think we should distinguish ObsCore HE data products and other HE data products. The first scenario that should show up is a better access to event-lists and their IRFs. However currently, there are definitions for draws, pdf and regions, that correspond to analysed data, and so may not be best searched via ObsCore (but could be in additional tables via TAP in a similar way -- I just saw Laurent's comment that points in that direction).

I don’t think that it is appropriate to differentiate between them as they are all end-user data products.  The ObsCore Recommendation already supports “Analysis data products generated after some scientific data manipulation or interpretation” as calib_level = 4 (the example quoted is a CTA reconstructed light curve) and also includes a spectral energy distribution (sed) example with calib_level = 3.  Indeed, the latter example is a NED spectral energy distribution, which is a pretty extreme analyzed data product typically constructed from numerous observations and requiring photometric calibrations across multiple wavebands.  I don’t consider aperture photometry MPDFs or position error MCMC draws to be any different in spirit from these examples that are already included in ObsCore.

With regard to the suggestion of using multiple TAP tables, I approach the IVOA from the perspective of a user scientist rather than as a data provider and I’m looking to simplify usage from the user perspective.  Having to figure out which of an ever growing set of TAP tables I need to query (and query yet another set of tables to find their schemas) just to find what to me as a scientist is a coherent set of data products just doesn’t make a lot of sense.
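
As an illustration of the single-table usage described above, here is a minimal sketch (not part of the ObsCore Recommendation or of the note) that builds one ADQL query against ivoa.ObsCore to list the event list of an observation together with its ancillary products. The obs_id value is hypothetical, and 'response-function' is the proposed dataproduct_type under discussion, not a value defined in ObsCore 1.1; only 'event' is an existing value. The function name is also an assumption made for this example.

```python
def build_obscore_query(obs_id, product_types, min_calib_level=0):
    """Return an ADQL string selecting products of one observation
    from the standard ivoa.ObsCore table."""
    # Double any single quote so the literals stay valid ADQL strings.
    quoted = ", ".join("'" + t.replace("'", "''") + "'" for t in product_types)
    safe_id = obs_id.replace("'", "''")
    return (
        "SELECT obs_publisher_did, dataproduct_type, calib_level, access_url "
        "FROM ivoa.ObsCore "
        f"WHERE obs_id = '{safe_id}' "
        f"AND dataproduct_type IN ({quoted}) "
        f"AND calib_level >= {int(min_calib_level)}"
    )


# Hypothetical observation identifier; 'response-function' is a proposed
# (not yet standard) dataproduct_type value.
query = build_obscore_query("hypothetical-obs-0001", ["event", "response-function"])
print(query)
```

The point of the sketch is only that a single query on a single table returns the coherent set of products; with separate TAP tables the user would first need to discover those tables and their schemas.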

I think there is a significant difference between the optical/IR and HE communities: In the optical/IR one obtains science exposures and calibration exposures (such as biases, dome and sky flats, darks, etc.) and then applies those calibrations to the raw science frames with the result being “calibrated” science exposures.  For data analysis one would very rarely (if ever) go back to the calibration exposures.  That’s not the case for HE data.  Responses and many other ancillary data products are used extensively during data analysis (indeed, HE data analysis would not in general be possible without these products) and should be accessible to the end user in the same way as the original exposures.

BTW, nothing here is new - I discussed these various types of data products at the Malta Interop and asked the audience if they thought we should separate the advanced data products from ObsCore.  The general opinion then seemed to be that they should be kept together and separating them was not a good idea.


> 
> The IRFs would thus come before in the document. I am not sure about the latest choice of "response-functions" for the name, which is quite general. I would say that "instrument-response" is a more appropriate category, then in a vocabulary there would be child terms for each part or function of this response.

I think “response-function” is a better choice than “instrument-response” to generalize ObsCore *precisely because* it is a more general concept, and therefore has wider utility across multiple wavebands.  If you want to add “instrument-response” as a child of “response-function” then that should be OK in a multi-level hierarchy (but perhaps not in a two-level hierarchy like dataproduct_type and dataproduct_subtype).  I don’t like “IRF” because it has historically seen multiple interpretations within the HE community.
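
To make the multi-level versus two-level distinction concrete, the following is a small sketch of a term hierarchy in which “instrument-response” is a child of “response-function”. It is an assumption-laden illustration only: the “effective-area” term and the BROADER mapping are hypothetical and are not taken from any IVOA vocabulary.

```python
# Each term maps to its broader (parent) term. All entries are illustrative.
BROADER = {
    "instrument-response": "response-function",
    "effective-area": "instrument-response",  # hypothetical child term
}


def ancestors(term):
    """Return the chain of broader terms, nearest parent first."""
    chain = []
    while term in BROADER:
        term = BROADER[term]
        chain.append(term)
    return chain


def matches(term, wanted):
    """True if term is wanted itself or any narrower term of it."""
    return term == wanted or wanted in ancestors(term)


print(ancestors("effective-area"))
print(matches("effective-area", "response-function"))
```

In a multi-level hierarchy a search for “response-function” can pick up products labelled at any depth. A fixed dataproduct_type and dataproduct_subtype pair can hold only two of these levels, which is why a third level would not fit there.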


> 
> The name of "advanced data products" is not so clear to me. Are they products after analysis of the observed data ? or are they more precise descriptions of the IRFs ?

Advanced data product is a term used in the ObsCore Recommendation for products such as a sed (see section 3.3.1) that may have calib_level >= 3 (see section 4.4) that are the result of “advanced processing” and “may be the result of combining data from multiple primary observations”.


> 
> I think it was proposed by François and others to develop our own vocabulary and hierarchy for HE product types, and we would then associate this vocabulary to product_type, or other TAP tables and attributes. In a way, this would be the follow up of the context data model proposed in the HE Note, but with the simple approach of just defining a hierarchical vocabulary.

My concern with this approach is that much of this vocabulary will end up describing concepts that are common across multiple wavebands, particularly for advanced data products that are going to be a major focus of data analysis going forward as we delve into more sophisticated (and compute-intensive) analyses.  I don’t think we would be doing the user community a service by imposing waveband-specific constraints on what are really widespread concepts.


> 
> Best regards,
> 
> Mathieu
> 
> 
> -- 
> Dr. Mathieu Servillat
> LUX - Laboratoire d'étude de l'Univers et des phénomènes eXtrêmes
> Bât 18, Bur. 222
> Observatoire de Paris, Site de Meudon
> 5 place Jules Janssen
> 92195 Meudon, France
> Tél. +33 1 45 07 78 62
> --
> 
> -- 
> heig mailing list
> heig at ivoa.net
> http://mail.ivoa.net/mailman/listinfo/heig

Cheers,
—Ian

—

Dr. Ian Evans
Astrophysicist
Chandra X-ray Center
Center for Astrophysics | Harvard & Smithsonian

Office: (617) 496 7846 | Cell: (617) 699 5152
60 Garden Street | MS 81 | Cambridge, MA 02138



 

