[Heig] [EXTERNAL] [BULK] Re: vocabulary update: proposal for dataproduct_type update for high energy data : event-list definition and event-bundle

Mireille Louys mireille.louys at unistra.fr
Mon May 5 19:50:05 CEST 2025


Hello,

Thanks, Tess and Laurent, for these examples.

During the HE workshop last week, it was proposed to bring some examples
showing an ObsCore data discovery (an ObsCore entry here) and exploring
various DataLink scenarios.

Any volunteers for examples from other archives?

Best, Mireille
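
To make the two scenarios in Laurent's quoted message below easier to try out, here is a minimal Python sketch. It assumes a DataLink response has already been parsed into simple records. The `Link` class and the helper names `primary_link` and `this_matches_product_type` are hypothetical, invented for this illustration, and are not part of any IVOA library. The field values are copied from the quoted examples, where "spectrum-bundle" is still TBD.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    """One row of a DataLink response (hypothetical, simplified model)."""
    semantics: str
    content_qualifier: str
    content_type: str
    description: str


def primary_link(links):
    """Return the single #this row; raise if there is none or several."""
    this_rows = [link for link in links if link.semantics == "#this"]
    if len(this_rows) != 1:
        raise ValueError(f"expected exactly one #this row, found {len(this_rows)}")
    return this_rows[0]


def this_matches_product_type(links, product_type):
    """Check that #this carries the product type announced in the ObsCore row."""
    return primary_link(links).content_qualifier == product_type


BUNDLE_DESCRIPTION = "spectrum file + preview + ARF + RMF + Background spectrum"

# Scenario 1: the ObsCore row describes the bundle itself ("spectrum-bundle" is TBD).
BUNDLE_ROW_LINKS = [
    Link("#this", "spectrum-bundle", "application/tar+gzip", BUNDLE_DESCRIPTION),
]

# Scenario 2: the ObsCore row describes the spectrum; the bundle is a #package.
SPECTRUM_ROW_LINKS = [
    Link("#this", "spectrum", "application/fits", "spectrum file"),
    Link("#package", "spectrum-bundle", "application/tar+gzip", BUNDLE_DESCRIPTION),
]

if __name__ == "__main__":
    print(primary_link(BUNDLE_ROW_LINKS).content_type)
    print(primary_link(SPECTRUM_ROW_LINKS).content_type)
```

Rejecting a response with more than one #this row reflects the preference expressed in the thread to avoid multi-#this; a client could equally choose to tolerate it.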


On 05/05/2025 at 18:14, Laurent Michel via heig wrote:
> Hello,
>
>
>
> On 05/05/2025 at 08:56, Markus Demleitner via semantics wrote:
>> Hi Tess,
>>
>> On Wed, Apr 30, 2025 at 01:48:29PM +0000, Jaffe, Tess (GSFC-6601) via 
>> semantics wrote:
>>> Possibly dumb question:
>>
>> Not at all; you're touching a topic that has been discussed quite a
>> few times now without a satisfying result yet: What does it mean if
>> there are multiple items with the same semantics in Datalink? Is it
>> "all of them together give the thing" or is it "they are
>> alternatives"?  Or yet something else?
>>
>> Previous rounds suggested that the interpretation will probably
>> depend on the concept, but the details turned out to be fairly messy.
>>
>>
>>> If an ObsCore table lists an event-bundle as a separate row with
>>> its own product_type, and the access_url follows best practice
>>> specifying a datalink that will return the bundle, what should the
>>> DataLink result include as #this?  We are actively putting this
>>> together now at HEASARC.  If the product type is simply a spectrum,
>>
>> That's excellent news!
>>
>>> our datalink result has the spectrum file as #this and the response
>>> matrices, background, etc. as related products in the same result
>>> table.  If the product itself is a bundle, what is the #this? Do
>>> we have to provide a tarball or something?  Or are there multiple
>>> #this with different dataproduct_subtypes?  The latter doesn't
>>> sound right to me.
>>
>> Given my preamble, I'd avoid multi-#this.  The ideal solution would
>> IMHO be a standard archive if the HEIG can commit to such a thing.
>> Failing that, I think a tar archive of the individual components
>> would be the second best thing.  CADC does something like this,
>> although the other way round: They're handing out everything tarred
>> together as a #package.  Offering the components individually,
>> possibly as #progenitor-s, would help cases when people really only
>> want to fetch a single part.
>
> I agree that having multiple #this is confusing (which one is the right
> one?).
> In my understanding, #this must match the product_type in the ObsCore
> record.
>
> If a spectrum bundle is exposed in a separate row, we should have 
> something like this:
>
> Obscore row:
> -----------
> - product_type=spectrum-bundle (TBD)
> - access_format=application/x-votable+xml;content=datalink
>
> Datalink response:
> -----------------
> - link #1
>   - semantics=#this
>   - content_qualifier=spectrum-bundle (TBD)
>   - content_type=application/tar+gzip
>   - description="spectrum file + preview + ARF + RMF + Background spectrum"
>
>
> If the spectrum is exposed in a separate row:
>
> Obscore row:
> -----------
> - product_type=spectrum
> - access_format=application/x-votable+xml;content=datalink
>
> Datalink response:
> -----------------
> - link #1
>   - semantics=#this
>   - content_qualifier=spectrum
>   - content_type=application/fits
>   - description="spectrum file"
> - link #2
>   - semantics=#package
>   - content_qualifier=spectrum-bundle (TBD)
>   - content_type=application/tar+gzip
>   - description="spectrum file + preview + ARF + RMF + Background spectrum"
>
> I do not believe we can design a standard HEIG archive, because this is
> too mission/tool specific.
> Do we really need the archive content to be machine readable?
> Anyway, individual files can be exposed with appropriate semantics.
>
> Laurent
>
>
>>
>> But at least for a prototype (and, if that works fine, perhaps also
>> as a long-term practice), I think nobody would be terribly confused
>> if #this were just the time series or spectrum, in particular if a
>> content-qualifier would let machines figure out what it is they'll
>> get.
>>
>> I'm not too happy with #progenitor for the individual components,
>> though.  Perhaps datalink/core should have a concept #component with
>> the definition "for datasets where #this is composed of multiple
>> individual artefacts, #component rows offer access to individual
>> artefacts.  Use local-semantics to consistently mark up the roles of
>> the components." or so.
>>
>> In the end, I think we need to see what will help clients consuming
>> this.  Do we have software that we could use to try that out? What
>> do people use to work with these #event-bundle-s?
>>
>>           -- Markus
>>
>
> -- 
> English version: https://www.deepl.com/translator

-- 
Mireille Louys, MCF (Assistant Professor)
Centre de données Astronomiques (CDS)       Equipe Images, ICube
Observatoire de Strasbourg                  Telecom Physique Strasbourg
11, rue de l'Université                     300, Bd Sebastien Brandt CS 10413
F-67000 Strasbourg                          F-67412  Illkirch Cedex


