Re: [Heig] vocabulary update: proposal for dataproduct_type update for high energy data : event-list definition and event-bundle
Markus Demleitner
msdemlei at ari.uni-heidelberg.de
Mon May 5 08:56:04 CEST 2025
Hi Tess,
On Wed, Apr 30, 2025 at 01:48:29PM +0000, Jaffe, Tess (GSFC-6601) via semantics wrote:
> Possibly dumb question:
Not at all; you're touching a topic that has been discussed quite a
few times now without a satisfying result yet: What does it mean if
there are multiple items with the same semantics in Datalink? Is it
"all of them together give the thing" or is it "they are
alternatives"? Or something else entirely?
Previous rounds suggested that the interpretation will probably
depend on the concept, but the details turned out to be fairly messy.
> If an ObsCore table lists an event-bundle as a separate row with
> its own product_type, and the access_url follows best practice
> specifying a datalink that will return the bundle, what should the
> DataLink result include as #this? We are actively putting this
> together now at HEASARC. If the product type is simply a spectrum,
That's excellent news!
> our datalink result has the spectrum file as #this and the response
> matrices, background, etc. as related products in the same result
> table. If the product itself is a bundle, what is the #this? Do
> we have to provide a tarball or something? Or are there multiple
> #this with different dataproduct_subtypes? The latter doesn't
> sound right to me.
Given my preamble, I'd avoid multi-#this. The ideal solution would
IMHO be a standard archive format, if the HEIG can commit to such a thing.
Failing that, I think a tar archive of the individual components
would be the second best thing. CADC does something like this,
although the other way round: They're handing out everything tarred
together as a #package. Offering the components individually,
possibly as #progenitor-s, would help cases when people really only
want to fetch a single part.
But at least for a prototype (and, if that works fine, perhaps also
as a long-term practice), I think nobody would be terribly confused
if #this were just the time series or spectrum, in particular if a
content-qualifier would let machines figure out what it is they'll
get.
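
To make the prototype case concrete, here is a minimal sketch of how a
client might handle such a response. It assumes the Datalink rows have
already been parsed into dictionaries; the URLs, the choice of
#calibration and #auxiliary for the related rows, and the helper
pick_this are illustrative assumptions, not anything HEASARC or a
client library actually provides.

```python
# Hypothetical Datalink rows for a spectrum dataset; all URLs are made up.
PROTOTYPE_ROWS = [
    {"semantics": "#this",
     "access_url": "https://example.org/data/spec.pha",
     "content_qualifier": "#spectrum"},
    {"semantics": "#calibration",
     "access_url": "https://example.org/data/spec.rmf",
     "content_qualifier": ""},
    {"semantics": "#auxiliary",
     "access_url": "https://example.org/data/spec_bkg.pha",
     "content_qualifier": ""},
]


def pick_this(rows):
    """Return the single #this row, refusing ambiguous responses."""
    this_rows = [row for row in rows if row["semantics"] == "#this"]
    if len(this_rows) != 1:
        raise ValueError(
            f"expected exactly one #this row, found {len(this_rows)}")
    return this_rows[0]


primary = pick_this(PROTOTYPE_ROWS)
# The content qualifier tells a machine what kind of product to expect.
print(primary["content_qualifier"], primary["access_url"])
```

A client written this way fails loudly on a multi-#this response
instead of guessing whether the rows are parts or alternatives.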
I'm not too happy with #progenitor for the individual components,
though. Perhaps datalink/core should have a concept #component with
the definition "for datasets where #this is composed of multiple
individual artefacts, #component rows offer access to individual
artefacts. Use local-semantics to consistently mark up the roles of
the components." or so.
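
As a rough sketch of what that could look like to a client: the rows
below use #component, which is only the term floated above and not
part of datalink/core today. The local_semantics values, the
#event-bundle qualifier, the URLs, and the helper components_by_role
are likewise assumptions made up for illustration.

```python
# Hypothetical Datalink rows for an event bundle; "#component" is only a
# proposed term, and every URL and local_semantics value here is invented.
BUNDLE_ROWS = [
    {"semantics": "#this",
     "access_url": "https://example.org/data/bundle1.tar",
     "content_type": "application/x-tar",
     "content_qualifier": "#event-bundle",
     "local_semantics": ""},
    {"semantics": "#component",
     "access_url": "https://example.org/data/bundle1_events.fits",
     "content_type": "application/fits",
     "content_qualifier": "",
     "local_semantics": "#events"},
    {"semantics": "#component",
     "access_url": "https://example.org/data/bundle1.rmf",
     "content_type": "application/fits",
     "content_qualifier": "",
     "local_semantics": "#response-matrix"},
    {"semantics": "#component",
     "access_url": "https://example.org/data/bundle1_bkg.fits",
     "content_type": "application/fits",
     "content_qualifier": "",
     "local_semantics": "#background"},
]


def components_by_role(rows):
    """Map each component's local_semantics role to its access URL."""
    return {
        row["local_semantics"]: row["access_url"]
        for row in rows
        if row["semantics"] == "#component"
    }


roles = components_by_role(BUNDLE_ROWS)
# Someone who only wants the response matrix can skip the tarball.
print(roles["#response-matrix"])
```

This only works if a service uses its local_semantics values
consistently across datasets, which is the point of the last sentence
of the proposed definition.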
In the end, I think we need to see what will help clients consuming
this. Do we have software that we could use to try that out? What
do people use to work with these #event-bundle-s?
-- Markus