[Heig] vocabulary update: proposal for dataproduct_type update for high energy data : event-list definition and event-bundle

Wed May 28 22:59:13 CEST 2025

Dear Colleagues,

On Wed, May 28, 2025 at 10:24:44PM +0200, BONNAREL FRANCOIS gmail via semantics wrote:
> A query with  a constraint such as "WHERE ivoa_smaller(dataproduct_subtype,https://www.ivoa.net/rdf/responsefunction_type#response-function")
> should validate for #psf, #lsf, #arf, etc....

Let me just mention that if there were such a response-function
vocabulary, you could already write

  WHERE 1=gavo_vocmatch(
    'response-function',   -- the vocabulary name
    'response-function',   -- the concept name
    dataproduct_subtype)   -- the column to match against

on http://dc.g-vo.org/tap (and other DaCHS services), and this would
be true for all terms narrower than #response-function.  This
isn't actually hard to implement, so I'm confident we could turn
this into ivo_vocmatch (i.e., an interoperable UDF) rather easily.
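
To illustrate why I say this isn't hard, here is a minimal Python
sketch of the matching logic.  It is not the DaCHS implementation.
The toy vocabulary, the "wider" mapping layout, and the function names
narrower_closure and vocmatch are all assumptions made up for this
example.  I am also assuming that the concept itself counts as a match.

```python
# Toy vocabulary: each term maps to the list of its directly wider terms.
# These entries are illustrative assumptions, not a published vocabulary.
TOY_WIDER = {
    "response-function": [],
    "psf": ["response-function"],
    "lsf": ["response-function"],
    "arf": ["response-function"],
    "rmf": ["response-function"],
    "spectrum": [],
}


def narrower_closure(wider, concept):
    """Return the concept plus every term that is transitively narrower."""
    matched = {concept}
    changed = True
    # Keep sweeping until no further term has a wider term in the set.
    while changed:
        changed = False
        for term, parents in wider.items():
            if term not in matched and any(p in matched for p in parents):
                matched.add(term)
                changed = True
    return matched


def vocmatch(wider, concept, value):
    """Return 1 if value is the concept or narrower than it, else 0.

    The integer result mirrors the "1 = ..." comparison style used in
    the queries in this mail (hypothetical re-implementation).
    """
    return 1 if value in narrower_closure(wider, concept) else 0


if __name__ == "__main__":
    for value in ("rmf", "spectrum"):
        print(value, vocmatch(TOY_WIDER, "response-function", value))
```

A service would presumably compute the closure once per query and
turn it into a plain "column IN (...)" condition, so the database
never needs to know about the vocabulary at all.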

If you're looking for something to try it out, try:

  SELECT TOP 5 * FROM ivoa.obscore
  WHERE
    1=gavo_vocmatch('product-type', 'spectrally-resolved-dataset', dataproduct_type)

on http://dc.g-vo.org/tap.  Ahem.  Right now, there's only #spectrum
coming back, but that's only because I'm not yet marking up the
CALIFA cubes as spectral cubes.  I'll do that as soon as an obscore
WD is out that sanctions using product-type terms in the
dataproduct_type column.

Anyway, if we decide that we need the narrower terms, I think I'd
prefer to put them into product-type rather than create an extra
vocabulary.  That's mainly to keep the vocabulary ecosystem as small
as we can; in this new view of the response functions, these *are*
(perhaps somewhat odd) sorts of data products, after all.

Oh, and while I'm at it: Ian's post from this morning (my time) on
the number of matches for Chandra's response functions does suggest
we may want the narrower terms (#irf and friends).  On the other hand,
the 6'010 rows for #rmf are probably still too many for efficient
discovery and as such not *tremendous* progress over the 104'740
you'd get for #response-function.

So... What would people do to pick the rmf they actually need from
the obscore result?  Or do they really need them all?  And wouldn't
an extra constraint of that kind cut down the #response-function
result to a reasonable size, too?

Thanks,

             Markus