Vocabularising dataproduct_type

Sarah Weissman sweissman at stsci.edu
Tue Mar 10 16:58:09 CET 2020


Hi Markus,

I think this is great: I'd love to use a shared vocabulary like this to normalize the vocabularies we currently use in MAST to describe data products. Recently I've been collecting the (non-controlled) data product type vocabulary used to describe MAST High Level Science Products; here is the list I have, by frequency of use.

 49 Survey
 40 Catalog
 31 Individual Object
 27 Spectral Atlas
 18 Time Series
 11 Image Atlas
 10 Image
  9 Linelist
  3 Composite
  6 Model
  3 SED
  1 Spectrum
  1 Hyperspectral Image

I think some of these are not well defined, but looking at this organic list and thinking about how it would map to the proposed IVOA list, I have some thoughts:
* Would too many things be mapped into "Measurement" to be useful from a discovery perspective?
* Do we need a separate category for "Model"?
* Do you expect that more than one label would be applied to a data product? For example (naively) could a "spectral image cube" be labeled with "spectrum" and "image" and "cube"?
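
To make the last question concrete, here is a minimal Python sketch of what multi-labelling might look like. It assumes the term URI form base#term from the draft at http://www.ivoa.net/rdf/product-type. The mapping table and the helper names (LABEL_TO_TERMS, term_uri, uris_for) are hypothetical and only for illustration; whether a product may carry more than one term is exactly the open question.

```python
# Sketch only: the label-to-term mapping below is a guess for illustration,
# not an agreed MAST or IVOA mapping.
VOCAB_BASE = "http://www.ivoa.net/rdf/product-type"

# Hypothetical mapping from free-text labels to one or more vocabulary terms.
# The last entry shows the (naive) multi-label case raised above.
LABEL_TO_TERMS = {
    "Image": ["image"],
    "Spectrum": ["spectrum"],
    "SED": ["sed"],
    "Hyperspectral Image": ["image", "spectrum", "cube"],
}


def term_uri(term):
    """Build a full term URI in the form base#term."""
    return f"{VOCAB_BASE}#{term}"


def uris_for(label):
    """Return the term URIs for a label, or an empty list if it is unmapped."""
    return [term_uri(t) for t in LABEL_TO_TERMS.get(label, [])]


if __name__ == "__main__":
    for label in LABEL_TO_TERMS:
        print(label, "->", uris_for(label))
```

Labels with no obvious counterpart (for example "Survey" or "Model") simply return an empty list here, which is one way to see how much of the organic list would be left unmapped.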

-Sarah

On 3/10/20, 3:27 AM, "registry-bounces at ivoa.net on behalf of Markus Demleitner" <registry-bounces at ivoa.net on behalf of msdemlei at ari.uni-heidelberg.de> wrote:

    Hi,
    
    With apologies for the wide crosspost, I'd suggest followups to
    DAL.
    
    In the past few weeks, in two different use cases it was felt
    desirable to have the terms for data product types introduced by
    Obscore outside of Obscore:
    
    (a) as qualifiers in media types (also beyond datalink).
        http://mail.ivoa.net/pipermail/dal/2019-December/008252.html
    
    (b) to declare the sort of data returned from SSAP services,
        http://mail.ivoa.net/pipermail/registry/2020-February/005410.html
    
    In this latter context I've now created a draft vocabulary from
    obscore dataproduct type.  It has draft status at this point, so it's
    still cheap to change definitions, add terms, introduce structure,
    etc.  Or to cancel the entire effort.
    
    The current vocabulary is at http://www.ivoa.net/rdf/product-type .
    
    My current plan is to have this vocabulary reviewed as part of the
    review of SimpleDALRegExt 1.2.
    
    So... what do you think?
    
    Here are a few points I'd particularly request feedback on:
    
    (a) The vocabulary name: I went for product-type (singular), as the full
        term URI then looks like http://www.ivoa.net/rdf/product-type#image
        or so, which I find nice.  If someone calls for having "data" in
        there (data-product-type or dataproduct-type or whatever), I won't
        quarrel.  I still figure we won't have types of any other sort of
        products and hence saving five characters seems worth the
        deviation from obscore terminology.
    
    (b) I've made the vocabulary largely flat; only "sed" quite clearly is a
        "spectrum".  Do you see more structure in these concepts?
    
    (c) I've streamlined some of the descriptions from Obscore. For
        instance, I've removed the language on formats in the cube
        definition, as it seems somewhat ephemeral, and I've tried to be
        more precise in sed to get its primary characteristic in focus.  And
        I've made the definition for visibility very short -- radio folks,
        complain if you disagree.
    
    Thanks,
    
                Markus
    


