Vocabularising dataproduct_type

Baptiste Cecconi baptiste.cecconi at obspm.fr
Tue Mar 10 10:02:04 CET 2020


Just a small addition here (after comments received off-list): I am not proposing to use the 2-letter codes in the vocabulary; I am only sharing the list of dataproduct_types we use.

Cheers
Baptiste

> On 10 March 2020, at 09:06, Baptiste Cecconi <Baptiste.Cecconi at obspm.fr> wrote:
> 
> Hi Markus,
> 
> In EPNcore, we have a slightly more extended list of dataproduct_types than Obscore. Here is the list (we currently use a 2-letter code, but the plain names are given, with some detailed statements):
> 
> im (= image): scalar field with two spatial axes, or association of several such fields, e.g., images with multiple color planes, from multichannel or filter cameras. Preview images (e.g. map with axis and caption) also belong here. Conversely, vectorial 2D fields are described as spatial_vector (see below).
> ma (= map): scalar field / rasters with two spatial axes covering a large area and projected either on the sky or on a planetary body, associated with spatial_coordinate_description and map_projection parameters (with a short enumerated list of possible values); each pixel is associated with 2D coordinates. This is mostly intended to identify radiometrically calibrated and orthorectified images with complete coverage that can be used as reference basemaps.
> sp (= spectrum): measurements organized primarily along a spectral axis, e.g., radiance spectra. This includes spectral aggregates (series of related spectral segments with non-connected spectral ranges, e.g., from several channels of the same instrument, various orders from an échelle spectrometer, composite spectra, etc.).
> ds (= dynamic_spectrum): consecutive spectral measurements through time, organized primarily as a time series. This typically implies successive spectra of the same target / field of view.
> sc (= spectral_cube): sets of consecutive spectral measurements with 1D or 2D spatial coverage, e.g., imaging spectroscopy. The choice between image and spectral_cube is dictated by the characteristics of the instrument (which dimension is most resolved and which dimensions are acquired simultaneously). The choice between dynamic_spectrum and spectral_cube depends on the uniformity of the field of view and on practices in the science field.
> pr (= profile): scalar or vectorial measurements along 1 spatial dimension, e.g., atmospheric profiles, atmospheric paths, sub-surface profiles, traverses…
> vo (= volume): measurements with 3 spatial dimensions, e.g., internal or atmospheric structures, including shells/shape models (3D surfaces).
> mo (= movie): sets of chronological 2D spatial measurements (consecutive images)
> cu (= cube): multidimensional data with 3 or more axes, e.g., all that is not described by other 3D data types such as spectral cube or volume. This is intended to accommodate unusual data with multiple dimensions. This can be used for 3D ancillary data associated to spectral cubes, e.g., providing the coordinates or illumination angles for each spectrum.
> ts (= time_series): measurements organized primarily as a function of time, usually of a scalar quantity (dynamic spectra and movies are excluded). Typical examples of time series include space-borne dust detector measurements, daily or seasonal curves measured at a given location (e.g., a lander), and light curves.
> ca (= catalogue): applies to a granule providing a catalogue of object parameters, a list of features, a table of granules in another TAP service, a list of events... The result metadata table of a service can be considered as a catalogue. Catalogues can be provided as VOTable (possibly containing multiple tables, although this is not supported by SAMP). It is good practice to describe the type of data included in the catalogue using a hash-list (e.g., a table of spectra should be described by ca#sp, so that it will respond to a query for spectra).
> ci (= catalogue_item): applies when the service itself provides a catalogue with entries described as individual granules, in particular when there is no associated file (e.g., a list of asteroid properties or spectral lines). Catalogue_item can be limited to scalar quantities (including strings), and possibly to a single element. This organization allows the user to search inside the catalogue from the TAP query interface.
> sv (= spatial_vector): vector information associated with localization, such as spatial footprints, a GIS-related element, etc., e.g., a KML or GeoJSON file (STC-S strings, however, are provided through the s_region parameter). This includes maps of vectors, e.g., wind maps.
> ev (= event): introduces individual VOevents formatted according to IVOA standard (or possibly events with other formatting, TBC)
> 
> Looking at your proposed list, we don’t have Visibility, nor SED. I think our CatalogueItem is close to your Measurement. Then we have extra terms. All the terms listed here are used in at least 1 EPN-TAP service.
> 
> The reason for our choice of a 2-letter code rather than an explicit name is the following: we have several examples of composite derived products containing, e.g., a series of time series and dynamic spectra (various spectral integrations and various polarizations). In this case the dataproduct_type is set to ts#ds. We can also use this for services providing catalogues of images, which would then be described as ca#im.
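
To make the hash-list convention above concrete, here is a minimal Python sketch. Only the 2-letter codes, their plain names, and the '#' separator come from the message; the names EPN_CODES, expand_types and matches are hypothetical and not part of any EPN-TAP library.

```python
# Codes and plain names as listed in the message; the dict name is an assumption.
EPN_CODES = {
    "im": "image",
    "ma": "map",
    "sp": "spectrum",
    "ds": "dynamic_spectrum",
    "sc": "spectral_cube",
    "pr": "profile",
    "vo": "volume",
    "mo": "movie",
    "cu": "cube",
    "ts": "time_series",
    "ca": "catalogue",
    "ci": "catalogue_item",
    "sv": "spatial_vector",
    "ev": "event",
}


def expand_types(value):
    """Split a hash-list such as 'ts#ds' and return the plain names."""
    return [EPN_CODES[code] for code in value.split("#") if code]


def matches(value, wanted_code):
    """Return True if wanted_code is one of the entries of the hash-list."""
    return wanted_code in value.split("#")


print(expand_types("ts#ds"))   # ['time_series', 'dynamic_spectrum']
print(matches("ca#sp", "sp"))  # True: a catalogue of spectra answers a query for spectra
print(matches("ca#im", "sp"))  # False
```

Splitting on '#' before comparing avoids accidental substring hits; how a given service actually evaluates such queries is not specified here.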
> 
> Baptiste
> 
> 
>> On 10 March 2020, at 08:28, Markus Demleitner <msdemlei at ari.uni-heidelberg.de> wrote:
>> 
>> Hi,
>> 
>> With apologies for the wide crosspost, I'd suggest followups to
>> DAL.
>> 
>> In the past few weeks, in two different use cases it was felt
>> desirable to have the terms for data product types introduced by
>> Obscore outside of Obscore:
>> 
>> (a) as qualifiers in media types (also beyond datalink).
>>    http://mail.ivoa.net/pipermail/dal/2019-December/008252.html
>> 
>> (b) to declare the sort of data returned from SSAP services,
>>    http://mail.ivoa.net/pipermail/registry/2020-February/005410.html
>> 
>> In this latter context I've now created a draft vocabulary from
>> obscore dataproduct type.  It has draft status at this point, so it's
>> still cheap to change definitions, add terms, introduce structure,
>> etc.  Or to cancel the entire effort.
>> 
>> The current vocabulary is at http://www.ivoa.net/rdf/product-type .
>> 
>> My current plan is to have this vocabulary reviewed as part of the
>> review of SimpleDALRegExt 1.2.
>> 
>> So... what do you think?
>> 
>> Here are a few points I'd particularly request feedback on:
>> 
>> (a) The vocabulary name: I went for product-type (singular), as the full
>>    term URI then looks like http://www.ivoa.net/rdf/product-type#image
>>    or so, which I find nice.  If someone calls for having "data" in
>>    there (data-product-type or dataproduct-type or whatever), I won't
>>    quarrel.  I still figure we won't have types of any other sort of
>>    products and hence saving five characters seems worth the
>>    deviation from obscore terminology.
>> 
>> (b) I've made the vocabulary largely flat; only "sed" quite clearly is a
>>    "spectrum".  Do you see more structure in these concepts?
>> 
>> (c) I've streamlined some of the descriptions from Obscore. For
>>    instance, I've removed the language on formats in the cube
>>    definition, as it seems somewhat ephemeral, and I've tried to be
>>    more precise in sed to get its primary characteristic in focus.  And
>>    I've made the definition for visibility very short -- radio folks,
>>    complain if you disagree.
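
As an illustration of points (a) and (b) above, here is a small Python sketch of how term URIs could be built from the vocabulary URI and how a mostly flat vocabulary with a single narrower term could be queried. The vocabulary URI and the statement that "sed" is a "spectrum" come from the message; the names VOCAB_URI, BROADER, term_uri, split_term_uri and is_a are hypothetical, and the real vocabulary may be organized differently.

```python
from urllib.parse import urldefrag

# Vocabulary URI as given in the message.
VOCAB_URI = "http://www.ivoa.net/rdf/product-type"

# Assumed toy hierarchy: only "sed is a spectrum" is stated in the message.
BROADER = {"sed": "spectrum"}


def term_uri(term):
    """Build the full term URI, e.g. .../product-type#image."""
    return f"{VOCAB_URI}#{term}"


def split_term_uri(uri):
    """Split a term URI into (vocabulary URI, term)."""
    base, fragment = urldefrag(uri)
    return base, fragment


def is_a(term, ancestor):
    """Return True if term equals ancestor or is narrower than it."""
    while term is not None:
        if term == ancestor:
            return True
        term = BROADER.get(term)
    return False


print(term_uri("image"))        # http://www.ivoa.net/rdf/product-type#image
print(is_a("sed", "spectrum"))  # True
print(is_a("spectrum", "sed"))  # False
```

With such a lookup, a client asking for spectra would also pick up SEDs, which is the practical effect of declaring "sed" narrower than "spectrum".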
>> 
>> Thanks,
>> 
>>            Markus
