Data Source vocabulary

Stéphane Erard stephane.erard at obspm.fr
Sun Jun 7 12:06:47 CEST 2026


Dear all

As discussed this morning in the TCG, here is a suggestion about Data Source vocabulary: 

The current proposal distinguishes: Artificial / Observation / Theory
- The distinction between "modelling the world without reference to specific individual world objects" and "predicting observations of specific individual world objects" is probably not relevant in many cases.
- The need to distinguish observations from made-up data is common to every field (associated with the opposition between noise and modeling uncertainty)
- The need to distinguish data with physical content from 'artificial' data is also sensible, although the distinction may depend on the discipline
- The word "Theory" is often misleading in the VO (at least one other similar comment was made this morning!): in most cases it refers to computations / modeling, which is different and more specific

Proposal (not yet formalized)
- "Observation" is reserved for actual measurements and derived data. This includes measurements of made-up samples in the lab, which implies that another level is available to characterize the source itself (lab measurements in general are often labeled Experimental, as opposed to Observation)
- "Modeled" refers to computed data generalizing observations, typically large models (universe, planetary / stellar atmospheres or interiors…) which are explored to extract scientific meaning. This includes the current case where no attempt is made to simulate existing objects, but is not limited to it: it also covers models with parameters tuned to fit the real world.
- "Simulation" refers to mock-up data with mostly practical value, typically much less computation-intensive (i.e. simulation of an instrument output, first-level simulation of a source, etc.). According to this definition, this is where you would find properties averaged over a class of objects (e.g. stellar class spectra, asteroid type spectra), which is debatable (but those are often difficult to characterize, as there is no associated source)
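
To make the three proposed terms concrete, here is a minimal sketch of how a client could encode them. This is only an illustration, not part of any formal vocabulary: the names DataSource and parse_data_source are hypothetical, and the comments simply restate the definitions given above.

```python
from enum import Enum


class DataSource(Enum):
    """Hypothetical encoding of the three proposed terms (not a formal vocabulary)."""

    OBSERVATION = "Observation"  # actual measurements and derived data, lab measurements included
    MODELED = "Modeled"          # computed data generalizing observations
    SIMULATION = "Simulation"    # mock-up data with mostly practical value


def parse_data_source(label: str) -> DataSource:
    """Return the term matching `label`, ignoring case and surrounding whitespace."""
    cleaned = label.strip().lower()
    for term in DataSource:
        if term.value.lower() == cleaned:
            return term
    # Anything outside the three proposed terms (e.g. "Theory") is rejected.
    raise ValueError(f"unknown data source term: {label!r}")
```

With such a sketch, parse_data_source("modeled") would return DataSource.MODELED, while a label outside the proposal such as "Theory" would raise a ValueError.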

But I stress that the need to have both Modeled and Simulation has to be demonstrated, especially if the distinction is not always clear and obvious.

Best wishes
Stéphane