RFC Provenance Data Model

Harry Enke henke at aip.de
Wed Aug 21 18:45:07 CEST 2019


Dear all,
while reading the document, I ran into a couple of statements that 
seem unclear or misleading, so here are my 2c:

Intro,para2: (p1.)
"The provenance of scientific data is a part of the open publishing 
policy for science data and follows some of the FAIR principles for data 
sharing."

I suggest replacing this sentence with:
"Provenance of scientific data is one component of the FAIR 
principles: "(4.3) Published data should refer to their sources with 
rich enough metadata and provenance to enable proper citation". 
(Force11, FAIR Principles)
This provenance model goes beyond the FAIR principles: its intent is to 
make the generation of astronomical data accessible."

=> This sentence is confusing and slightly misleading: there is no "open 
publishing policy", because a policy requires a body which adopts it 
for itself and implements some enforcement measures. And why not point 
to the direct reference in FAIR? It's good that we have one.


"In astronomy, such entities are generally datasets composed of 
VOTables, FITS files, database tables or files containing values 
(spectra, light curves), any value, logs, documents, as well as physical 
objects such as instruments, detectors or photographic plates."

I suggest clarifying:
"In astronomy, such entities are generally datasets composed of 
VOTables, FITS files, database tables or files containing values 
(spectra, light curves), any value, logs, documents, as well as 
descriptions of physical objects such as instruments, detectors or 
photographic plates, or information about software."

=> (see paragraph above: 'information about...')
a) not the machinery itself, but its description, can be included as 
provenance information
b) I think that software (even though it's not explicitly addressed) 
could also have provenance information


"General Remarks:" (p8)
"Another important usage of provenance information is to assess the 
pertinence of a product for scientific objectives, which can be 
facilitated through the selection of the relevant provenance information 
attached to an entity that is delivered to a science user."
I suggest:
"Provenance information delivers additional information about the 
scientific data set to enable the scientist to evaluate its relevance 
for his work. "

=> This is hard to understand: if one substitutes "pertinence" with
"relevance" (because they are synonyms), you get a kind of tautology
(relevance of a product => select relevant provenance information)


Best Practices: (p.9)
"The following additional points are recommended when managing 
provenance information within the VO context:"
should be
"The following additional points are recommended when providing 
provenance information within the VO context:"

=> since all of the following statements are clearly for providers of 
provenance information

  13. Role ...   (p. 10)
"The IVOA Provenance Data Model is structuring and adding metadata to
trace the original process followed during the data production for 
providing astronomical data. "
should read:
"The IVOA Provenance Data Model is structuring and adding metadata to
trace the processes of the data production for providing astronomical data."

=> It's neither reasonable nor required to restrict the provenance to 
'original processes'.

Best,
Harry Enke


-- 

******************************************************************
* Dr. Harry Enke                      E-Science & Supercomputing *
*                                       Phone : +49-331-7499-433 *
* Email : henke at aip.de                  FAX   : +49-331-7499-526 *
******************************************************************
* Leibniz Institut für Astrophysik Potsdam  (AIP)                *
* An der Sternwarte 16,                     D-14482 Potsdam      *
* Vorstand: Prof. Dr. Matthias Steinmetz, Matthias Winker        *
* Stiftung bürgerlichen Rechts                                   *
* Stiftungsverzeichnis Brandenburg:         26 742-00/7026       *
******************************************************************


