FAIR semantics

Mon Jun 7 11:45:45 CEST 2021

Dear Markus,

Many thanks for checking what they propose. They are inviting people 
to comment, and I guess they would be interested in feedback from us, 
since we have been managing vocabularies for a while. For them it would 
be a test of the applicability of their proposal in a real case that 
was not involved in defining the recommendations. Maybe comments could 
be prepared by the Semantics WG chair and vice-chair, and by members 
who are interested in participating, if they come forward now? It is 
useful for me to know about these things, since I have occasion to 
speak about FAIR in astronomy in a number of different contexts.

Cheers
Francoise

On 07/06/2021 at 11:12, Markus Demleitner wrote:
> Dear Françoise,
>
> On Fri, Jun 04, 2021 at 11:40:09AM +0200, Francoise Genova wrote:
>> As you know the FAIR principles are currently a hot topic, and requiring
>> that data produced by projects is FAIR is on the agenda of many projects.
>> The I2 FAIR guiding principle from Wilkinson et al. states that '(meta)data
>> uses vocabularies that follow FAIR principles'. There is ongoing work to
>> define what a 'FAIR vocabulary' means, in particular in the FAIRsFAIR
>> project and in the RDA Vocabulary and Semantic Services IG (VSSIG). The
>> current version of the FAIR Semantics Recommendations is here:
>> https://doi.org/10.5281/zenodo.4314321
> I've had a look at an earlier version of that paper in the run-up to
> Vocabularies 2.0, and I think we're doing pretty well.  Below is a
> short run-down of the requirements from the current version and how I
> think we stand on them for VocInVO-compliant vocabularies.
>
> Do you think there is any need to follow up on this in some way at
> this point?
>
>          -- Markus
>
>
> Requirements from https://doi.org/10.5281/zenodo.4314321
>
> P-Rec. 1: All our vocabularies are uniquely identified by
> http://www.ivoa.net/rdf/<vocname>; they resolve to
> human-readable vocabulary descriptions and the term lists by default,
> to machine-readable RDF resources by content negotiation as per W3C
> best practices otherwise.
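
To illustrate the content negotiation described under P-Rec. 1, here is a minimal Python sketch using only the standard library. The vocabulary name `refframe` is taken from this thread; the helper names `build_request` and `fetch` are made up for illustration, and the assumption that the server honours exactly the media types `text/turtle` and `application/rdf+xml` is mine, not something stated above.

```python
import urllib.request

# Vocabulary URI following the http://www.ivoa.net/rdf/<vocname> pattern;
# "refframe" is one of the vocabularies named in this thread.
VOCAB_URI = "http://www.ivoa.net/rdf/refframe"


def build_request(uri, media_type):
    """Return a GET request that asks for `uri` in the given media type."""
    # The Accept header is what drives content negotiation on the server.
    return urllib.request.Request(uri, headers={"Accept": media_type})


def fetch(uri, media_type):
    """Fetch `uri`; return (Content-Type header, body bytes).

    Needs network access. Which representation comes back depends on
    the server, so the caller should inspect the returned Content-Type.
    """
    request = build_request(uri, media_type)
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.headers.get("Content-Type"), response.read()
```

If the server behaves as described, `fetch(VOCAB_URI, "text/html")` would return the human-readable page, while asking for `text/turtle` or `application/rdf+xml` would return an RDF serialisation.
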
>
> P-Rec. 2: If I understand correctly the aim of this requirement, it
> is about something like our registry records per vocabulary that can
> be retrieved without retrieving the whole vocabulary.  That we do not
> have (the vocabulary metadata is present machine-readably in the
> files, though, and our files are compact enough that a harvester
> wouldn't be overloaded either way).  If and when a standard is
> created that defines how the metadata record should be serialised, it
> should be simple to produce it from the in-vocabulary metadata; until
> then, I would claim our vocabulary repo index at
> http://www.ivoa.net/rdf/ is about the best we can do.
>
> P-Rec. 3: This minimum metadata is defined for us by Vocabularies
> 2.0.  Adopting some external schema is mainly a matter of picking
> one; that, I think, is an implementation issue that mainly depends on
> some client wanting to consume a specific form of such metadata.
>
> P-Rec. 4: I'd hope ivoa.net counts as trustworthy :-)
>
> P-Rec. 5: This is not very concrete at this point; I would argue that
> by following the W3C best practices we are good on this.  The more
> complex APIs hinted at appear to be about offering clients ways to
> edit resources, which is not within our use cases.
>
> P-Rec. 6: Since we only operate a single repo, this does not apply to
> us.
>
> P-Rec. 7: Since our resource ids use the http scheme, this is not
> trivially satisfied, and changing this would break existing terms and
> specifications.  However, all vocabularies can be retrieved through
> HTTPS (as is necessary to make them usable from current client
> javascript), so I'd say we're good here, too.
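
The point in P-Rec. 7 is that the identifier stays on the http scheme while retrieval can go through HTTPS. A client could keep the two apart roughly as follows; this is only a sketch, and the function name `retrieval_url` is hypothetical.

```python
from urllib.parse import urlsplit


def retrieval_url(term_uri):
    """Map an http-scheme identifier to the https URL used to fetch it.

    The identifier itself must not be rewritten in stored data: it is
    the http form that appears in existing terms and specifications.
    """
    parts = urlsplit(term_uri)
    if parts.scheme == "http":
        # Only the scheme changes; host, path and fragment are kept.
        parts = parts._replace(scheme="https")
    return parts.geturl()
```

For example, `retrieval_url("http://www.ivoa.net/rdf/refframe")` gives `https://www.ivoa.net/rdf/refframe`, while the stored identifier remains the http form.
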
>
> P-Rec. 8: That's Vocabularies in the VO 2.
>
> P-Rec. 9: We're producing RDF/XML and Turtle, based on RDFS and SKOS.
>
> P-Rec. 10: I *suppose* there's room for improvement here (but then
> it's optional).
>
> P-Rec. 11: That's fairly far beyond what we're doing at the moment
> (and it's optional).
>
> P-Rec. 12: We're already using skos:exactMatch in our UAT, and
> similar devices are envisioned for a vocabulary of facilities and
> instruments; so, for now I'd claim we're in the realm of the "In many
> cases" language in this requirement.
>
> P-Rec. 13: Whenever this becomes more formal, this would be covered
> in the adoption ENs.
>
> P-Rec. 14: We are re-using SKOS and RDFS; let's see what else becomes
> useful.
>
> P-Rec. 15: This currently applies to the UAT, where we're doing it
> with skos:exactMatch.  We will have a similar mechanism for
> SIMBAD-derived object types.
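
To make the skos:exactMatch mapping mentioned under P-Rec. 12 and P-Rec. 15 concrete, the following sketch pulls such mappings out of an RDF/XML document with the Python standard library. The sample document, both URIs in it, and the function name `exact_matches` are invented for illustration; the way the real vocabulary files lay out their RDF/XML may well differ.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOS_NS = "http://www.w3.org/2004/02/skos/core#"

# Hypothetical sample: neither the term nor the target URI is real.
SAMPLE = f"""<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:skos="{SKOS_NS}">
  <rdf:Description
      rdf:about="http://www.ivoa.net/rdf/example-vocab#hypothetical-term">
    <skos:exactMatch
        rdf:resource="http://example.org/external/hypothetical-concept"/>
  </rdf:Description>
</rdf:RDF>"""


def exact_matches(rdf_xml):
    """Return (subject, target) pairs for every skos:exactMatch found."""
    pairs = []
    for element in ET.fromstring(rdf_xml).iter():
        subject = element.get(f"{{{RDF_NS}}}about")
        if subject is None:
            continue  # not a node that describes a resource
        for match in element.findall(f"{{{SKOS_NS}}}exactMatch"):
            target = match.get(f"{{{RDF_NS}}}resource")
            if target is not None:
                pairs.append((subject, target))
    return pairs
```

A real client would more likely use an RDF library; plain XML parsing is used here only to keep the sketch free of dependencies, and it only handles this simple serialisation pattern.
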
>
> P-Rec. 16: Our own vocabularies are CC-0; other licenses are
> possible for externally managed vocabularies (where I suppose we
> won't adopt them if they're un-FAIR).  They are declared in both
> human-readable and machine-readable ways.
>
> P-Rec. 17: We use VEPs for that (which of course are not terribly
> machine-readable yet); formal provenance for freshly-created
> vocabularies we don't have (yet).  Human-readably, the IVOA RFC
> process should provide sufficient provenance there, though.
>
>
>
> On the best practices -- well, I agree on most of them, but as usual
> with best practices, there are limits.  For instance, while VocInVO2 is
> nudging people to use lower-case-with-dashes terms (that's BP.1),
> that doesn't always work.  refframe, for instance, needs to
> accommodate the lexical form from VOTable, and relationship is
> well-advised to just adopt the lexical forms of DataCite.  For BP.6,
> we've tried to provide some guidelines in
> https://ivoa.net/documents/Vocabularies/20210525/REC-Vocabularies-2.0.html#tth_sEc5.2.4,
> but again having this reflected in the actual discussion processes is
> not always easy.
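
As a rough illustration of the lower-case-with-dashes convention discussed for BP.1, a checker could look like the sketch below. The regular expression is my own reading of "lower-case-with-dashes", not the normative rule from Vocabularies 2.0, and the function name is hypothetical.

```python
import re

# Assumed pattern: runs of lower-case letters or digits joined by
# single dashes. This is an interpretation, not the rule from the REC.
_DASHED = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def is_lower_case_with_dashes(term):
    """Return True if `term` fits the assumed lower-case-with-dashes form."""
    return _DASHED.fullmatch(term) is not None
```

A term such as `lower-case-with-dashes` passes, whereas an upper-case lexical form (for illustration, `ICRS`) does not. That is the kind of case where a vocabulary would have to deviate from BP.1 in order to match an external lexical form.
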
>
> BP.8 is of course my personal big desideratum.  You wouldn't believe
> how hard that is.