Provenance team reply on the Agent roles

Markus Demleitner msdemlei at ari.uni-heidelberg.de
Mon Dec 16 10:15:11 CET 2019


Dear Laurent,

On Sat, Dec 14, 2019 at 06:34:23PM +0100, Laurent MICHEL wrote:
> The agent role is specified as free text that should use the given
> vocabulary. This feature is important to describe and compare Provenance
> instances within one project but it is not always sufficient to match
> instances from different sources. This would require a field with a
> controlled vocabulary.

Enabling interoperable use (which is "from different sources") is not
the only reason for wishing a clear and well-defined vocabulary --
with the current proposed set of terms, annotators will also have a
hard time picking terms, as the meanings severely overlap, and the
definitions are insufficient to disambiguate them.

This will become painful as soon as some software client actually
wants to interpret provenance information interoperably.  If you
feel that's a flaw you want to live with, that's fine.  I'd still
prefer it if things that you're not willing to define interoperably
weren't part of the standard in the first place -- it's always easier
to add something to a standard than to take something away.

Be that as it may: I'd hope interoperable interpretation of
provenance can live without agent-role, and so, again, semantics
won't block provenance.

But, very importantly:

> It is to be noted that there is no global VO reference dictionary from which
> we could get standard definitions for the 4 items we borrowed from others VO
> standards (Publisher, Provider, Creator and Contributor). This could get
> better once the future Vocabularies2.0 will be a REC.

No, it won't.  The Vocabularies in the VO standard defines a
*framework* in which to develop vocabularies in a way that, I hope,
will make that a process as painless as possible.  It will *not*
define any vocabularies.  On the contrary: it is the express goal of
the endeavour that we no longer define terms just because we think
we might need them one day -- that usually ends in deprecation and
confusion -- but instead define terms as a concrete need arises ("only
scratch where there's an itch").

That means that without groups willing to spend the extra effort to
make their word lists orthogonal and well-defined, exactly nothing
will improve.

Having said that, as I take off my semantics hat and turn into a
Registry activist: What exactly would you have expected as "standard
definitions" of Publisher or Creator?  VOResource (where these come
from) says in its schema:

Creator:
  The entity (e.g. person or organisation) primarily responsible for
  creating something

Publisher:
  Entity (e.g. person or organisation) responsible for making the
  resource available

They're certainly not optimal, yes, but in particular the definition
of publisher is, I feel, a good start.  

"Provider", on the other hand, isn't a term I'm familiar with in a VO
context -- where have you seen it?  I'd pretty much say that should
be "Publisher", no?

Finally, Contributor... Well, that's a painful thing even in
VOResource, and I seem to remember having asked around on the mailing
list for what we should do with it in the preparation for VOResource
1.1 (without any success if memory serves).  

DataCite has a definition for contributor that might work, but that
then has large overlaps with our existing creator and publisher
terms.  Hence, I believe we first need to understand what we'd like
contributor to do ("use cases") and then figure out how to bring that
in line with DataCite or other external reference points.

So, yes, the situation might justifiably be called something of a
mess.  But that will only improve if people start identifying sane
subsets (which IMHO creator, publisher, and contact are: we
understand what they denote and how they are to be used) and then
successively grow these islands of clarity.

Agent-role would have been a chance to make that happen.  And,
donning my semantics hat again, *I* would still like to use that
chance.

        -- Markus


More information about the semantics mailing list