discussion status & my recommendations

Ray Plante rplante at ncsa.uiuc.edu
Mon May 9 09:24:06 PDT 2005


Hey Kevin,

On Mon, 9 May 2005, KevinBenson wrote:
> The only one I am still unclear about is verificationLevel and not sure if I
> agree with the notion still that any registry can change this value (at
> least not on the vr:Resource).

I think, then, this would make an excellent topic of discussion in Kyoto.  
Let's plan to take it up.  

I take it from your reference to DataQuality that you have seen RMv1.1 
which Bob posted to the list this morning.  For those who have not, see 
http://www.ivoa.net/forum/registry/0505/1406.htm.  This document defines 
ResourceVerificationLevel in full detail.  

The important things to remember about verification are:

 1.  it requires work, particularly human work,
 2.  at its highest level, it is subjective work,
 3.  it should not be done by the person/registry that created the record.  

The lesson we are learning about curation (which is not new!) is that 
people have to take responsibility for it.  

Instead of trying to ensure verification is done uniformly the world
round, let registry administrators take responsibility at the level
that they can.  Those that put in the most work--that is, taking
verification to level 4 and doing it well--will be the most trusted sources
of registry records.  I see this as a Good Thing (see below).  And the
differences will only be relevant to applications that need to use
verificationLevel to select resources.

At the same time, it would be good if we all had similar guidelines of 
quality, and I would like us--the IVOA--to develop, publish, and use a 
common set of guidelines.  If we do, the differences in responses to 
queries using verificationLevel won't be all that great.  

Nevertheless, there may be very good, application-critical reasons to 
have different quality standards at different registries.  For example, 
having registry descriptions that are complete enough to use in a 
workflow is very important to Astrogrid.  Your quality standards can
reflect this.  Users building workflows may thus prefer to use 
Astrogrid's registry.  

> *Other problems that you need to make clear on how this would work out.
> If we state that a registry can keep the same value harvested from another
> full registry then (how or who) do you know gave the original
> verificationLevel?  (Or does this matter?)

You (the user) don't.  A registry that simply keeps the harvested value 
has no local quality standards of its own.  This would be the lazy option 
(and not particularly trustworthy).  It would be good for registries to 
actually document their verification practices.  

> Depending on how this is implemented, does this mean that if the
> publishing/original registry user updates a Registry entry to make it great
> going probably from a verificationLevel of 0 to 4, that he/she might have to
> start contacting other full Registries to re-evaluate that entry otherwise
> it might sit at 0 for a long time when it deserves to be a 3 or 4 (that
> would sound like a lot of work by each registry to have to go back and
> re-evaluate entries)

In general, a registry chooses when and how often it might apply some 
automated or human verification; however, when a record is updated, it 
should be re-verified.  A harvester will know this because a new version 
will come through the OAI pipe.  If the originating registry verifies a 
record and changes the verificationLevel, the harvester will see this as 
an update.  

If I were implementing verification, I would automatically apply the level 
2 checks for every new and updated record; this would essentially reset 
any new/updated record to no higher than level 2.  Then, when I did human 
inspection, I would look at all records having verificationLevels of 2 or 
3, perhaps starting with recent updates and/or those with level 2.  
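
As a rough illustration of that routine, here is a minimal Python sketch.  
The Record class, its field names, the automated_level argument (standing 
in for the outcome of whatever automated checks a registry runs), and both 
function names are hypothetical; none of them come from RMv1.1 or from any 
existing registry software.  

```python
from dataclasses import dataclass, replace

AUTOMATED_CEILING = 2  # automated checks alone never grant more than level 2


@dataclass(frozen=True)
class Record:
    identifier: str
    verification_level: int
    updated: str  # ISO 8601 date, so string order matches chronological order


def rescore_on_harvest(record: Record, automated_level: int) -> Record:
    """Re-score a new or updated record, ignoring its previous level.

    automated_level is the (hypothetical) result of the automated checks;
    the stored level is capped at AUTOMATED_CEILING.
    """
    new_level = min(automated_level, AUTOMATED_CEILING)
    return replace(record, verification_level=new_level)


def human_review_queue(records):
    """Return records at level 2 or 3: level 2 first, newest update first."""
    candidates = [r for r in records if r.verification_level in (2, 3)]
    candidates.sort(key=lambda r: r.updated, reverse=True)  # newest first
    candidates.sort(key=lambda r: r.verification_level)     # stable: 2 before 3
    return candidates
```

The two-pass sort relies on Python's sort being stable; whether level or 
recency should take priority is a local choice, and this ordering is only 
one possibility.  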

> A final comment is this makes some extra work to change the logic around to
> have the ability to change an external Resource (not owned/managed). And
> then figure out if you should take or not take the verificationLevel from
> another full registry. But yes probably could be done with not to much
> headache.

I would expect that the choice of accepting the verificationLevel set by 
the provider is a policy one made up front and not incorporated into your 
harvesting software (although you are free to do so).  That is, either you 
decide to blindly accept what is there for all records, or you do the 
verification yourself and score it appropriately independent of what the 
provider says (it's not really their call).  

If a registry does not wish to do any verification, then it really SHOULD 
just set the records to verificationLevel=0 which is what that level 
means.  In practice, I imagine that it would be hard not to at least do 
a level 1 check, which ensures compliance with the schemas.  
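
To make the policy choice concrete, here is a small Python sketch of the 
three options discussed above: accept the provider's value, verify locally, 
or do no verification and record level 0.  The Policy enum, the 
assign_level function, and the local_verify callback are all hypothetical 
names chosen for illustration, not part of any registry interface.  

```python
from enum import Enum
from typing import Callable, Optional


class Policy(Enum):
    ACCEPT_PROVIDER = "accept"  # keep whatever level the provider set
    VERIFY_LOCALLY = "verify"   # score the record ourselves
    NO_VERIFICATION = "none"    # do no checking at all


def assign_level(
    policy: Policy,
    provider_level: int,
    local_verify: Optional[Callable[[], int]] = None,
) -> int:
    """Return the verificationLevel to store for a harvested record."""
    if policy is Policy.ACCEPT_PROVIDER:
        return provider_level
    if policy is Policy.VERIFY_LOCALLY:
        if local_verify is None:
            raise ValueError("VERIFY_LOCALLY needs a local_verify callable")
        # The provider's value is deliberately ignored here.
        return local_verify()
    # No verification was done, so say so: level 0.
    return 0
```

The policy is passed in as a single up-front setting rather than decided 
record by record, which mirrors the idea that this is an administrative 
choice and not harvesting logic.  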

cheers,
Ray



