[Ops] Draft concept for standard validator output

Tom McGlynn (NASA/GSFC Code 660.1) tom.mcglynn at nasa.gov
Thu May 12 22:04:54 CEST 2016


I've littered this message with responses.  While I am kind of excited 
by the possibilities that hierarchies might afford, and I hope that 
shows below, getting some level of agreement is more important than any 
particular framework.
     Tom

Mark Taylor wrote:
> On Thu, 12 May 2016, Markus Demleitner wrote:
>
>> On Thu, May 12, 2016 at 03:02:01AM -0400, Tom McGlynn (NASA/GSFC Code 660.1) wrote:
>>> Attached find a sketch of a possible standard output format for validators.
>>> It uses a relatively simple but recursive structure to accommodate both
>> I refer to my personal Gospel, the Zen of python:
>>
>>    $ python -c "import this" | grep nested
>>    Flat is better than nested.
> I tend to agree with Markus in this case that flatness is preferable,
> though I might be persuaded otherwise (it's more a call for validator
> consumers than validator producers).

I think this is one key point.  The validator writer should always be 
perfectly at liberty to write a linear array of test statuses.  The issue 
is whether we might want to organize the messages in some way and, if so, 
whether a validator consumer would want to see such a structure.  Were I 
rendering a report from a service like TAPlint, which may have a myriad 
of different errors, being able to provide some structure might make it 
much easier for users to understand what is going on.  I'd also suggest 
that TAPlint exemplifies a pretty standard approach to validation: we 
break up validation into parts and do a set of tests for each part.

In three of the four use cases the validation is strongly structured.  
That's manifest when we are validating a sequence of services, and I've 
discussed TAPlint above, but it's also true for all of the simple DAL 
validators.  These do at least three sets (in ESA's case four) where 
they run different queries (a standard query, a metadata query, an 
erroneous query, a null query) and run a series of tests on each query.

When presenting the results it might be nice to be easily able to 
summarize it as:
      Standard query:  0 fail
      Error query:     2 fail
      Metadata query:  0 fail

Of course we don't want to mandate that validators provide an 
organization that allows this, but I'd like to leave the door open!
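
To make that concrete, here is a rough sketch of what such a grouped 
report might look like if it were serialized as JSON (written as 
TypeScript so the summary code can sit alongside it; every field name, 
service URL and message ID below is illustrative only, not a proposal):

    // Sketch of a grouped report; "tests" carries the optional recursion.
    interface Report {
        name: string;                       // test or group identifier
        result: "pass" | "warn" | "fail";   // aka status
        resultText?: string;                // human-readable message
        protocolSection?: string;           // pointer into the standard
        tests?: Report[];                   // optional sub-reports
    }

    const report: Report = {
        name: "DAL validation of http://example.org/sia",  // hypothetical service
        result: "fail",
        tests: [
            { name: "Standard query", result: "pass" },
            {
                name: "Error query",
                result: "fail",
                tests: [
                    {
                        name: "ERRDOC-1",    // invented message ID
                        result: "fail",
                        resultText: "Error response is not a valid VOTable",
                        protocolSection: "error-response section of the standard"  // illustrative
                    }
                ]
            },
            { name: "Metadata query", result: "pass" }
        ]
    };

    // The per-group summary above is then a couple of lines over the top level:
    for (const group of report.tests ?? []) {
        const nFail = (group.tests ?? []).filter(t => t.result === "fail").length;
        console.log(`${group.name}: ${nFail} fail`);
    }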

>
>> I think all of these can essentially be represented as a sequence
>> of
>>
>>    <test_id>, <test_description>, <test_result>, [<see_also>]
I think those are name, resultText, result [and protocolSection], 
presumably in the first sublevel.
>>
>> (where <see_also> could be a link to the spec or a page discussing
>> how to fix things) plus some "global" (e.g., VOTable PARAM)
>> parameters:
>>
>> <validator id>, <validator description>, <global result>
Other than the validator description, I think all of those are also 
included; the global result is just the result at the top level.
>>
>> with <global result> one of success, warnings, errors (or whatever).
> The {<test-id>,<test-description>,<test-result>}[] model doesn't fit
> too well with the way that I write validators.
> I don't really think in terms of testing a feature,
> determining pass/fail, and moving on.  Instead I hit the service
> in whatever ways I can think of, look for failures or anomalies,
> and report them.
>
> So typically my testing of a given behaviour or standard requirement
> is not done in one place, it is distributed over large parts of
> the validator execution.  For instance when votlint (standalone or
> within taplint) validates a VOTable, every time it encounters a
> TD element it checks that the content matches the corresponding
> FIELD declaration.  Clearly, it's not a good idea to emit a
> test=pass message each time.
Absolutely.  There is no reason to emit a test=pass message ever, unless 
the validator provider wants to.

> It would be possible for me to gather all the messages in that
> category for a single later report but (a) it would be a pain in
> implementation terms and (b) it could impact on scalability
> (streaming very large VOTables).

What you're saying here is that you wouldn't want to provide a hierarchy 
where the error messages were sorted by type.  You might want a hierarchy 
where the messages are grouped by RESOURCE or whatever is natural.  You 
may emit classes of errors in some potentially repeating sequence of 
resources.  Or, if there is no obvious structure, you would simply emit 
messages as they seem appropriate.
>
> What would work for me?
I'm happy to use these names.
>
>     message-type (info, error, warning, ...)
aka, status
>     message-ID (implementation specific, some repeatable tag)
aka, name
>     message-text (human-readable, free-form)
aka, resultText
>     see-also (structured? pointer to standard name, version?, section?)
aka, protocolSection
> This does not preclude the feature-test->report model, since one
> could write
>
>     message-type = PASS
>     message-ID = ....
>     message-text = Tested feature XXX, no problems
>     see-also = VOFoo, v3.9, sec 10.1.1
>
> but a straight map of taplint or votlint into this model would not
> emit many message-type=PASS messages.
Nor need it.  In defining this structure we are making no statement 
about what messages a validator provider is going to provide.
> Mark
>
> -

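To pin down the name mapping above, a single message under either set of 
names would be the same record; something like the following (the values 
are invented purely for illustration):

    // One message using the "aka" names; all values are made up.
    const message = {
        status: "error",                      // Mark's message-type
        name: "TD.BADVALUE",                  // message-ID, implementation specific
        resultText: "TD content does not match the declared FIELD datatype",
        protocolSection: "VOTable, TD content rules"   // see-also, illustrative
    };
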
One of the things that I really like about the recursive idea is that I 
can make composite validators trivially.  E.g., we're clearly not doing 
enough checking of the VOTable format in the NCSA/HEASARC validators.  
An obvious solution would be to run votlint as well as the DAL validator 
code.  If I allow a hierarchy of the type I suggest, then the report 
from this validator is practically nothing but the separate reports of 
the two validators copied byte-by-byte, prefixed by ~5 lines that show 
the combined assessment of the tests.
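
A minimal sketch of that composition, reusing the Report shape sketched 
earlier (the function and report names are placeholders, not any real 
interface):

    // Hypothetical wrapper: take the two sub-reports exactly as emitted
    // and hang them, unchanged, under a new top-level node.
    function combine(votlintReport: Report, dalReport: Report): Report {
        const subReports = [votlintReport, dalReport];
        return {
            name: "Combined DAL + VOTable validation",
            // the composite fails if either sub-report fails
            result: subReports.some(r => r.result === "fail") ? "fail" : "pass",
            tests: subReports
        };
    }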

Or suppose I wanted to build a registry validator and also validate all 
of the IVOIDs in the registry.  For each registry entry I might have 
some special tests, but I'd be able to glom in Markus' IVOID validator 
results.

One could archive the validation state of a given site or of all 
registered VO resources.

In the other direction, when testing validators, we may easily be able 
to focus on the error messages emitted by changed code if these messages 
are associated with a given group.

This ability to composite validators falls out without any work on our 
part.  It opens a myriad of possibilities for people to reuse existing 
validators.  That might make it easier to focus validator development on 
key new features in standards and to use battle-tested existing tools 
for the underlying protocols.

But suppose you hate reading hierarchies and want to get a table.  As I 
discussed with Markus, a few lines of JavaScript (for JSON) or XSLT (for 
XML) can easily render the deepest hierarchy as a table.  I hate writing 
XSLT myself, but I think even I could put a little tool on the web to do 
this.  [I don't mind writing JavaScript at all -- so this would be one 
area where I'd prefer JSON.]
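
Roughly the few lines I have in mind, again against the Report shape 
sketched earlier:

    // Flatten an arbitrarily deep report into table rows; each row can
    // then be rendered as an HTML <tr> or printed as text.
    function flatten(node: Report, path: string[] = []): string[][] {
        const here = [...path, node.name];
        const row = [here.join(" / "), node.result, node.resultText ?? ""];
        return [row, ...(node.tests ?? []).flatMap(child => flatten(child, here))];
    }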

Just a few thoughts on why I like the idea of an [optionally] 
hierarchical format.



