GWS Splinter: XSD Schema Evolution

Mark Taylor M.B.Taylor at bristol.ac.uk
Thu Jul 2 23:44:32 CEST 2015


Hi grid,

I'm not sure if this should be part of the UWS 1.1 RFC or part of
the "XSD Schema Evolution" discussions within the GWS WG,
so I'm posting it to the grid list anyway.

I wasn't in the schema evolution splinter meeting at Sexten,
but I see this summary, which basically looks like good sense:

On Thu, 18 Jun 2015, Brian Major wrote:

> Hi grid,
> 
> I'm happy to say that the discussion on XSD schema evolution resulted in
> agreement on method.  In a nutshell:
> 
> 1.  Clients are to do best-effort parsing (discourage schema validation)
> 2.  The XSD namespace is to be that of the major revision (x.0)
> 3.  For version discovery, the precise version shall be included in the
> root element of the XSD and XML documents.
> 4.  Tests & validators are still to use XML validation
> 
> Paul Harrison has graciously offered to write up a note on our discussions
> that will be sent out for review.
> 
> Brian

But I can see at least one negative consequence of this, and I'm not
sure whether it's intended.

Looking at the UWS 1.1 PR currently in RFC, I see the preamble to
Appendix B (the schema):

   "Note that this schema can be found on-line at
    http://www.ivoa.net/xml/UWS/v1.0 (i.e. the target namespace can also
    be used as a URL for the schema.) This location should represent the
    definitive source, the schema is reproduced below only for completeness
    of this document."

That text, and the URL, have not changed since UWS 1.0, but the
content of the schema has.  I think this is in accordance with the
schema evolution manifesto above.

Now the namespace stays the same between versions, but the
content of the schema at the standard URL holding the schema
changes at the epoch when the new version is Recommended.
That means that UWS documents that are
currently valid relative to the schema with the namespace
http://www.ivoa.net/xml/UWS/v1.0, retrievable from
http://www.ivoa.net/xml/UWS/v1.0, may, following adoption of
UWS 1.1 as per the current PR, become invalid.
It's not the same schema between 1.0 and 1.1, but it's pretty
hard to work out how to refer to it in a way that makes that clear.

I haven't been through the diff of the 1.0 and 1.1-PR versions
of the schema with a fine toothcomb, but I can see at least
one incompatibility: in the 1.1-PR version, the "version"
attribute of the job and jobs elements is required, so
older instances of this document (which lack that attribute)
are invalid against the new version of the schema.
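
To make the incompatibility concrete, here is a minimal Python sketch.
It is not a real XSD validator: the REQUIRED_ROOT_ATTRS table is a
hypothetical stand-in that models only the one rule discussed above
(whether "version" is required on the root element), and the sample
job document is invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the two schema versions. Only the rule about
# the root element's "version" attribute is modelled; everything else in
# the real schemas is ignored.
REQUIRED_ROOT_ATTRS = {
    "1.0": set(),
    "1.1-PR": {"version"},
}

def missing_required_attrs(xml_text, schema_version):
    """Return the required root attributes that the document lacks."""
    root = ET.fromstring(xml_text)
    # root.attrib holds ordinary attributes only; xmlns declarations
    # are not included by ElementTree.
    return sorted(REQUIRED_ROOT_ATTRS[schema_version] - set(root.attrib))

# An invented 1.0-style job document with no "version" attribute.
OLD_JOB = (
    '<uws:job xmlns:uws="http://www.ivoa.net/xml/UWS/v1.0">'
    "<uws:jobId>abc</uws:jobId><uws:phase>COMPLETED</uws:phase>"
    "</uws:job>"
)
```

With these assumptions, missing_required_attrs(OLD_JOB, "1.0") returns
an empty list, while missing_required_attrs(OLD_JOB, "1.1-PR") returns
["version"]: the same document, in the same namespace, passes one
version of the rules and fails the other.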

Is this the intention, or at least an acknowledged consequence
of the schema evolution manifesto?

Note that not all schema changes need be harmful in this way
(invalidating previously valid document instances).
For instance the introduction of the new ARCHIVED ExecutionPhase
enumeration element in UWS does not invalidate any old XML instance.
The "version" attribute could be de-fanged in this respect by
making it optional in the 1.1 schema.  Would it be a good idea
to:

   (a) make the job/jobs "version" attribute optional in UWS 1.1
       (its absence could reasonably be interpreted as intending v1.0)

   (b) try to ensure that other UWS 1.0->1.1 changes are similarly
       harmless as far as invalidating existing 1.0 documents goes
       (I'm not sure if this is in fact feasible)

and generally adopt these practices for minor version updates of
other schemas?
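
For what it's worth, suggestion (a) is cheap on the client side.
The following is a minimal sketch of a best-effort reader, assuming
the "version" attribute is unqualified and sits on the job or jobs
root element; the function name job_version and the sample documents
are hypothetical, not taken from the UWS text.

```python
import xml.etree.ElementTree as ET

UWS_NS = "http://www.ivoa.net/xml/UWS/v1.0"

def job_version(xml_text, default="1.0"):
    """Best-effort version discovery for a UWS job or jobs document.

    A missing "version" attribute is interpreted as the default (1.0),
    as proposed in (a); no schema validation is attempted.
    """
    root = ET.fromstring(xml_text)
    expected = {f"{{{UWS_NS}}}job", f"{{{UWS_NS}}}jobs"}
    if root.tag not in expected:
        raise ValueError(f"not a UWS job/jobs document: {root.tag}")
    return root.get("version", default)

# Invented sample documents for illustration.
OLD_DOC = f'<uws:job xmlns:uws="{UWS_NS}"><uws:jobId>a</uws:jobId></uws:job>'
NEW_DOC = f'<uws:jobs xmlns:uws="{UWS_NS}" version="1.1"/>'
```

Here job_version(OLD_DOC) gives "1.0" and job_version(NEW_DOC) gives
"1.1", so a client following item 1 of the summary can cope with both
generations of document without consulting the schema at all.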

If the answer is just: we really don't care about schema validation
any more so ignore invalid documents, that's OK, but I'd just
like to check that's it.  (That would make life a bit more fiddly
for the "Tests & validators" in item 4 above, but probably it's
manageable).

Mark

PS apologies if this was all thrashed out in Sexten and the answers
   are all ready to read in Paul's impending Note

--
Mark Taylor   Astronomical Programmer   Physics, Bristol University, UK
m.b.taylor at bris.ac.uk +44-117-9288776  http://www.star.bris.ac.uk/~mbt/

