Schemas (and utypes)
Arnold Rots
arots at head.cfa.harvard.edu
Tue Jul 21 12:08:24 PDT 2009
Norman,
I see your point and it's not unreasonable. However, it all comes down
to how much faith one has in members of the community to do things right.
I suspect I am less trusting and more cynical than you are - maybe
because I'm older and have seen more bad habits? Who knows...
In my judgment the attitude of "I know what I mean and so should you,
so I shouldn't have to spell it all out" is just too prevalent.
It took a while, but at this point people have pretty much accepted
the fact that if they exchange FITS files, they should include a
complete and valid WCS section. Why should we accept less?
Cheers,
- Arnold
Norman Gray wrote:
>
> Arnold, hello.
>
> On 2009 Jul 2, at 16:51, Arnold Rots wrote:
>
> > You touch on one of the central issues that have made me very
> > uncomfortable with Utypes (but I assume everyone is well aware that I
> > don't like them). See below.
>
> I've taken the liberty of adjusting the subject line here, partly (and
> _very_ importantly) in order to keep this separate from the ongoing
> what-is-a-utype discussion, but also because I believe your points
> touch on a larger and interesting issue to do with schemas in
> general. By 'schemas' here, I mean RDBMS or XML Schemas (in RDF
> 'schema' means something different).
>
> >> This is presuming that ns:target.class isn't one of those utypes that
> >> only makes sense when it's coordinated with a set of other utypes from
> >> the same model (the goal 1 of utypes, as I understand it). If it
> >> makes sense by itself, then that's excellent, it means that it's been
> >> artfully repurposed here, and an application can reliably/safely
> >> understand this bit of XML without necessarily having heard of the
> >> <whatisit> element before.
> >
> > This is the crux of the matter. A model never consists of a single
> > item. It is usually described by a set of information items (for lack
> > of a better term) that together convey the full meaning that the
> > author intends to convey.
>
> I agree with that to a pretty good approximation. However, a key
> point in your remark is "the full meaning that the author intends to
> convey", to which we can add "the full meaning that the reader intends/
> hopes/aspires to extract", which may be very different.
>
> > The problem with Utypes is that it allows cherry picking of
> > information items with no guarantee that the information is complete,
> > or even makes sense. Consistency, completeness, and uniqueness have
> > been abandoned.
>
> You say "cherry picking", I say "loose coupling". I want to argue
> that utypes, like simple schemas, do indeed "[allow] cherry picking of
> information items with no guarantee that the information is complete,
> or even makes sense", but that this is not a practical problem.
>
> I presume you're thinking of the consistency which the STC schema
> provides, by virtue of its _syntactical_ insistence that all the
> elements of a point's coordinates (for example) are included in a
> message. I recall watching STC discussions on the virtues or vices of
> defaulting versus explicit 'not known' remarks, and as you know I'm
> aware of many of the complications of specifying astronomical
> coordinate systems.
>
> In the more-or-less loosely coupled network environment we're all
> talking about, which is too complicated for one-size-fits-all rules, I
> believe that this level of syntactical specification adds consistency
> at the expense of adding brittleness and unnecessary complication.
> That is because, ultimately, the schema doesn't add much value to the
> message: if there are relevant information items missing from the
> message then it is the consuming application -- and _only_ the
> consuming application -- which is competent to say so, and to default,
> fail, or respond appropriately to the originator. Further, a message
> could pass even the most stringent syntactic validation and still be
> nonsense as far as the application is concerned.
>
> Thus schemas can act as sanity-checks and no more. They don't
> realistically relieve the consuming application from any
> responsibility for error-checking.[1]
>
> What that means in turn is that the _real_ role of schemas and utypes
> is a fairly modest one, concerned simply with indicating which parts
> of a message are to be identified as what, at a syntactic level or not
> much higher (this is the intuition behind "a pointer into a data
> model").
>
> The job of reassembling all these information items into a datamodel
> instance, ontology, java-object, FITS file or whatever you want, is a
> job which happens at a different layer, and it's in that layer that
> appropriate cherry-picking will be accepted, and inappropriate cherry-
> picking rejected, depending on the needs of the application that's
> doing the reassembling. The utype model is therefore a good match to
> a world of heterogeneous applications, data and uses (my suggestions
> are intended to make this good match better, but the utype model is a
> good one nevertheless).
>
> Best wishes,
>
> Norman
>
>
> [1] I wouldn't go as far as to say that schemas are useless. I can
> see that there are some situations where code-generation is useful,
> and they can provide for contract checking ("whose fault is it that
> this message couldn't be parsed?"), but they don't have the semi-
> magical properties that would warrant the amount of interop agony
> sustained when arguing over them.
>
>
> --
> Norman Gray : http://nxg.me.uk
> Dept Physics and Astronomy, University of Leicester, UK
>
--------------------------------------------------------------------------
Arnold H. Rots Chandra X-ray Science Center
Smithsonian Astrophysical Observatory tel: +1 617 496 7701
60 Garden Street, MS 67 fax: +1 617 495 7356
Cambridge, MA 02138 arots at head.cfa.harvard.edu
USA http://hea-www.harvard.edu/~arots/
--------------------------------------------------------------------------
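
The distinction argued over in this thread, between a schema's syntactic completeness check and the consuming application's own judgment, can be sketched roughly as follows. This is only an illustration under stated assumptions: the utype names (ex:pos.lon, ex:pos.lat, ex:pos.frame, ex:target.name), the required set, and the two consumers are hypothetical and are not taken from STC or any real data model.

```python
# Hypothetical utype names; none come from STC or any real data model.
REQUIRED_BY_SCHEMA = {"ex:pos.lon", "ex:pos.lat", "ex:pos.frame"}


def passes_schema(message):
    """Syntactic check only: every item the schema requires is present."""
    return REQUIRED_BY_SCHEMA.issubset(message)


def cone_search_accepts(message):
    """Application-level check for a consumer that needs a usable sky position."""
    try:
        lon = float(message["ex:pos.lon"])
        lat = float(message["ex:pos.lat"])
    except (KeyError, ValueError):
        return False
    return 0.0 <= lon < 360.0 and -90.0 <= lat <= 90.0


def name_resolver_accepts(message):
    """A different consumer that only needs the target name."""
    return bool(message.get("ex:target.name"))


# Complete as far as the schema is concerned, but the latitude is impossible.
complete_but_nonsense = {
    "ex:pos.lon": "10.5",
    "ex:pos.lat": "120.0",
    "ex:pos.frame": "ICRS",
}

# Fails the schema check, yet is all that the name resolver needs.
incomplete_but_useful = {"ex:target.name": "M31"}
```

Under these assumptions the first message passes the schema check and is still rejected by the cone-search consumer, while the second fails the schema check and is still accepted by the name resolver. Whether the schema should nevertheless insist on completeness, as with a valid WCS section in a FITS file, is exactly the point in dispute above.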