VOEvent session

Douglas Tody dtody at nrao.edu
Wed Dec 15 17:11:25 PST 2010


On Wed, 15 Dec 2010, Rob Seaman wrote:
>> Note that 'utype' is never used in the v2.0 spec, and 'data model' is used only in describing a <Reference>, I'm not sure where your argument is going with that.
>
> It's interesting to see where the VO "vision" is going.  In 2005, "XML Schema" was the answer to all questions.  Rather, now folks suggest that actually building a Schema as complex as the underlying data is unacceptable.  Instead we are supposed to have a simple Schema - perhaps just a sequence of Params - but with utype attributes that map (not clear if this is either surjective or injective) onto some ideal data model.  The complexity is now all in the utypes, and validating against the Schema doesn't actually tell you anything about the contents of the file.

The basic question is whether the science or the technology comes
first.  In the former case you pick some useful/relevant technology
(a matter of judgment of course) and do something practical to solve
some real-world problem.  In the latter case you pick SOAP, or complex
XML with schema verification, or uber-REST, etc. and several years
later find that there are unforeseen issues.  (So far VOEvent has done
pretty well despite this failure mode.)

There is no "ideal data model" - all data models are abstractions,
and are imperfect approximations.  Otherwise they become too complex
to be workable or generally useful.  Hence we have to resist the
temptation to try to be too precise.  This is hard as the whole point
of a data model is to provide some technical/physical rigor.

UTYPEs are actually trivial, and have no inherent meaning, unlike UCDs.
All they are is a unique reference to a single field/element of an
abstract data model, expressed as a fixed string.  They allow a more
complex model to be "parameterized" as a set of object attributes (up
to a point beyond which relational or more explicitly hierarchical
techniques are required - but by then the model may already be
too complex).  This allows the attributes of the data model to be
expressed in simple data structures such as parameter sets, table
columns, etc.  What complexity there is, is in the data model itself,
which has to be understood (either by a human or by code); in general
this cannot be adequately expressed in a schema or other lower level,
non-object mechanism.
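
As a rough illustration of the "parameterization" described above, here
is a minimal Python sketch.  Everything in it is an assumption made for
the example: the toy model (Observation, Position), the "toy:" prefix,
and the flatten() helper are hypothetical and are not taken from any
IVOA data model or specification.

```python
from dataclasses import dataclass, fields, is_dataclass

# Hypothetical toy data model; the class and field names are
# illustrative only and do not come from any IVOA specification.
@dataclass
class Position:
    ra: float    # degrees
    dec: float   # degrees

@dataclass
class Observation:
    position: Position
    exposure: float  # seconds

def flatten(obj, prefix):
    """Return {utype: value} for every leaf attribute of a dataclass instance.

    Each key is a fixed string naming exactly one field of the model,
    built by joining the attribute path with dots.
    """
    params = {}
    for f in fields(obj):
        value = getattr(obj, f.name)
        utype = f"{prefix}.{f.name}"
        if is_dataclass(value):
            # Nested object: recurse, extending the path.
            params.update(flatten(value, utype))
        else:
            params[utype] = value
    return params

obs = Observation(Position(ra=83.63, dec=22.01), exposure=30.0)
params = flatten(obs, "toy:Observation")

for utype, value in params.items():
    print(utype, value)
```

The resulting flat dictionary could be written out as a parameter set
or as table columns.  The strings themselves carry no meaning; a reader
still has to know the model to interpret "toy:Observation.position.ra".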

The key thing here is that data model + UTYPE addresses the real
problem of modeling scientific data; we (astronomy) can make it
work ourselves, which is an enormous advantage over any more general
mechanism, even if it is crude in many ways.  Something like XML schema
cannot ever do this as it is lower level and more general; this has
advantages in other ways but does not directly address our problem.

 	- Doug

