UType proposals

Sun Jun 14 21:42:24 PDT 2009

On Thu, 11 Jun 2009, Norman Gray wrote:

> For your delectation and delight, I present a concrete version of the UType 
> proposal I briefly discussed with some people in Strasbourg two weeks ago.
>
>   http://nxg.me.uk/note/2009/utype-proposals/
>
> This document is not intended to be a counterproposal. I believe it is at 
> heart the same proposal as Mireille's, but arrived at from a rather different 
> direction, and so justified in a different way. The principal syntactic 
> difference is that the model-name Utype elements are here references to a 
> namespace, rather than regarded as the namespace themselves.

It seems to me we are covering ground again here that we have been
over before.  I do not see sufficient justification for trying to
morph UTYPEs into URLs (or XPaths etc.), certainly nothing sufficient
to change current use of UTYPEs drastically.  We have been using
UTYPEs in IVOA interfaces and implementations quite successfully for
several years now.  While there are details remaining to be specified
and minor tweaks are possible, no compelling case has been made for
a major departure from current UTYPE usage.  Mireille's draft, on the
other hand, is already very close to both documenting current practice
and clarifying the remaining details.

The key concept which I think is wrong here is the desire to be able
to have each UTYPE be a self-contained, separable object (which is what
the URL representation provides).  This is just not needed in real use
cases as UTYPEs are only used to tag the individual properties of a
more complex object.  There are multiple such properties, each with
its own UTYPE tag, for any such object, at least in any real world
use case.  We do not use such object properties (UTYPEs) as separate
stand-alone objects; rather, we use the object to which these
properties collectively refer.  In normal usage multiple such object
properties (UTYPEs) will be needed to represent, understand and use
the object.  If all one wants is a simple stand-alone value, some
other construct such as a UCD may be used.

Since with UTYPEs we are dealing with data model instances with
multiple properties, I would much rather have the URL refer to the
object we are dealing with, than the individual object property.
Hence the URL refers to the entire class definition, defining a
context (name space) for the associated object properties (e.g.,
"ssa:Target.Name").  Versioning is for the entire object instance.
This is not only simpler, since it avoids duplicating the same URL in
each UTYPE, but it also helps ensure object integrity, as the mechanism
requires that all UTYPEs sharing the same namespace relate to the same
object instance.
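
To make the idea concrete, here is a rough sketch in Python of a
namespace declared once per data model, with each UTYPE carrying only
a prefix.  The URL, the NAMESPACES table and the function names are
made up for illustration and are not taken from any IVOA document.

```python
# Hypothetical namespace table: one entry per data model (class),
# not one URL per UTYPE. The URL is a placeholder.
NAMESPACES = {"ssa": "http://example.org/dm/ssa-v1.0"}


def split_utype(utype):
    """Split 'ssa:Target.Name' into ('ssa', 'Target.Name')."""
    prefix, sep, path = utype.partition(":")
    if not sep or not prefix or not path:
        raise ValueError(f"UTYPE has no namespace prefix: {utype!r}")
    return prefix, path


def model_url(utypes, namespaces=NAMESPACES):
    """Return the single data model URL shared by all given UTYPEs.

    Raises ValueError if the UTYPEs mix prefixes or use a prefix that
    has not been declared, which is the integrity check argued for.
    """
    prefixes = {split_utype(u)[0] for u in utypes}
    if len(prefixes) != 1:
        raise ValueError(f"expected one namespace, got: {sorted(prefixes)}")
    (prefix,) = prefixes
    if prefix not in namespaces:
        raise ValueError(f"undeclared namespace prefix: {prefix!r}")
    return namespaces[prefix]
```

With the URL folded into every UTYPE there would be nothing to stop
two properties of the same instance pointing at different versions of
the model; here that case is rejected in one place.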

Another issue is that UTYPEs are not merely hidden metadata that
no one ever needs to look at.  Rather they are a primary part of
the (technical) user interface of the software and protocols we
use for access to data and other objects.  A client application
for example would typically manipulate data models using UTYPEs
(or their context-specific aliases) to access the attributes of an
object instance.  It is the *serialization* of the object (be it
VOTable, FITS, a parameter set, etc.)  that we want to hide from
the developer writing code to manipulate some object.  The UTYPE is
the primary construct providing representation-independent access
to the semantic content of an object instance, and is visible to
the developer.  Hence we do care what it looks like.
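
As an illustration only, the following Python sketch shows what such
representation-independent access might look like to a developer.
The class, the two loader functions and the keyword-to-UTYPE mapping
are all hypothetical; no existing client library is being described.

```python
class ModelInstance:
    """Attributes of one data model instance, keyed by UTYPE."""

    def __init__(self, values):
        self._values = dict(values)

    def get(self, utype, default=None):
        return self._values.get(utype, default)


def from_params(params):
    """Build an instance from (utype, value) pairs, for example as
    they might be read from the PARAMs of a VOTable."""
    return ModelInstance(params)


def from_keywords(header, keyword_to_utype):
    """Build an instance from a FITS-like keyword/value header, given
    a (hypothetical) mapping from keyword to UTYPE."""
    return ModelInstance(
        (keyword_to_utype[key], value)
        for key, value in header.items()
        if key in keyword_to_utype
    )


# The same attribute, reached the same way from two serializations.
a = from_params([("ssa:Target.Name", "M31")])
b = from_keywords({"OBJECT": "M31"}, {"OBJECT": "ssa:Target.Name"})
same_name = a.get("ssa:Target.Name") == b.get("ssa:Target.Name")
```

The calling code sees only the UTYPE string, which is why its
readability matters.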

While there might be some use in being able to look up some HTML for
an individual UTYPE, it is much more important to be able to look up
the documentation for the data model, since in general this is what
we want to understand.  In general it is not that useful to look only
at an individual object property.  Once we can look up a referenced,
versioned data model there will be many ways we can get documentation
for individual data model attributes, each with their UTYPE tag.
It would be easy, for example, to auto-generate a URL for an individual
UTYPE given the UTYPE and the URL of the data model.
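
One possible way to do that auto-generation, sketched in Python.  The
convention of using the property path as a URL fragment is purely an
assumption on my part, as are the function name and the placeholder
URL; other conventions would work equally well.

```python
from urllib.parse import quote, urldefrag


def utype_doc_url(utype, model_url):
    """Derive a documentation URL for one UTYPE from the data model URL.

    Assumed convention: the part of the UTYPE after the prefix becomes
    the fragment identifier of the data model URL.
    """
    # rpartition leaves the whole string in the last slot when no ':'
    # is present, so un-prefixed UTYPEs are handled as well.
    _, _, path = utype.rpartition(":")
    base, _ = urldefrag(model_url)  # drop any existing fragment
    return f"{base}#{quote(path)}"


url = utype_doc_url("ssa:Target.Name", "http://example.org/dm/ssa-v1.0")
# url == "http://example.org/dm/ssa-v1.0#Target.Name"
```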

To summarize: 1) the URL should refer to the entire object (class
to be precise) and not to each individual object property (hence
the URL defines a namespace); 2) the namespace reference and the
individual object properties should be specified separately so that
we do not duplicate the class reference in each object property
(UTYPE), which, aside from being unnecessarily verbose, would make
it much more difficult to ensure object integrity.  How we define a
namespace reference will in general depend upon the context (e.g.,
xmlns in VOTable, but other mechanisms are possible).
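
For the VOTable case, a minimal Python sketch of picking up the
namespace reference from xmlns and the UTYPEs from the individual
elements.  The document fragment is invented and heavily simplified,
and the namespace URL is a placeholder.

```python
import io
import xml.etree.ElementTree as ET

# Made-up, simplified fragment; the namespace URL is a placeholder.
DOC = b"""<VOTABLE xmlns:ssa="http://example.org/dm/ssa-v1.0">
  <RESOURCE>
    <PARAM name="TargetName" datatype="char" arraysize="*"
           utype="ssa:Target.Name" value="M31"/>
  </RESOURCE>
</VOTABLE>"""


def read_utypes(xml_bytes):
    """Return (namespaces, values): the prefix -> URL declarations and
    the utype -> value pairs found in the document."""
    namespaces = {}
    values = {}
    events = ("start-ns", "start")
    for event, item in ET.iterparse(io.BytesIO(xml_bytes), events=events):
        if event == "start-ns":
            # For namespace events the item is a (prefix, uri) tuple.
            prefix, uri = item
            namespaces[prefix] = uri
        elif "utype" in item.attrib:
            values[item.get("utype")] = item.get("value")
    return namespaces, values


namespaces, values = read_utypes(DOC)
```

The class reference appears exactly once, in the xmlns declaration,
and each PARAM carries only the short prefixed UTYPE.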

 	- Doug