UType proposals
Norman Gray
norman at astro.gla.ac.uk
Thu Jun 18 14:11:34 PDT 2009
Doug, hello.
On 2009 Jun 15, at 05:42, Douglas Tody wrote:
> On Thu, 11 Jun 2009, Norman Gray wrote:
>
>> http://nxg.me.uk/note/2009/utype-proposals/
>>
>> This document is not intended to be a counterproposal. I believe it
>> is at heart the same proposal as Mireille's, but arrived at from a
>> rather different direction, and so justified in a different way.
>> The principal syntactic difference is that the model-name Utype
>> elements are here references to a namespace, rather than regarded
>> as the namespace themselves.
>
> It seems to me we are covering ground again here that we have been
> over before. I do not see sufficient justification for trying to
> morph UTYPEs into URLs (or XPaths etc.), certainly nothing sufficient
> to change current use of UTYPEs drastically. We have been using
> UTYPEs in IVOA interfaces and implementations quite successfully for
> several years now. While there are details remaining to be specified
> and minor tweaks are possible, no compelling case has been made for
> a major departure from current UTYPE usage.
The problems with the current informal UTYPE usage are:
* The UTYPEs are unversioned -- there seems no provision for v1.1
of a UTYPE
* The UTYPEs are 'dead' strings -- they don't lead to documentation
or further machine-readable information. We live in a networked
world, and it seems perverse to ignore that.
* There is no underlying model for UTYPEs, beyond the vague
assertion that they 'point into a data model'. The current UTYPE
documents go into some detail about the punctuation within a UTYPE,
but don't even approach such basic questions as 'is this a property or
a type?' This means that things like the composite UTYPEs of
Mireille's draft (the ones with the semicolon, which I believe are
eminently defensible) are introduced without any framework for a
discussion of what is actually going on here. Without some such
framework, there is nothing ahead but muddle.
I can't speak for anyone else, but each one of these problems is
compelling to me at least.
> Mireille's draft on the
> other hand is already very close to both documenting current practice
> while clarifying the remaining details.
Although they should of course be informed by implementations,
standards do not exist merely to 'document current practice'.
I take it that a UTYPE standard is intended to be useful for the next
two to four decades of developments on the web, and larger, more
intricate, and more heterogeneous datasets in astronomy. A standard
which merely attempts to document the last few years of SSA
implementations is _not_ practical for the future.
> The key concept which I think is wrong here is the desire to be able
> to have each UTYPE be a self-contained, separable object (which is
> what the URL representation provides).
Who is proposing this? (Do you mean 'object' as in OOP?) If you
mean me, I think I must have explained myself very badly indeed.
> This is just not needed in real use
> cases as UTYPEs are only used to tag the individual properties of a
> more complex object. There are multiple such properties, each with
> its own UTYPE tag, for any such object, at least in any real world
> use case. We do not use such object properties (UTYPEs) as separate
> stand-alone objects, rather we use the object these object properties
> collectively refer to. In normal usage multiple such object
> properties (UTYPEs) will be needed to represent, understand and use
> the object.
Indeed. I believe that's exactly what I'm proposing.
As I intended to emphasise, my 'proposal' is in most respects
identical to Mireille's. There are only two differences. Firstly,
I'm aiming to describe what appears to be the underlying model for
UTYPEs, which therefore provides a rationale for them, and answers
questions such as 'what are UTYPEs?' and 'what is the equality
function for UTYPEs?'
Secondly, I'm suggesting that in a case such as
<VOTABLE xmlns:simdb='http://www.ivoa.net/dm/simdb/v1.0#'>
...
<param id='foo' utype='simdb:Simulated.Foo'>
xxx
</param>
</VOTABLE>
an application should act _as if_ the utype were
<http://www.ivoa.net/dm/simdb/v1.0#Simulated.Foo>. There's nothing
there about displaying that complete URL to the
user, or implying some object-oriented approach. Nothing more than
that; and this is effectively identical (apart from minor syntactic
considerations and the underlying rationale) to Mireille's proposal.
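A minimal sketch of that 'act as if' rule, in Python. The names expand_utype and same_utype are hypothetical, and treating equality as plain string comparison of the expanded URLs is an assumption for illustration, not something either draft specifies.

```python
def expand_utype(utype, prefixes):
    """Expand 'prefix:Local.Name' to namespace URL + local name.

    `prefixes` maps a prefix (e.g. 'simdb') to the namespace URL it was
    bound to by an xmlns:prefix declaration in the enclosing document.
    """
    prefix, sep, local = utype.partition(":")
    if not sep:
        raise ValueError(f"utype has no prefix: {utype!r}")
    if prefix not in prefixes:
        raise KeyError(f"undeclared prefix: {prefix!r}")
    return prefixes[prefix] + local


def same_utype(a, a_prefixes, b, b_prefixes):
    """Assumed equality function: two utypes are equal when they expand
    to the same URL, whatever prefix each document happened to choose."""
    return expand_utype(a, a_prefixes) == expand_utype(b, b_prefixes)
```

With this reading, 'simdb:Simulated.Foo' in one document and 's:Simulated.Foo' in another name the same thing as long as both prefixes are bound to the same namespace URL; the prefix itself carries no meaning.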
> Another issue is that UTYPEs are not merely hidden metadata that
> no one ever needs to look at. Rather they are a primary part of
> the (technical) user interface of the software and protocols we
> use for access to data and other objects. A client application
> for example would typically manipulate data models using UTYPEs
> (or their context-specific aliases) to access the attributes of an
> object instance. It is the *serialization* of the object (be it
> VOTable, FITS, a parameter set, etc.) that we want to hide from
> the developer writing code to manipulate some object.
I think we're on the same page here.
This is the notion -- am I correct? -- that it should be possible to
describe the state of some instance of a data model (which you're
calling an 'object' here, and which might well be an object in the OOP
sense) using a set of key-value pairs, where the keys are UTYPEs and
the values are literals such as strings or numbers. If so, then we're
definitely talking about the same thing, since that was goal 1 in my
'proposal' document.
> The UTYPE is
> the primary construct providing representation-independent access
> to the semantic content of an object instance, and is visible to
> the developer. Hence we do care what it looks like.
Yes, and I was careful to note that "UTypes should be reasonably
readable by a developer."
I'm afraid I just don't see how
<http://www.ivoa.net/dm/simdb/v1.0#Simulated.Foo> is significantly
less readable than "simdb:Simulated.Foo",
especially since the way that that would generally appear in a
VOTable, say, would be as "simdb:Simulated.Foo".
Presumably you're not suggesting that something like
"ssa:Char.TimeAxis.Coverage.Location.Value" (randomly picked from
Mireille's document) would appear in a user-facing UI. You mention
'context-specific aliases', but where is this alias to come from? How
is it to be associated with the long UTYPE? What are the bounds of
the 'context'? How is the 'context' named, described or retrieved?
If you mean 'display label' or something like it, then I have
described a principled, already-standardised and immediately
_practical_ mechanism for describing that and whatever else needs to
be associated with the UTYPE now and in the next couple of decades.
That can be deployed _today_. This wheel does not need to be
re-invented.
> While there might be some use in being able to look up some HTML for
> an individual UTYPE, it is much more important to be able to look up
> the documentation for the data model, since in general this is what
> we want to understand. In general it is not that useful to look only
> at an individual object property. Once we can look up a referenced,
> versioned data model there will be many ways we can get documentation
> for individual data model attributes, each with their UTYPE tag.
Yes, and in what I was describing there would be only one HTML
document -- the IVOA REC for a data model. The only extra structure
I'm suggesting in that is that each of the UTYPEs documented in there
is described within an HTML <a name='Simulated.Foo'> element. You put
the UTYPE into your browser and you get the original authoritative
documentation from the appropriate subsection of the REC; when you
want the context and the overall data model description, you just read
the rest of that same document.
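To illustrate how one such URL addresses both the whole document and a subsection of it, here is a small Python sketch using the standard library's urllib.parse.urldefrag. The function name split_utype_url is hypothetical.

```python
from urllib.parse import urldefrag


def split_utype_url(utype_url):
    """Split an expanded utype URL into (document URL, fragment).

    The document URL is what is fetched (the data model document);
    the fragment is the anchor name looked up inside that document.
    """
    document, fragment = urldefrag(utype_url)
    return document, fragment
```

For the example URL used in this message, the document part is the simdb v1.0 data model URL and the fragment is 'Simulated.Foo', which is the name the HTML anchor would carry.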
> It could be easy for example to auto-generate a URL for an individual
> UTYPE given the UTYPE and the URL of the data model.
Then why not simply require that applications act _as if_ that URL
were the name of the UTYPE?
> 2) the namespace reference and the
> individual object properties should be specified separately so that
> we do not duplicate the class reference in each object property
> (UTYPE), which aside from being unnecessarily verbose would make
> it much more difficult to ensure object integrity.
I don't see how
<VOTABLE xmlns:simdb='http://www.ivoa.net/dm/simdb/v1.0#'>
...
<param id='foo' utype='simdb:Simulated.Foo'>xxx</param>
<param utype='simdb:Simulated.Bar'>yyy</param>
....
</VOTABLE>
...is unnecessarily verbose or undermines object integrity. An
application is supposed to interpret this as a set of key-value pairs
which allow it to reconstruct an instance of a data model object (this
is correct, isn't it?). That is, I believe this should imply a table
in memory which will be the equivalent of
http://www.ivoa.net/dm/simdb/v1.0#Simulated.Foo xxx
http://www.ivoa.net/dm/simdb/v1.0#Simulated.Bar yyy
...
If this is too verbose, and an application wants to manage this table
differently, in order to save bytes of memory or something, then that
would be fine, as long as its effect is equivalent to this. I think
I'm missing the point at which this verbosity is a problem.
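As a sketch of how an application might build that in-memory table, the Python below reads a document shaped like the illustration above and collects expanded-utype/value pairs. The function name utype_table and the sample data are assumptions for illustration. Prefix scoping is deliberately simplified: every declaration seen so far is treated as in force.

```python
import io
import xml.etree.ElementTree as ET

# Sample input modelled on the illustration in this message.
SAMPLE = b"""<VOTABLE xmlns:simdb='http://www.ivoa.net/dm/simdb/v1.0#'>
  <param id='foo' utype='simdb:Simulated.Foo'>xxx</param>
  <param utype='simdb:Simulated.Bar'>yyy</param>
</VOTABLE>"""


def utype_table(xml_bytes):
    """Return {expanded utype URL: text value} for each <param> element."""
    prefixes = {}  # prefix -> namespace URL, filled from xmlns declarations
    table = {}
    events = ET.iterparse(io.BytesIO(xml_bytes), events=("start-ns", "end"))
    for event, payload in events:
        if event == "start-ns":
            prefix, uri = payload
            prefixes[prefix] = uri
        elif payload.tag == "param" and "utype" in payload.attrib:
            prefix, sep, local = payload.get("utype").partition(":")
            if not sep or prefix not in prefixes:
                continue  # skip utypes this sketch cannot expand
            table[prefixes[prefix] + local] = (payload.text or "").strip()
    return table
```

An application is free to store the prefix and local name separately, or intern the namespace string once; the only requirement suggested here is that the effect be equivalent to this table.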
If I have explained this 'proposal' poorly, I'm sorry, and would
welcome suggestions for improvement. I emphasise that I believe the
differences from Mireille's proposal are significant but slight, and
make UTYPEs ready for the future.
Best wishes,
Norman
--
Norman Gray : http://nxg.me.uk
Dept Physics and Astronomy, University of Leicester