utype for STC region in SIAP query response

Thu Dec 22 08:04:22 PST 2011

Doug, hello.

On 2011 Dec 20, at 01:35, Douglas Tody wrote:

> I hope the holidays are treating you well.

Here's hoping; and to you, too.  I'm finishing up for the New Year break this afternoon, so I'll have to bow out of this discussion with this message.

I don't really have a horse in this race, so I participate in the discussion with some diffidence (possibly more diffidence than may be obvious...).  My remarks below are rather general, therefore, and are intended to express my general uneasiness with the overall shape of the DM proposals.  Sans horse, I can't reasonably go much further than that.

>> and suppose that the 'foo' namespace was standardised after the
>> application itself was written, or since it was last updated.  Should
>> that application treat this as a footprint?
> 
> In the first place lets not forget that the problem we need to solve
> here is fairly limited.  VO only requires half a dozen or so interfaces
> for the major classes of data (at least for observation-based data).  We
> have nearly all of them now and they are already fairly consistent.  We
> can eventually make them fully consistent in terms of the standard
> metadata and submodels (Char etc.) included in each object class.  It
> does not have to be all that complicated and the mechanism should not be
> allowed to get too complicated, at least not for the standard core
> metadata.

The argument from practicality is a strong one, and comes down to judgements about the relative weight of different downsides.

Having said that, I'm looking at the draft UTYPE specification at <http://www.ivoa.net/internal/IVOA/Utypes/WD-Utypes-0.4-20091107.pdf> and can see UML (therefore object-oriented notions of inheritance), XML Schemas, some quasi-XPath, a partial conflation of data modelling and interface design, and a novel conceptual and syntactical framework for composing names which are derived from UML models, minus an associated semantics, but plus a sketched abbreviation syntax.  I broadly understand why each of those things is there, historically, but the end result is _scarily_ complicated.

If the things I've suggested in the past sound half as complicated as the UTYPE draft, then I've been explaining them very badly.

Perhaps this doesn't matter.  If everyone in the IVOA knows roughly what all the UTYPEs mean, and uses them by spotting substrings in @utype attributes, then this may end up stable in the long term.  A logically similar mechanism long worked for FITS, though FITS was hampered by shorter names, and a more ad hoc approach to generating and documenting them.  The informality does, however, limit what you can do with the terms in the future, to roughly what you can do with them now.
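
To make 'spotting substrings in @utype attributes' concrete, here is a minimal Python sketch of the sort of thing I have in mind.  The table fragment, the 'foo:' prefix, the utype strings and the function name fields_matching are all invented for illustration; none of them is taken from the UTYPE draft or from any DAL specification.

```python
import xml.etree.ElementTree as ET

# Hypothetical VOTable-like fragment: the 'foo:' prefix and the utype
# strings are made up for this example, not drawn from any standard.
SAMPLE = """
<TABLE>
  <FIELD name="region" utype="foo:Char.SpatialAxis.Coverage.Support.Area"/>
  <FIELD name="ra" utype="foo:Char.SpatialAxis.Coverage.Location.Coord"/>
  <FIELD name="flux"/>
</TABLE>
"""


def fields_matching(xml_text, fragment):
    """Return the names of FIELD elements whose @utype contains `fragment`.

    This is plain substring matching: the prefix before the colon is
    never inspected, and a FIELD with no @utype simply never matches.
    """
    root = ET.fromstring(xml_text)
    return [
        field.get("name")
        for field in root.iter("FIELD")
        if fragment in field.get("utype", "")
    ]


# Look for something that resembles a footprint, whatever its namespace.
print(fields_matching(SAMPLE, "Coverage.Support.Area"))
```

The match succeeds whether or not the application has ever heard of 'foo', which is both the attraction and the limitation: the code cannot say whether the match _ought_ to be treated as a footprint, only that the string looks like one.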

The other thing is...

> A data model namespace, at least within the DAL interfaces,
> corresponds to an object class...

> Usually a new class extends an existing one...

> ...but it extends an existing well known class

> In theory we could instead compose an object instance by instantiating
> external submodels like Char, Photometry, STC, Target, etc.

The discussion round UTYPEs is often in terms of object-oriented design (for example in the repeated implicit and explicit reference to UML).  But interface design is very different from data modelling: one is about the entry points to a library -- meaning predictability of behaviour -- and the other is about the static structure of data -- meaning predictability and interchangeability of meaning.

This is not angels-on-pins territory.  There is clearly some overlap between the two activities, since an interface implies something about the structure of the domain, and a data model suggests a natural way to get access to the data, but the two activities are very different.

A Java interface definition (object modelling) makes some guarantees about the input and output to functions, and what functions are available, and (as it turns out) can help robustness by keeping some bits of information inaccessible; but it has next to nothing to say about data or meaning.  An XML Schema definition (data modelling) is pretty obviously about shared meaning, but the only link with object-orientation is that both frameworks happen to use the word 'inheritance' (nothing I say should imply that I think that XSchema is a good data modelling language).  The fact that there are libraries which generate Java classes from XSchemas tends to blur the distinction, but (to me) just explains why these generated libraries are less use than one might expect.

In this light, the appearance of object-modelling terminology and techniques in the IVOA's _data_ modelling group is slightly puzzling.

----

But that's quite enough from me (more than enough, I'm sure I heard someone mutter...).  Have a good break, everyone who's taking one, and have a good New Year when it comes.

All the best,

Norman

-- 
Norman Gray  :  http://nxg.me.uk
SUPA School of Physics and Astronomy, University of Glasgow, UK