Followup to Re: utype questions

Thu Jul 2 15:27:43 PDT 2009

Hi Brian -

(Unless we are specifically discussing issues such as ontologies
or UCDs or usage in the DAL interfaces, I suggest we move the UTYPE
discussion back to the DM group).

The main point I think you are missing is that UTYPE is not a complete
stand-alone mechanism for modeling or storing complex data (nor is UCD,
VOTable, FITS, etc.).  It only addresses a portion of the problem.
So for example, there is nothing about UTYPE which is specific to
VOTable, or FITS, or a DBMS, or a Java class, or a param file, etc.,
hence we do not need to discuss all these applications in the UTYPE
specification.  Nor does UTYPE tell us how to model complex data.

UTYPE is just a simple mechanism for parameterizing a single data model
up to a certain level of complexity.  Beyond that point one needs more
complex mechanisms to associate related entities, containers to store
the data, container mechanisms to navigate through the stored data,
and so forth.  This is fine - powerful, flexible mechanisms result
from combining simpler elements which are well defined.

> ... As I will attempt to illustrate below, I believe some
> of these limitations are fatal to the purpose of the proposal and
> create (presently) an unworkable spec except for the very simplest
> cases of use.

Well since we have been using UTYPE successfully now for 3-5 years I
hardly think the situation is this serious.  Unless of course these
reinvigorated discussions destroy the simple, useful mechanism we
currently have.

> Please point out the savings of doing it in this manner over a simpler
> approach of importing/redefining elements in XML.

The point of a data model is to define the information content
separately from the implementation or representation.  XML is a
specific implementation technology and only one of many ways to
represent data.  It is primarily a communications format; UTYPE on the
other hand can be used directly in applications to access the content
of a data model, regardless of how it is serialized for transport
between applications.  Think in terms of an object with getters and
setters for the attributes.  The scientist/developer does something
like 'object.get("Target.Class")'.  UTYPE allows us to store the
information content of such an object instance in a runtime class,
in a VOTable, in a DBMS, in a parameter file, a hash table, etc.
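
As a minimal sketch of this idea, the Python below keeps one object's
content keyed by UTYPE strings and moves it through a parameter-file
representation and back.  The ModelInstance class, its method names,
the "utype = value" file layout, and the sample values are all
hypothetical, chosen only for illustration; none of them come from a
UTYPE specification or an existing library.

```python
# Hypothetical sketch: a data model instance addressed by UTYPE strings.
# Class name, method names, file layout and sample values are assumptions.

class ModelInstance:
    """Holds attribute values keyed by UTYPE, independent of serialization."""

    def __init__(self, values=None):
        self._values = dict(values or {})

    def get(self, utype):
        """Return the value stored under the given UTYPE string."""
        return self._values[utype]

    def set(self, utype, value):
        """Store a value under the given UTYPE string."""
        self._values[utype] = value

    def to_param_file(self):
        """Serialize as 'utype = value' lines, one per attribute."""
        return "\n".join(f"{k} = {v}" for k, v in sorted(self._values.items()))

    @classmethod
    def from_param_file(cls, text):
        """Rebuild an instance from 'utype = value' lines (values stay strings)."""
        values = {}
        for line in text.splitlines():
            if not line.strip():
                continue
            key, _, value = line.partition("=")
            values[key.strip()] = value.strip()
        return cls(values)

    def to_rows(self):
        """Flatten to (utype, value) rows, e.g. for a two-column DBMS table."""
        return sorted(self._values.items())


obj = ModelInstance()
obj.set("Target.Class", "star")   # illustrative values only
obj.set("Target.Name", "Vega")

# Round-trip through the parameter-file form; the access call is unchanged.
restored = ModelInstance.from_param_file(obj.to_param_file())
print(restored.get("Target.Class"))
```

The calling code only ever sees the UTYPE string; which container holds
the values is a separate choice.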

 	- Doug

On Thu, 2 Jul 2009, Brian Thomas wrote:

> Hi all,
>
> Well, I suppose I wasn't precise or clear enough in my last email to
> provide substantive criticism of the Utype proposal to which others
> might respond. In fact, re-reading it, it does look like I
> am only trolling for a fight, and this is not my intention (nor
> is it to insult the proposers, so please accept my apology
> if that is how my last email was taken!).
>
> I am, in fact, very much in support of the idea behind the Utype
> proposal that components of one data model may be shared/re-used
> within another. What I have issue with is the present description
> and the limitations which occur because of the choices the authors
> have made. As I will attempt to illustrate below, I believe some
> of these limitations are fatal to the purpose of the proposal and
> create (presently) an unworkable spec except for the very simplest
> cases of use.
>
> So. Allow me to re-try my criticism from another angle, which will
> hopefully be clearer to the rest of the community.
>
> To start with, from reading the documentation (reference [1],[4];
> with additional modifying possible proposals: [2] [3]) I understand the
> following main points about Utypes:
>
> A. Represent data model elements ([1], pg 1)
>
> B. May be serialized in a VOTable as param elements or fields of
> the table ([1], pg 5).
>
> C. Elements may be complex or simple, single values ([1] pg 10)
>
> D. The grouping mechanism in VOTable is to be used for multiple instances
> of the same model within a single file ([1] pg 10, [4] pg 16).
>
> E. Utype syntax ([1] pg 8) is meant to represent components of a model.
> It is composed of elements model-name, package-name, class-name and
> includes a final position for properties which may be one of
> attribute-name, collection or reference (using XML ID/IDREF mechanism,
> [1] pg 10). Some sort of chaining mechanism exists for naming Utypes
> which occur within other models ([1] bottom of pg 10 and illustrated
> in [2] on pg. 7)
>
> Now on to questions and problems which I see in the present document:
>
> 1. Where is the description of the ID/IDREF mechanism for the Utype
> syntax? Much depends on this working properly, including how easily
> the parser will be able to associate various parent model element
> instances with the correct child model instances, the ability
> to correctly order collected child model elements (important in
> some cases, will you infer this from the order of appearance of
> model elements in the file? If so, it's not mentioned), and just
> how this might be made to work within the context of a FITS file
> (or HTML table for that matter, as mentioned in [2]) is very unclear.
>
> 2. Why use the grouping mechanism in VOTable to handle multiple instances
> of a model? This means that the Utype spec is not a complete standalone
> specification. You are relying on an outside structure (grouping) to
> provide meaning to the reconstructed model (ie. which child nodes belong
> to which parents). If I wanted to use Utypes for FITS tables or HTML,
> then what do I replace the grouping mechanism with?
>
> 3. Why is the chaining mechanism important? If the grouping and
> ID/IDREF referencing is working, you don't need it (child will already
> know the parent).
>
> And now some criticism based on the above.
>
> 1. Without working examples for FITS/HTML which use referencing and
> grouping, I can only conclude that this proposal really only amounts
> to a modification of the VOTable specification. Utypes spec relies on
> outside structures to function, namely the grouping which is an element
> of VOTable, and ID/IDREF is an XML mechanism which doesn't exist in FITS.
>
> Please explain how Utypes are to be used with nesting/referencing in FITS and
> provide examples of the ID/IDREF mechanism at work.
>
> 2. Why isn't it reasonable to use an existing XML/XML-schema mechanism of
> import/redefine to allow 'foreign' model elements within the VOTable? This
> is well-tested, XML parsers grok it already, and it's standard practice.
> I assume that it must be related to some requirements which are not
> present in [1] (I see that in [2] the Utypes are supposed to allow the
> 'application to treat the Utypes as opaque strings'). All sorts of bad
> things flow from the present specification, here's a short list:
>
>  * You *must* modify the VOTable parser (to understand Utype attribute,
>    the grouping, and the special ID/IDREF mechanism of Utypes). No
>    allowance for passing over.
>
>  * Not possible to create a stand-alone library for grokking Utypes,
>    instead it will have to be tightly integrated with the parser (VOTable,
>    FITS, and whatever is reading your HTML tables, should that be
>    attempted as well).
>
> Please consider a different nesting/referencing mechanism from the present ones.
>
> 3. I wonder: if repeated objects must be sent, why use VOTable to serialize
> them? Why not develop a light-weight collection model in XML, and allow
> compression of the contents? The amount of application/semantics work
> looks to be about the same to me.
>
> Furthermore, unless serendipity reigns, you are going to require the re-
> factoring of various data models so that they conform to Utype standard. Are
> you suggesting this is mandatory for all VO accepted models or will there
> be a contributory list of kosher models which work with Utype? Again, this
> adds work to implementation of Utypes successfully.
>
> Even if not mandatory, some models will have to be re-factored (along with
> any existing software based on them), so some work will exist for this.
> Along with the work on the VOTable parser, it seems like significant effort
> will be needed to implement this proposal.
>
> Please point out the savings of doing it in this manner over a simpler
> approach of importing/redefining elements in XML.
>
> ---
>
> Well, that's enough for now (I have tried to stick purely with [1] and
> to main problem points, I have some support for and criticism of [2]
> and [3], but I don't want to run on).
>
> Hopefully I have been precise and clear. I have tried to cite the material
> as well as read it closely. It is possible that I missed something which
> has led to my misunderstanding and I am happy to be corrected on this.
>
> Again, this is written in the spirit of substantive criticism, not as flame
> bait.
>
> Regards,
>
> =brian
>
>
> References
> ----------
>
> [1] Utype: a data model field name convention, v0.3, IVOA Note draft,
>     May 24, 2009
>
> [2] Utype proposals, Norman Gray, 2009?,
>     http://nxg.me.uk/note/utype-proposals
>
> [3] Utypes and URIs, IVOA Note, v1.0, 2007, Norman Gray
>     http://www.ivoa.net/Documents/latest/utype-uri.html
>
> [4] Simple Spectral Access Protocol, v104,
>     http://www.ivoa.net/Documents/latest/SSA
>