vo-dml in votable

Fri Mar 6 17:11:48 CET 2015

Hi All,

I would like to elaborate on the ALTTYPE element and why it was included in
the design.

Let's say that one has a rather simple Source class like this:

Source
--------
name : string
position : SkyPosition

Let's assume SkyPosition is defined in some way in the "stc2" model.

In this simple example, the Source class would be defined in a standard
"src" DM and data providers would be free to extend it in their own models,
for example with a subclass like this:

SDSSSource           -----|> Source
---------------------
description : string

The following vodml-ids (we don't call them utypes anymore) are defined by
the "src" model:

src:Source (the Source type)

src:Source.name
src:Source.position (of type stc2:SkyPosition, defined by stc2)

The "sdss" model adds the following vodml-ids:

sdss:SDSSSource (the SDSSSource type)

sdss:SDSSSource.description

An SDSSSource instance, according to option 4a, would be serialized in
VOTable as follows:

<GROUP>
  <VODML type="sdss:SDSSSource"/>
  <PARAM ... value="M51">
    <VODML role="src:Source.name"/>
  </PARAM>
  <GROUP>
    <VODML role="src:Source.position" type="stc2:SkyPosition"/>
    .... Position serialization here ....
  </GROUP>
  <PARAM ... value="my favorite galaxy">
    <VODML role="sdss:SDSSSource.description"/>
  </PARAM>
</GROUP>

Notice that since SDSSSource extends Source, it inherits all its vodml-ids,
but it shadows the Source's type vodml-id. In the serialization, you don't
find src:Source, although the instance is indeed an instance of both
sdss:SDSSSource and src:Source.

How can a reader find the standard src:Source type in the above snippet?

One approach, somewhat Pythonic I would say, might be considered a form of
duck typing: as long as an instance has the attribute src:Source.position,
a reader might say, I am going to treat it as a src:Source instance. Notice
that this is a stronger version of duck typing (maybe type inference is a
closer concept in terms of programming languages): the uniqueness of the
vodml-ids implies that if an instance has an attribute with the vodml-id
src:Source.position then the instance *must* be a src:Source
instance.
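
A minimal Python sketch of this kind of inference, using only the standard
library. It assumes the option 4a layout shown earlier, drops the elided
"..." attributes, ignores XML namespaces, and assumes every role has the
form "<type vodml-id>.<attribute>" so that the owning type is whatever
precedes the last dot. The function name inferred_types is hypothetical.

```python
import xml.etree.ElementTree as ET

# Illustrative option-4a instance; attributes elided in the original are omitted.
SNIPPET = """
<GROUP>
  <VODML type="sdss:SDSSSource"/>
  <PARAM value="M51">
    <VODML role="src:Source.name"/>
  </PARAM>
  <GROUP>
    <VODML role="src:Source.position" type="stc2:SkyPosition"/>
  </GROUP>
  <PARAM value="my favorite galaxy">
    <VODML role="sdss:SDSSSource.description"/>
  </PARAM>
</GROUP>
"""


def inferred_types(group):
    """Return the types implied by the roles of the group's direct members."""
    types = set()
    for member in group:
        if member.tag not in ("PARAM", "GROUP"):
            continue
        vodml = member.find("VODML")
        role = vodml.get("role") if vodml is not None else None
        if role and "." in role:
            # Assumption: "src:Source.position" belongs to type "src:Source".
            types.add(role.rsplit(".", 1)[0])
    return types


group = ET.fromstring(SNIPPET)
print(sorted(inferred_types(group)))  # ['sdss:SDSSSource', 'src:Source']
```

The reader never sees src:Source spelled out as a type; it only deduces it
from the roles of the members.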

ALTTYPE (or any similar element) makes things explicit, so that the reader
does not need to infer the type by the presence of its attributes. In the
above examples one would have:

<GROUP>
  <VODML type="sdss:SDSSSource">
    <ALTTYPE>src:Source</ALTTYPE>
  </VODML>
  ....
</GROUP>

Now, a reader can simply look for src:Source and need not rely on any kind
of type inference. The problem with 4a and ALTTYPE is that readers need to
explicitly look for two different XML elements to get a single piece of
information.
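
To illustrate the two lookups, here is a rough Python sketch of a reader for
the 4a-plus-ALTTYPE layout. It is only an illustration under the same
assumptions as the snippet above (no XML namespaces, ALTTYPE as a child of
VODML); declared_types is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

SNIPPET = """
<GROUP>
  <VODML type="sdss:SDSSSource">
    <ALTTYPE>src:Source</ALTTYPE>
  </VODML>
</GROUP>
"""


def declared_types(group):
    """Collect the concrete type and any alternate types of a GROUP."""
    vodml = group.find("VODML")
    if vodml is None:
        return []
    types = []
    # First place to look: the type attribute.
    if vodml.get("type"):
        types.append(vodml.get("type"))
    # Second place to look: the ALTTYPE child elements.
    types.extend(alt.text.strip() for alt in vodml.findall("ALTTYPE") if alt.text)
    return types


group = ET.fromstring(SNIPPET)
print(declared_types(group))  # ['sdss:SDSSSource', 'src:Source']
```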

I would then prefer something like option 4b but with an open multiplicity
for the TYPE element, so one would have:

<GROUP>
  <VODML>
     <TYPE>sdss:SDSSSource</TYPE>
     <TYPE>src:Source</TYPE>
  </VODML>
....
</GROUP>

In the above snippet, readers only have to query the XML instance for
GROUPs with the src:Source TYPE in order to get all the instances they are
interested in. I would say that this is more "Java-like", as it is more
explicit and possibly redundant. Again, this is just an analogy: in both
cases, with or without the ability to specify multiple types, the instance
could be said to be "type safe", i.e. a GROUP containing elements with
src:Source.* ROLEs can only belong to a src:Source instance (given the
serialization is valid, i.e. there are no mistakes).
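
A sketch of such a query in Python, assuming the multi-TYPE layout above, no
XML namespaces, and a made-up enclosing RESOURCE with a second, unrelated
GROUP for contrast. The helper name groups_of_type is hypothetical.

```python
import xml.etree.ElementTree as ET

DOC = """
<RESOURCE>
  <GROUP>
    <VODML>
      <TYPE>sdss:SDSSSource</TYPE>
      <TYPE>src:Source</TYPE>
    </VODML>
  </GROUP>
  <GROUP>
    <VODML>
      <TYPE>stc2:SkyPosition</TYPE>
    </VODML>
  </GROUP>
</RESOURCE>
"""


def groups_of_type(root, vodml_id):
    """Return every GROUP that declares vodml_id among its TYPE elements."""
    return [
        g for g in root.iter("GROUP")
        if any((t.text or "").strip() == vodml_id for t in g.findall("VODML/TYPE"))
    ]


root = ET.fromstring(DOC)
print(len(groups_of_type(root, "src:Source")))  # 1
```

There is a single place to look, and subclass instances are found without
the reader knowing anything about the "sdss" model.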

To summarize, "alternate" type is probably a misleading term. What we are
talking about is close to the OO concept of polymorphism, where the same
object may be an instance of multiple types, even without multiple
inheritance support (VODML does not support multiple inheritance as this is
a rather tricky pattern to implement in the relational model, and that
would complicate the data provider's life). Explicitly annotating the
serialization with all the supertype IDs may make the reader's life easier
and less error-prone. However, from a strictly technical point of view,
readers have ways to find the information they seek even without the
supertype annotation.

Notice that in all of the above I am referring to "naive" clients, i.e.
readers that *do not* parse the VODML descriptions, but just assume
a priori knowledge of a data model and its utypes. Pardon me, vodml-ids.
Clients that do parse the VODML descriptions do not care that much, as they
can find all the information they need. In my opinion the only reason to
have the supertype annotation is to simplify the reader's life.

A corollary to what I am saying is that I prefer option 4b over 4a because
it allows us, at the very least, to keep the multiplicity of the TYPE element
open, which is something you cannot achieve with 4a.

What I don't like about 4b is that it brings forth all the verbosity of
XML. However, I will send a different message regarding my take on the
different options on the table.

There is another interesting use case where alternate types can greatly
simplify the life of the client implementor: abstract types. Let's say that
you have an abstract SkyError type that can be extended by concrete
EllipseError, CircleError, etc. If the client wants to instantiate the
right corresponding class in its code (let's say a plotting application
that needs to instantiate the correct graphic class for rendering the error
on the sphere), the explicit supertype annotation provides a simple handle:
readers can query for the abstract type's vodml-id and identify the XML
snippet that refers to the error, and then instantiate different classes
according to the concrete type. Without the "alternate type" readers would
need to probe all possible known concrete subtypes. Even worse, since the
subtype might itself be a custom subtype of a standard model, readers have
to fall back to type inference by blindly probing for the attributes of all
the possible known subtypes.

Say that I extend EllipseError and derive MyEllipseError. The XML snippet
would look like this:

<GROUP>
   <VODML>
     <ROLE>stc:SkyPosition.error</ROLE>
     <TYPE>mydm:MyEllipseError</TYPE>
   </VODML>
   <PARAM ... value="8.5">
     <VODML>
       <ROLE>stc:EllipseError.positionAngle</ROLE>
     </VODML>
   </PARAM>
....
</GROUP>

How does a client aware of the standard stc but not of my extension know
that it has to instantiate an EllipseError rather than a CircleError? It
would need to scan for attributes like stc:CircleError.radius and
stc:EllipseError.positionAngle. With the alternate types, though, one would
have:

<GROUP>
   <VODML>
     <ROLE>stc:SkyPosition.error</ROLE>
     <TYPE>mydm:MyEllipseError</TYPE>
     <TYPE>stc:EllipseError</TYPE>
     <TYPE>stc:SkyError</TYPE>
   </VODML>
   <PARAM ... value="8.5">
     <VODML>
       <ROLE>stc:EllipseError.positionAngle</ROLE>
     </VODML>
   </PARAM>
....
</GROUP>

With only two added lines, things are trivial for the reader now: just query
the XML for GROUPs with an stc:SkyError TYPE, and then instantiate the
correct object (e.g. an ellipse shape) according to the concrete type (e.g.
EllipseError).
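
As a rough Python sketch of that dispatch: the reader first checks for the
abstract stc:SkyError type, then picks a shape from the first declared type
it recognizes. The SHAPES table, the shape names and shape_for are all
hypothetical, namespaces are ignored, and the elided "..." attributes are
omitted.

```python
import xml.etree.ElementTree as ET

SNIPPET = """
<GROUP>
  <VODML>
    <ROLE>stc:SkyPosition.error</ROLE>
    <TYPE>mydm:MyEllipseError</TYPE>
    <TYPE>stc:EllipseError</TYPE>
    <TYPE>stc:SkyError</TYPE>
  </VODML>
  <PARAM value="8.5">
    <VODML>
      <ROLE>stc:EllipseError.positionAngle</ROLE>
    </VODML>
  </PARAM>
</GROUP>
"""

# Hypothetical mapping from standard concrete types to client-side shapes.
SHAPES = {"stc:EllipseError": "ellipse", "stc:CircleError": "circle"}


def shape_for(group):
    """Return the shape to draw for a sky error GROUP, or None if not applicable."""
    types = [(t.text or "").strip() for t in group.findall("VODML/TYPE")]
    if "stc:SkyError" not in types:
        return None  # not an error instance at all
    for vodml_id in types:
        if vodml_id in SHAPES:
            return SHAPES[vodml_id]
    return None  # a SkyError, but of a concrete type this client does not know


group = ET.fromstring(SNIPPET)
print(shape_for(group))  # ellipse
```

The client never needs to know about mydm:MyEllipseError; it skips the
unknown type and settles on the first standard one it can handle.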

Again, I am not arguing that without the supertype annotation things are
impossible for a client, just that with the supertype annotation things are
easier and do not require type inference.

Hope this helps,

Omar.

On Fri, Feb 20, 2015 at 5:26 PM, Mark Taylor <M.B.Taylor at bristol.ac.uk>
wrote:

> Gerard et al.
>
> following other people's comments and explanations, I'll give
> my thoughts on the VO-DML in VOTable business, though I don't
> claim any great expertise on it.
>
> Regarding the options 4a, 4b and 4c, the question seems to be
> fairly clearly about whether a VODML element should have only
> a single role and type (in which case 4a: <VODML type="t" role="r"/>
> makes sense) or whether there are cases where multiple roles/types
> are required (in which case 4c with ROLE, TYPE children or
> 4b with ROLE, TYPE, ALTTYPE children of VODML).
>
> I don't feel I understand the role/type business, or the expected
> applications of VO-DML, deeply enough to have a firm opinion on
> this question.  I'll note that Markus mentioned polymorphism
> and Gerard mentioned [java] member classnames; both of those
> examples/illustrations make me think about the requirement one
> often has in OO programming to know multiple different
> types/behaviours associated with an object (in java, the hierarchy
> of superclasses and whatever different interfaces implemented)
> not just a single "type".  That leads me to think that multiple
> declared TYPEs might be necessary for some purposes.
> (That wouldn't necessarily mean there's the expectation for
> data providers to always document all types in this way).
> However, perhaps I'm pushing the analogy too far.  Either
> way ALTTYPE seems unattractive.  So I would cautiously back 4c,
> but if it's clear to those who understand this stuff better than
> I do that one ROLE and TYPE per VODML element is always sufficient
> then in the interests of enforcing that restriction, and of tidiness,
> 4a would be better.
>
> Another option would be to use 4a but define the type attribute
> in such a way that it's possible to put multiple type values in it.
> The existing *4a.xsd declares the VODML/@type attribute with
> content type xs:string; by convention you could use a space as
> a separator as long as type values never have spaces (might they?).
> If you know you want multiple type values it would be better to
> have multiple TYPE child elements or if the syntax fits to define
> @type with content type=NMTOKENS.  But as long as it's syntactically
> possible to put multiple types in an attribute it would leave
> an escape clause if it turned out later we'd said 4a when we
> really need multiple TYPEs.
>
> Regarding the disruption of incorporating a new element into
> the VOTable document: it's a pain to have to come up with a new
> version of the VOTable standard, but I doubt that any of these
> options will cause serious compatibility problems in practice
> to VOTable parsers - at least it won't for STIL (hence topcat
> and stilts etc).  If anybody is using parsers rigidly tied to
> the XSD then obviously there might be noticeable upgrade issues;
> I don't know if that applies to any of the VOTable parsers in use.
> From this point of view I can't imagine any of the options 4a/4b/4c
> would be significantly better or worse than the others.
>
> As long as we do it right (rather than e.g. starting off with 4a
> in VOTable 1.4 and then needing VOTable 1.5 later to accommodate 4c)
> I think it will be less disruptive to keep this all in the VOTable
> document than to introduce a VODML placeholder in VOTable and point
> to another document.
>
> It's still not clear to me how far this is going to affect my
> day job, but as far as it generates markup of the kind that
> I would want to consume in stil/stilts/topcat I will try to
> cooperate (e.g. with changes to my client code) with prototypers
> or data providers producing VOTables implementing this stuff
> in the interests of testing etc.
>
> Mark
>
>
> On Wed, 11 Feb 2015, Gerard Lemson wrote:
>
> > Dear all
> > It would be nice if we could quickly agree how to annotate VOTable elements
> > with metadata pointing to VO-DML data models.
> > In the votable/vo-dml session in Banff we came to a decision to add a new
> > element to VOTable that would represent this mapping and would take the
> > place of the utype attributes we had assumed to use for that in the current
> > mapping document in
> > https://volute.googlecode.com/svn/trunk/projects/dm/vo-dml/doc/MappingDMtoVOTable-v1.0.docx
> > .
> > The detailed design of this element was still left open and I sent out an
> > email shortly after the interop in which I made some suggestions, which I
> > thought I had adequately illustrated with examples of the possible XML
> > schema snippets. Maybe this was not clear, so I have updated the contents of
> > https://volute.googlecode.com/svn/trunk/projects/dm/vo-dml/doc/samples/votable
> > It now contains, for three possible designs labeled 4a, 4b and 4c, full
> > VOTable schemas and three corresponding versions of a VOTable following
> > these schemas and showing the different ways to annotate the same tables.
> >
> > E.g.
> >
> > https://volute.googlecode.com/svn/trunk/projects/dm/vo-dml/doc/samples/votable/VOTable-1.3_vodml4a.xsd
> > is a possible new schema and
> >
> > https://volute.googlecode.com/svn/trunk/projects/dm/vo-dml/doc/samples/votable/VOTable_Prop4a.xml
> > an annotated VOTable following that design.
> >
> > Note that in the three different proposals only the *content* of the
> > <VODML> elements, as described by the structure of the VODMLAnnotaiton
> > complexType definition, is different. Their association to PARAM, GROUP,
> > PARAMref and FIELDref elements is the same.
> >
> > Please let me know if this is still not clear or comment on the design
> > possibilities.
> >
> > Cheers
> > Gerard
> >
> >
> >
> > On Wed, Oct 15, 2014 at 5:47 PM, Gerard Lemson <gerard.lemson at gmail.com>
> > wrote:
> >
> > > Dear all
> > >
> > > During the VO-DML-in-VOTable discussion at the last interop we decided to
> > > go with (something like) proposal 4 on the wiki at
> > >
> > > https://volute.googlecode.com/svn/trunk/projects/dm/vo-dml/doc/samples/votable/VOTable_Prop4.xml
> > > Other options can be found in the same folder.
> > > What we now need is to finalize the design for the XML tag(s) to add to
> > > VOTable.
> > > I have created a mockup of an XML schema with a complexType definition
> > > reflecting the new element.
> > > The design used in proposal 4 above is represented by proposal 4a in the
> > > schema. It has attributes for role and actual type, and elements for
> > > alternative types. Proposals 4b and 4c vary on this theme. All use a
> > > simpleType definition reflecting the model-prefix+':'+vodml-id pattern used
> > > to refer to VO-DML elements. See the mapping document for more detailed
> > > descriptions (taking into account the fact that that document will have to
> > > be updated with the choice to be made here).
> > >
> > > Note, I am not making statements on whether these two types must be in the
> > > VOTable targetnamespace. Andreas had some thoughts on this that I hope he
> > > will send to the list in reply.
> > >
> > > And obviously the final design may differ from these proposals if necessary,
> > > but let's please converge on an acceptable design soon so Omar and I can
> > > update the mapping document.
> > >
> > > Cheers
> > > Gerard
> > >
> > >
> >
>
> --
> Mark Taylor   Astronomical Programmer   Physics, Bristol University, UK
> m.b.taylor at bris.ac.uk +44-117-9288776  http://www.star.bris.ac.uk/~mbt/
>

-- 
Omar Laurino
Smithsonian Astrophysical Observatory
Harvard-Smithsonian Center for Astrophysics
100 Acorn Park Dr. R-377 MS-81
02140 Cambridge, MA
(617) 495-7227