datatypes (affects all 3 WDs to some extent)

Tue Apr 1 00:43:56 PDT 2014

Hi everyone,

On Mon, Mar 31, 2014 at 04:16:02PM +0100, Mark Taylor wrote:
> > Ideally, we would have a document that defines datatypes and serialised
> > values. Then ADQL-2.x would refer to that document, VOTable-1.x (x>3) would
> > need to support serialisation of values of those datatypes. TAP-1.x would not
> > have to say as much about datatypes as it does now, so it would get stripped
> > down a bit. Other DAL and DM documents could use the datatypes by reference to
> > a definitive document.
> 
> I'm not sure exactly what you have in mind for the future extensions to
> VOTable here - your use of the term "datatype" suggests that you are
> proposing new primitive values of the datatype attribute alongside

No -- while I cannot speak for Pat, that is not what I have in mind.

The idea is fairly analogous to common programming languages: there
are some primitive types (which are what REC-VOTable defines), from
which programmers build their own types; in C, say:

typedef struct {
  double lat, lon;
} Point;

typedef struct {
  Point *center;
  double radius;
} Circle;

So, VOTable wouldn't need to change, much as C doesn't change by the
above definitions.

The good news is that we don't have to invent a way to note down
these type definitions -- VO-DML, while it still needs some work to
smooth out the rough edges, is exactly about these definitions.

It also contains rules on how to express this in VOTable, and these
are written in such a way that annotations can *later* be added
transparently, meaning we can write now

<PARAM name="LAMBDA_MIN" ID="LL"/>
<PARAM name="LAMBDA_MAX" ID="LT"/>

and have clean and (conceptually) simple interfaces.

Once we decide we need clients to actually understand interfaces and
manipulate complex objects, all that's needed is the inclusion of
groups like

<GROUP utype="Interval">
  <PARAMref ref="LL" utype="lowerBound"/>
  <PARAMref ref="LT" utype="upperBound"/>
</GROUP>

(*addition* meaning naive clients would continue to work, as the
remaining document structure hasn't changed) -- and anything with a
sufficiently capable VOTable library could immediately manipulate
interval objects (or circles, or points, or things with reference
systems, or whatever) within both the service and the client without
any custom code, while keeping the stuff on the wire clean.
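
To make the "addition" point a bit more concrete, here is a minimal
sketch in Python (standard library only). It is not taken from any
existing VOTable library. The RESOURCE wrapper, the datatype and value
attributes, the numbers, and the names Interval, params_by_name and
intervals are all assumptions made up for the illustration. The
fragment is un-namespaced and uses VOTable's spellings ID and PARAMref.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

# Hypothetical, un-namespaced fragment modelled on the snippets in this mail.
# The datatype/value attributes and the numbers are illustrative only.
DOC = """
<RESOURCE>
  <PARAM name="LAMBDA_MIN" ID="LL" datatype="double" value="3.5e-7"/>
  <PARAM name="LAMBDA_MAX" ID="LT" datatype="double" value="7.0e-7"/>
  <GROUP utype="Interval">
    <PARAMref ref="LL" utype="lowerBound"/>
    <PARAMref ref="LT" utype="upperBound"/>
  </GROUP>
</RESOURCE>
"""


@dataclass
class Interval:
    """Assumed client-side type; not defined by VOTable itself."""
    lower: float
    upper: float


def params_by_name(root):
    """Naive client: read PARAMs by name and ignore any GROUP."""
    return {p.get("name"): float(p.get("value")) for p in root.iter("PARAM")}


def intervals(root):
    """Annotation-aware client: follow PARAMref links in Interval groups."""
    by_id = {p.get("ID"): p for p in root.iter("PARAM")}
    found = []
    for group in root.iter("GROUP"):
        if group.get("utype") != "Interval":
            continue
        # Map each role (the PARAMref's utype) to the referenced PARAM's value.
        roles = {
            ref.get("utype"): float(by_id[ref.get("ref")].get("value"))
            for ref in group.iter("PARAMref")
        }
        found.append(Interval(roles["lowerBound"], roles["upperBound"]))
    return found


root = ET.fromstring(DOC)
print(params_by_name(root))  # works with or without the GROUP
print(intervals(root))       # empty list if the GROUP is absent
```

Deleting the GROUP element from the fragment leaves params_by_name
unaffected, which is the sense in which the annotation is purely
additive.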

> "boolean", "int", "float", "double" et al.  I would personally be
> unenthusiastic about that for the two related reasons:
> 
>   (a) the other datatypes are primitive values, and something
>       like interval doesn't seem at the same storage level, and
> 
>   (b) it means that every time we need a new datatype it means a
>       revision of VOTable (and updates of generic VOTable parsing
>       software libraries etc)
> 
> My view of VOTable is infrastructure that should be capable of
> storing syntactically or semantically complex content without itself
> being syntactically or semantically complex.

100% agreed.

> *If* we are going to encode more or less complex data types in
> VOTable I think that the xtype hack is the right way to do it.

While I grant you that it would be good to understand what xtype
actually is about, and I do see a role for it in "weird" quasi-atomic
cases (date, in particular), I think for a general type system we are
well advised to do what the programming language community has been
doing at least since the introduction of ALGOL and allow explicit
type declarations rather than daring combinations of strings as
values and a single other string (the xtype) as annotation.

> future versions of VOTable.  If you (Pat/Markus/list) think there
> is or may be something here which should get addressed in a future
> version of the VOTable standard, I encourage you to add a section
> (though by all means wait until it's clear from list discussions
> what it should contain).

Currently, the VO-DML instance serialization doesn't require any
changes to VOTable.  Within the utypes tiger team, Gerard hinted that
another attribute might allow a somewhat more explicit annotation
(basically, annotating both the role and the type of an entity,
IIRC), but as I understand it, that would go under "nice to have"
rather than being essential for the mechanism to work.  Gerard?

Cheers,

          Markus