ivoa: DM type system

Markus Demleitner msdemlei at ari.uni-heidelberg.de
Wed Apr 19 09:56:38 CEST 2017


Hi Gerard,

On Thu, Apr 13, 2017 at 12:19:48PM -0400, Gerard Lemson wrote:
> Details to be worked out, but we may not be able to keep the separation
> between a VOTable and separate vo-dml-mapping schema.

TL;DR: My feeling has always been that that's not a worthy goal.

> > Markus Demleitner, Thursday, April 13, 2017 5:18 AM
> > But you're right, my primary concern here is LITERAL within VOTable,
> > and thus a mapping issue.
> 
> And especially a serialization issue I think. For indeed VO-DML says
> NOTHING about how to serialize instances.

So, what's the intended use of dmtype in LITERAL then?  In the end, a
VOTable library needs to come up with an implementation-specific
representation of

  <LITERAL dmtype="real">23.49</LITERAL>

For that, it needs *something* (VO-DML?  The mapping spec?  VOTable?
Yet something else?) to say how reals, points, rationals, timestamps,
or whatever are written as children of LITERAL.
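
To make the question concrete, here is a minimal sketch of what a
library would have to do with such an element.  The dmtype-to-parser
table is purely hypothetical; none of the documents mentioned in this
thread is assumed to define it, and filling it in is exactly the open
problem.

```python
import xml.etree.ElementTree as ET

# Hypothetical dmtype -> parser table.  Which document would define
# these literal forms is the open question; the entries are assumptions.
LITERAL_PARSERS = {
    "real": float,
    "integer": int,
    "string": str,
}

def parse_literal(xml_text):
    """Turn a serialised LITERAL element into a native Python value."""
    elem = ET.fromstring(xml_text)
    dmtype = elem.get("dmtype")
    try:
        parser = LITERAL_PARSERS[dmtype]
    except KeyError:
        # Without a written rule for this dmtype the library can only give up.
        raise ValueError("no serialisation rule known for dmtype %r" % dmtype)
    return parser(elem.text)

value = parse_literal('<LITERAL dmtype="real">23.49</LITERAL>')
```

Every type added to a data model would need another entry in that
table, and somebody would have to write down what goes into it.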

This problem surfaces exactly because (at least) with LITERAL, the
new mapping proposal bleeds VO-DML into the serialisation.  A mild
taste of the resulting horrors we can witness in a parallel fork of
this thread.  Please, let's not *again* start a discussion about
whether we want timezones in an "ISO" date string, whether the Z is
mandatory, or whether the T can be a blank.  That just happened a
year ago over in DAL.

We have a set of serialisations in VOTable.  It may not be pretty,
but it's there, and building another, perhaps prettier one, will not
make it go away.  And it's much better to have to implement just one
mildly ugly thing than to have to implement a mildly ugly *and* a
mildly pretty thing *and* then map between the two.

> > in DALI, sect. 3.3.3, for these.
> 
> >
> Does this mean you're  favoring passing such choices off to the protocols
> then?

I personally would have preferred to have all VOTable serialisation
aspects treated in the VOTable document itself.  But this is not an
ideal world, and so in practice, part of DALI now specifies a
universal VOTable extension.  

At least there's been an informal agreement that we'd keep xtype
definitions in DALI, so here's to hoping that reading
VOTable+DALI+the mapping document will be enough to write a general
VOTable parser in the future.

So, no, DALI exists exactly to keep this kind of thing outside of
concrete protocols.  Think of that part as an exclave of the VOTable
spec.

> But if this is allowed, could this not extend to the VO-DML literal
> serializations as well?

That would automatically happen if you used PARAMs instead of
LITERALs.  Which is essentially what I'm proposing.
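
As a rough sketch of why this comes for free: a library that already
maps PARAM values by datatype and xtype needs no second code path.
The parser table, the PARAM name, and the restriction to a single
timestamp form below are assumptions made for illustration, not
anything VOTable or DALI prescribes in this shape.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

def parse_timestamp(text):
    # Assumption: only the plain YYYY-MM-DDThh:mm:ss form is handled;
    # which forms are legal is for DALI to say, not for this sketch.
    return datetime.strptime(text, "%Y-%m-%dT%H:%M:%S")

# Hypothetical (datatype, xtype) -> parser table of a VOTable library.
PARAM_PARSERS = {
    ("double", None): float,
    ("char", None): str,
    ("char", "timestamp"): parse_timestamp,
}

def param_value(elem):
    """Map a PARAM's value attribute to a native object."""
    key = (elem.get("datatype"), elem.get("xtype"))
    return PARAM_PARSERS[key](elem.get("value"))

# The same function serves a PARAM in the main VOTable body and a PARAM
# sitting inside the annotation; "obs_start" is a made-up name.
param = ET.fromstring(
    '<PARAM name="obs_start" datatype="char" arraysize="*" '
    'xtype="timestamp" value="2017-04-19T09:56:38"/>')
obs_start = param_value(param)
```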

> > But if the price for this is that people, within one VOTable, have to
> > recognise timestamps in LITERALs by seeing it's
> > vodml-type="ivoa:datetime" and using one literal parser, while having
> > to check xtype and use a different literal parser when it's in PARAM
> > makes me cringe.
> 
> > Plus, it won't stop with datetimes.
>
> Sure. So are you arguing we should rely on xtype as a kind of free-for-all
> label plus a protocol-based serialization prescription? For datatype, even
> with arraysize will in general not be sufficient.

No, I argue that when you introduce types with a serialisation that
libraries are expected to understand and map to language-native
objects, you have to change either VOTable itself or, sigh, DALI.

And that such a thing certainly cannot happen in a data model, and
preferably not in a mapping document, either.

> > > Anyway, note also that VOTable mappers will have the option of using
> > > a CONSTANT (i.e. the VO-DML-mapping equivalent of a PARAMref), i.e.
> > > create a VOTable-typed PARAM somewhere and refer to it. I would
> > > definitely NOT propose this as a solution though.
> 
> > Since you mention it: Why not?  Sure, an extra reference is involved,
> > which is always bad, but as far as I can see there's nothing you can
> > do with LITERAL that you can't do with CONSTANT, and one feature less is
> > always a big win.
> 
> You'd have to always add PARAMs outside the VODML spec to be linked to from
> within.
> 
> So in spec it may be one element less, but in instances it'd be one extra
> element for every literal value.

Me, I'm a bit torn on the CONSTANT (i.e., PARAM+reference in VODML)
vs. LITERAL (i.e., PARAM in VODML) question.

On the one hand, I really like the idea of having VO-DML be
annotation exclusively; so, the "main" part of the VOTable would
contain all the data and metadata, and the VODML part just pointers
there explaining to a computer how all these elements fit together.
The advantage would be that when the VODML part gets lost, all
information would still be there, just not machine-interpretable any
more.

On the other hand, having to write and parse both a reference *and*
a PARAM for every string is incredibly ugly and verbose.
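
A small sketch of the difference from a parser's point of view.  The
element layout, the "ref" attribute on CONSTANT, and the ID used are
assumptions for illustration only; the mapping draft may well spell
these differently.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: the same string once by reference (CONSTANT)
# and once inline (LITERAL).  Attribute names are assumptions.
DOC = """
<RESOURCE>
  <VODML>
    <CONSTANT ref="frame_name"/>
    <LITERAL dmtype="string">ICRS</LITERAL>
  </VODML>
  <PARAM ID="frame_name" name="frame" datatype="char" arraysize="*"
         value="ICRS"/>
</RESOURCE>
"""

root = ET.fromstring(DOC)

def resolve_constant(root, constant):
    """Follow a CONSTANT's reference to the PARAM carrying the value."""
    target_id = constant.get("ref")
    for param in root.iter("PARAM"):
        if param.get("ID") == target_id:
            return param.get("value")
    raise KeyError(target_id)

# Two elements and a lookup for the reference, one element for the literal.
via_constant = resolve_constant(root, root.find("VODML/CONSTANT"))
via_literal = root.find("VODML/LITERAL").text
```

With the reference, the value survives in the PARAM even if the VODML
block is stripped; with the literal, it goes away together with the
annotation.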

Be that as it may, let me conclude with an ardent plea against the
separation of VO-DML annotation from "core" VOTable.

From my perspective, the only reason we're doing VO-DML is to
formally define *annotation*.  What we annotate is, in a first step,
VOTable.  Any attempt to try to hide this fact and make VO-DML
annotation somehow detached from the eventual purpose of the exercise
is only going to complicate matters.

One *might* perhaps plan for technical trouble, such as annotation
getting lost, but other than that, VO-DML annotation is and must be
an integral part of VOTable, and it must be specified in the VOTable
document.  If that doesn't happen, everyone will keep thinking it's
somehow optional to declare frames and epochs with positions, and to
group values and errors.  Which it isn't, if we want computers to be
able to work meaningfully with VO data now that we're leaving the paradise
of "everything relevant is close enough to epoch J2000 in ICRS".

Hence, in my view there's no real reason to shun PARAMs or other
"classic" VOTable elements in VODML annotation, but rather a strong
reason not to.

I'd much prefer, too, if PARAMrefs and FIELDrefs could make a return
there.  Later implementors will not give a damn about the history of
the annotation.  They'll just curse us for having multiple elements
(PARAMref and CONSTANT, FIELDref and COLUMN) that do the same thing,
just in different places.

         -- Markus

