ivoa: DM type system

Wed Apr 19 23:02:37 CEST 2017

Hi Markus

On Wed, Apr 19, 2017 at 3:56 AM, Markus Demleitner <
msdemlei at ari.uni-heidelberg.de> wrote:

> Hi Gerard,
>
> On Thu, Apr 13, 2017 at 12:19:48PM -0400, Gerard Lemson wrote:
> > Details to be worked out, but we may not be able to keep the separation
> > between a VOTable and separate vo-dml-mapping schema.
>
> TL;DR: My feeling has always been that that's not a worthy goal.
>
Fine with me and probably necessary regarding the most obvious tweak that
should make you more content.

>
> > > Markus Demleitner, Thursday, April 13, 2017 5:18 AM
> > > But you're right, my primary concern here is LITERAL within VOTable,
> > > and thus a mapping issue.
> >
> > And especially a serialization issue I think. For indeed VO-DML says
> > NOTHING about how to serialize instances.
>
> So, what's the intended use of dmtype in LITERAL then?  In the end, a
> VOTable library needs to come up with an implementation-specific
> representation of
>
In the new annotation syntax, the value of a Role (ATTRIBUTE, REFERENCE or
COMPOSITION) has a "VODMLInstance", or possibly more than one.
VODMLInstance is the base type of specialized things like COLUMN, CONSTANT
and also LITERAL. 'dmtype' is defined on VODMLInstance and therefore
inherited by all. Sometimes that is for the good, namely when explicitly
casting a value to a type different from the declared datatype of the role.
Otherwise it is redundant IF one has knowledge of the model. So IF we made
dmtype optional on VODMLInstance, and in the mapping spec mandated that it
MUST be used only when explicitly casting, then there would be less room
for confusion.
That is, IF we add VOTable datatypes to LITERAL. I think that is probably
the best solution. We don't need it to be a full PARAM, but datatype,
arraysize and unit may have to be added. Not utype or ucd, and is
xtype necessary here?
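
To make this concrete, here is a rough sketch of how a client might read
such a LITERAL. It is only an illustration under assumptions: the
datatype, arraysize and unit attributes on LITERAL are the proposal above
and not part of any current schema, the function name parse_literal is made
up, and only a small subset of VOTable datatypes is handled.

```python
import xml.etree.ElementTree as ET

# Hypothetical: a LITERAL carrying VOTable-style attributes as proposed above.
SNIPPET = '<LITERAL datatype="double" unit="deg">23.49</LITERAL>'

# Small, illustrative subset of VOTable datatypes mapped to Python types.
PARSERS = {
    "double": float,
    "float": float,
    "short": int,
    "int": int,
    "long": int,
    "char": str,
}


def parse_literal(xml_text):
    """Return (value, unit, dmtype) for a LITERAL with a VOTable datatype."""
    elem = ET.fromstring(xml_text)
    datatype = elem.get("datatype")
    if datatype not in PARSERS:
        raise ValueError("unsupported or missing datatype: %r" % datatype)
    text = (elem.text or "").strip()
    convert = PARSERS[datatype]
    if datatype == "char" or elem.get("arraysize") is None:
        # Scalars, and char arrays (which are simply strings).
        value = convert(text)
    else:
        # Numeric arrays: whitespace-separated items, as in TABLEDATA.
        value = [convert(token) for token in text.split()]
    # dmtype is optional here; None means "use the role's declared type".
    return value, elem.get("unit"), elem.get("dmtype")


print(parse_literal(SNIPPET))
```

The point is that the value can be parsed from the VOTable attributes
alone, without consulting the data model.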

>   <LITERAL dmtype="real">23.49</LITERAL>
>
> For that, it needs to know how *something* (VO-DML?  The mapping
> spec?  VOTable?  Yet something else?) says that reals, points,
> rationals, timestamps, or whatever are written as children of LITERAL.

> This problem surfaces exactly because (at least) with LITERAL, the
> new mapping proposal bleeds VO-DML into the serialisation.  A mild
> taste of the resulting horrors we can witness in a parallel fork of
> this thread.  Please, let's not *again* start a discussion if we want
> timezones in an "ISO" date string, if the Z is mandatory or if the T
> can be a blank.  That's just happened a year ago over in DAL.
>
Are you saying that ALL timestamp serializations in the VO are to follow
the DALI spec?
Does that have repercussions for STC etc.?
In any case, I do not think the serialization string is the main issue.

> We have a set of serialisations in VOTable.  It may not be pretty,
> but it's there, and building another, perhaps prettier one, will not
> make it go away.  And it's much better to have to implement just one
> mildly ugly thing than to have to implement a mildly ugly *and* a
> mildly pretty thing *and* then map between the two.
>
> > > in DALI, sect. 3.3.3, for these.
> >
> > >
> > Does this mean you're  favoring passing such choices off to the protocols
> > then?
>
> I personally would have preferred to have all VOTable serialisation
> aspects treated in the VOTable document itself.  But this is not an
> ideal world, and so in practice, part of DALI now specifies a
> universal VOTable extension.
>
> At least there's been an informal agreement that we'd keep xtype
> definitions in DALI, so here's to hoping that reading
> VOTable+DALI+the mapping document will be enough to write a general
> VOTable parser in the future.
>
> So, no, DALI exists exactly to keep this kind of thing outside of
> concrete protocols.  Think of that part as an exclave of the VOTable
> spec.
>
> > But if this is allowed, could this not extend to the VO-DML literal
> > serializations as well?
>
> That would automatically happen if you used PARAMs instead of
> LITERALs.  Which is essentially what I'm proposing.
>

> > > But if the price for this is that people, within one VOTable, have to
> > > recognise timestamps in LITERALs by seeing it's
> > > vodml-type="ivoa:datetime" and using one literal parser, while having
> > > to check xtype and use a different literal parser when it's in PARAM
> > > makes me cringe.
> >
> > > Plus, it won't stop with datetimes.
> >
> > Sure. So are you arguing we should rely on xtype as a kind of
> > free-for-all label plus a protocol-based serialization prescription?
> > For datatype, even with arraysize, will in general not be sufficient.
>
> No, I argue that when you introduce types with a serialisation
> libraries are expected to understand and map to language-native
> objects, you have to change either VOTable itself or, sigh, DALI.
>
> And that such a thing certainly cannot happen in a data model, and
> preferably not in a mapping document, either.
>
> > > > Anyway, note also that VOTable mappers will have the option of using
> > > > a CONSTANT (i.e. the VO-DML-mapping equivalent of a PARAMref), i.e.
> > > > create a VOTable-typed PARAM somewhere and refer to it. I would
> > > > definitely NOT propose this as a solution though.
> >
> > > Since you mention it: Why not?  Sure, an extra reference is involved,
> > > which is always bad, but as far as I can see there's nothing you can
> > > do with LITERAL that you can't do with CONSTANT, and one feature less
> > > is always a big win.
> >
> > You'd have to always add PARAMs outside the VODML spec to be linked to
> > from within.
> >
> > So in spec it may be one element less, but in instances it'd be one extra
> > element for every literal value.
>
> Me, I'm torn in that point a bit in the CONSTANT (i.e.,
> PARAM+reference in VODML) vs. LITERAL (i.e., PARAM in VODML) question.
>
> On the one hand, I really like the idea of having VO-DML be
> annotation exclusively; so, the "main" part of the VOTable would
> contain all the data and metadata, and the VODML part just pointers
> there explaining to a computer how all these elements fit together.
> The advantage would be that when the VODML part gets lost, all
> information would still be there, just not machine-interpretable any
> more.
>
> On the other hand, having to write and parse both a reference *and*
> a PARAM for every string is incredibly ugly and verbose.
>
> Be that as it may, let me conclude with an ardent plea against the
> separation of VO-DML annotation from "core" VOTable.
>
> From my perspective, the only reason we're doing VO-DML is to
> formally define *annotation*.  What we annotate is, in a first step,
> VOTable.  Any attempt to try to hide this fact and make VO-DML
> annotation somehow detached from the eventual purpose of the exercise
> is only going to complicate matters.
>
I agree. Though imho annotating TAP_SCHEMA is a much more interesting use
case.
That can also still use VOTable though, as soon as we have a simple way to
represent TAP_SCHEMA therein.

> One *might* perhaps plan for technical trouble, such as annotation
> getting lost, but other than that, VO-DML annotation is and must be
> an integral part of VOTable, and it must be specified in the VOTable
> document.  If that doesn't happen, everyone will keep thinking it's
> somehow optional to declare frames and epochs with positions, and to
> group values and errors.  Which it isn't, if we want computers to be
> able to meaningfully work with VO data now that we're leaving the paradise
> of "everything relevant is close enough to epoch J2000 in ICRS".
>
> Hence, there's in my view no real reason to but rather a strong
> reason to not shun PARAMs or other "classic" VOTable elements in
> VODML annotation.
>

> I'd much prefer, too, if PARAMrefs and FIELDrefs could make a return
> there.  Later implementors will not give a damn about the history of
> the annotation.  They'll just curse us for having multiple elements
> (PARAMref and CONSTANT, FIELDref and COLUMN) that do the same thing,
> just in different places.
>
I am actually against reusing these VOTable types in the VODML annotation
as their content is quite different.
Wrt CONSTANT and PARAMref, and COLUMN and FIELDref, apart from dmtype, the
VODML mapping versions inherit OPTIONMAPPING from VODMLPrimitive. The
VOTable elements have ucd and utype, which explicitly have no place in a
VODML annotation.
PARAM carries along lots of elements we do not need, but which are required
by the VOTable spec. Hence LITERAL (with few VOTable attributes) is
sufficient.

I still think that anyone who starts coding against the VODML annotation
part of the VOTable will write completely new code and should not be scared
that there are new names on elements that may superficially seem similar.

I think one might reverse the argument, expecting (well, hoping?) that
future implementors will not have to deal with GROUPs and their paramrefs
and fieldrefs anymore, because VODML will replace their use.

So, so far my conclusion is that
1) We SHOULD add VOTable features identifying the datatype of LITERALs in a
VODML-independent manner. The consequence of this is that we likely have to
add the VODML annotation elements inside the VOTable schema.
2) We MAY wish to make the dmtype attribute optional on all instances.
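
For point 2, the rule a client would apply could be as simple as the sketch
below. This is only an assumption about how it might work: the lookup table
DECLARED_TYPES, the role and type identifiers in it, and the function name
effective_dmtype are all invented for illustration and are not taken from
any model or spec.

```python
# Hypothetical stand-in for knowledge of the model: the declared datatype
# of each role, keyed by an invented vodml-id.
DECLARED_TYPES = {
    "mymodel:Source.magnitude": "ivoa:real",
}


def effective_dmtype(role_id, instance_dmtype=None):
    """Pick the type of an instance when dmtype is optional.

    An explicit dmtype on the instance is read as a cast; without it, the
    declared datatype of the role applies.
    """
    declared = DECLARED_TYPES[role_id]
    if instance_dmtype is None:
        return declared
    return instance_dmtype


# No dmtype on the instance: fall back to the declared type of the role.
print(effective_dmtype("mymodel:Source.magnitude"))
# Explicit dmtype: the instance is cast to a different (invented) type.
print(effective_dmtype("mymodel:Source.magnitude", "mymodel:CalibratedReal"))
```

A client without knowledge of the model would have no such table, which is
why point 1 matters for LITERALs.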

Cheers
Gerard

           -- Markus