DataLinks and ID data types.

Markus Demleitner msdemlei at ari.uni-heidelberg.de
Thu Apr 2 09:24:30 CEST 2020


Hi Tom,

On Wed, Apr 01, 2020 at 01:53:27PM +0000, Mcglynn, Thomas A. (GSFC-6601) wrote:
> One of the ways to use DataLink is to have a column X in a primary
> (results) RESOURCE of a VOTable which provides a DataLink ID to be
> used in a subsequent Resource Descriptor.  At the HEASARC we've
> recently had some discussion about what, if any, restrictions there
> are on the types of these.
> 
[...]
> Two questions have arisen in our usage at the HEASARC:
> 
> 1. Do both the FIELD and the PARAM need to be character arrays, or
> can they be ints, longs, ...?

If I had a time machine, I'd go back to the Datalink 1.0 RFC and put
in type information.  As things are, it seems the only concrete
restriction is that content_length must be a long.

That's unfortunate, not least because if we allow IDs to have
different types, then we also have to admit these types in the ID
column of the links table, and I'd rather not have that polymorphic.

Given the types are unspecified, what do we do?  I'd hope we could
still fiddle in types without *materially* breaking the spec (in the
sense of: new major version).  Well, we'd be breaking HEASARC
practice, but that breaks with current pyvo at least -- have you
tried with TOPCAT and Aladin?

If we go for fixing types, an interesting question would be whether
we restrict ID to char[] (my preference) or would also allow
unicodeChar[]...

> 2. Do the types of the fields need to be the same in the two
> locations?  At the HEASARC we frequently have the FIELD as an
> integer, but it is given in the PARAM as a string (CHAR*)

Again, we've failed to make that explicit, and the use Datalink makes
of @ref (from PARAM to FIELD) isn't actually foreseen as such in
VOTable as far as I can make out (at least VOTable 1.4, sect. 3.2,
doesn't say anything that might constrain such a relationship).

However, here I'm really certain that implicit type conversions from
table row to parameter value are evil, not least because they are
not canonical (think of floats, and with them points, etc.  Of course,
making these IDs is a bad idea in the first place; even for integers
in VOTable TABLEDATA, however, +1 and 1 are equivalent).
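
To make the canonicality problem concrete, here is a minimal Python
sketch.  It assumes a client that has the textual TABLEDATA cell on one
side and the string it wants to put into the ID parameter on the other;
both function names are made up for illustration and come from no
existing library.

```python
def matches_as_text(cell_text, param_value):
    """Compare the serialised forms, as one would for a string-typed ID."""
    return cell_text.strip() == param_value.strip()


def matches_as_int(cell_text, param_value):
    """Compare after parsing both sides as integers."""
    return int(cell_text) == int(param_value)


# "+1" and "1" denote the same integer but are different strings,
# so the answer depends on which comparison the client picks.
print(matches_as_text("+1", "1"))  # False
print(matches_as_int("+1", "1"))   # True
```

With string-typed IDs only the first comparison is ever needed, and
there is nothing for a client to guess.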

Hence, I'd say that *if* we allow in-table IDs to be non-strings, we
must allow that for the ID parameter as well.

And depending on which way we decide, we probably ought to have an
erratum on either Datalink 1.0 (saying ID and PARAM[name='ID'] have
to be strings) or on VOTable (saying that PARAM/@ref to a FIELD or
PARAM is only allowed if both ends of the relationship agree in type,
arraysize, and xtype).
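
As a rough sketch of what the second kind of rule would mean in
practice, the following Python checks each PARAM/@ref against its
target using only the standard library.  The sample document is a
hypothetical, heavily trimmed VOTable modelled on the HEASARC case
(integer FIELD, char[] PARAM); the function names are illustrative
assumptions, and only FIELD and PARAM targets are considered.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample, not taken from any real service response.
SAMPLE = """
<VOTABLE>
  <RESOURCE type="results">
    <TABLE>
      <FIELD ID="obsid" name="obsid" datatype="int"/>
    </TABLE>
  </RESOURCE>
  <RESOURCE type="meta" utype="adhoc:service">
    <GROUP name="inputParams">
      <PARAM name="ID" datatype="char" arraysize="*" value="" ref="obsid"/>
    </GROUP>
  </RESOURCE>
</VOTABLE>
"""

TYPE_ATTRS = ("datatype", "arraysize", "xtype")


def local_name(element):
    """Return the tag name with any XML namespace part stripped."""
    return element.tag.rsplit("}", 1)[-1]


def find_ref_mismatches(votable_text):
    """Return (param name, attribute, param value, target value) tuples
    for every PARAM whose ref target differs in a type attribute."""
    root = ET.fromstring(votable_text)
    # Index all FIELDs and PARAMs that can be the target of a ref.
    targets = {
        el.get("ID"): el
        for el in root.iter()
        if local_name(el) in ("FIELD", "PARAM") and el.get("ID")
    }
    mismatches = []
    for param in root.iter():
        if local_name(param) != "PARAM" or param.get("ref") not in targets:
            continue
        target = targets[param.get("ref")]
        for attr in TYPE_ATTRS:
            if param.get(attr) != target.get(attr):
                mismatches.append(
                    (param.get("name"), attr,
                     param.get(attr), target.get(attr)))
    return mismatches


for mismatch in find_ref_mismatches(SAMPLE):
    print(mismatch)
```

For the sample this reports that datatype and arraysize disagree, which
is exactly the integer-FIELD/string-PARAM situation from question 2.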

        -- Markus

