Proposed erratum to clarify arraysize="1"

Mark Taylor m.b.taylor at bristol.ac.uk
Sat Feb 10 01:13:34 CET 2018


Tom, Markus, apps,

On Fri, 9 Feb 2018, Markus Demleitner wrote:

> I would, however, argue that in order to have a clear standard, we
> should upgrade the shoulds to musts, i.e., for sect. 2.2:
>
>   Note: the arraysize attribute must be present if, and only if,
>   each table cell for the FIELD is intended to be treated as an
>   array. Hence, arraysize="1" must not be used except in the unusual
>   case that the table cells contain single values that are
>   intended to be understood as single-value arrays.
>
> And for sect. 4.1:
>
>   The arraysize attribute must be omitted unless the corresponding
>   table cell contents is intended to be understood as an array.

This seems to me to be going beyond what's reasonable for an Erratum.
Software and existing VOTable documents that were written in good
faith in accordance with (at least a plausible reading of) the standard
would go from being legal to illegal if these MUSTs were introduced.

> Having "should" here is, I think, hard to interpret in terms of RFC
> 2119:
>
>   SHOULD NOT   This phrase, or the phrase "NOT RECOMMENDED" mean that
>   there may exist valid reasons in particular circumstances when the
>   particular behavior is acceptable or even useful, but the full
>   implications should be understood and the case carefully weighed
>   before implementing any behavior described with this label.

Looks applicable to me.  The particular circumstances apply where
software was implemented or documents were generated before this
erratum was introduced.  That doesn't argue against weighing the
case *before* implementing any behavior described as SHOULD
(erm, as long as this isn't applied retrospectively...)

> Applied to arraysize="1" this would mean that VOTable parsers would
> probably need to let users override the behaviour that arraysize="1"
> are arrays, and I'd argue that's not something we should impose on
> implementors.

This erratum already (of necessity) imposes the suggestion to
"exercise flexibility" on implementors:

   "However, clients may still wish to exercise flexibility
    there since not all services are compliant with these
    clarified semantics."

In practice, as this advice acknowledges, many clients will have
to cope with both corrected and uncorrected behaviours.
Otherwise, it's very likely things will break.  I certainly
don't plan to 'fix' STIL/STILTS/TOPCAT so that all arraysize="1"
values it encounters will turn into 1-element arrays.
I honestly don't know how common it is for services to emit
arraysize="1" for scalars, but in those cases where that happens,
changing TOPCAT's behaviour in accordance with this erratum would
cause all kinds of functionality (plotting, statistics,
algebraic selections etc.) to just stop working for columns that
fall foul of the corrected interpretation.
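
To make the "exercise flexibility" option concrete, here is a minimal
sketch of a lenient reader-side rule. It is purely illustrative and is
not the STIL/TOPCAT implementation; the function name is_array_column
and the lenient flag are hypothetical, and it ignores the special
handling that char/unicodeChar columns need.

```python
from typing import Optional


def is_array_column(arraysize: Optional[str], lenient: bool = True) -> bool:
    """Decide whether a FIELD's cells should be treated as arrays.

    arraysize is the raw attribute value from the FIELD element,
    or None if the attribute is absent.
    """
    if arraysize is None:
        # No arraysize attribute: scalar under either reading.
        return False
    value = arraysize.strip()
    if lenient and value == "1":
        # Legacy-tolerant reading: a unit arraysize is taken as a scalar.
        return False
    # Anything else ("3", "*", "10*", "2x3", ...) declares an array;
    # in strict mode "1" does too, per the erratum's semantics.
    return True
```

With lenient=True a column declared with arraysize="1" stays scalar,
so plotting and statistics keep working; with lenient=False the same
column becomes a 1-element array.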

If you take seriously the sentiment:

   "Although [arraysize="1"] is legal, it is not likely that the
    provider meant for the values to be interpreted as an array."

then it makes sense to view software that interprets arraysize="1"
as a 1-element array as doing the wrong thing, which would suggest
that the behaviour of that software, rather than the content of the
standard, ought to be changed.  Does anybody know which software
does in fact interpret arraysize="1" as a 1-element array?

Incidentally, the practice of ignoring a unit-valued arraysize
corresponds to (and can be considered inherited from) what is mandated
by the (more carefully and explicitly written) FITS standard.
FITS v3.0 sec 7.3.1 says, describing the BINTABLE TFORMn header:

   "The repeat count r is the ASCII representation of a non-negative
    integer specifying the number of elements in field n. The default
    value of r is 1; the repeat count need not be present if it has
    the default value."
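
As a rough illustration of that rule (a sketch written for this
discussion, not taken from any FITS library; tform_repeat is a
hypothetical helper), a BINTABLE TFORMn value of the form rTa can be
read so that a missing repeat count and an explicit 1 come out the same:

```python
import re

# rTa: optional repeat count r, one-letter type code T, optional extra text a.
_TFORM = re.compile(r"^\s*(\d*)([LXBIJKAEDCMPQ])(.*)$")


def tform_repeat(tform: str) -> int:
    """Return the repeat count of a BINTABLE TFORMn value (default 1)."""
    match = _TFORM.match(tform)
    if match is None:
        raise ValueError(f"not a recognised BINTABLE TFORM: {tform!r}")
    digits = match.group(1)
    # An absent repeat count means 1, so "E" and "1E" are equivalent.
    return int(digits) if digits else 1
```

Here "E" and "1E" both give a repeat count of 1, which is the sense in
which FITS makes no distinction between a scalar and a unit-length
array.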

>   In any case, experience show that services are easy to update where
>   they still use arraysize="1".  It is reasonable to expect that they
>   will be updated by the time legacy clients with erroneous
>   arraysize="1" behaviour are corrected.

That is a bit optimistic - they may be "easy to update" in the
sense that the required source code changes are straightforward,
but modifying and redeploying legacy services relying on old
libraries, or replacing a static collection of VOTable files,
may not be easy to do.  And in any case, why assume that services
will get updated faster than clients?

I'm not totally against this change.  If we were drafting VOTable
now, for sure I'd say (in defiance of FITS) that the presence of
an arraysize attribute must be taken to indicate an array value.
But to invalidate existing legal services and documents with an
erratum looks questionable to me.  I wouldn't oppose a SHOULD here
which could be strengthened to a MUST in future VOTable revisions.
(Though I feel that even a SHOULD might be an abuse of the Erratum
process; it's not really clear that this is a "clarification" of
the original intention rather than finally deciding which side of
a long-standing fence to come down on).

Mark

--
Mark Taylor   Astronomical Programmer   Physics, Bristol University, UK
m.b.taylor at bris.ac.uk +44-117-9288776  http://www.star.bris.ac.uk/~mbt/

