VOUnits update: Empty/missing units
Mark Taylor
m.b.taylor at bristol.ac.uk
Fri Dec 3 11:19:24 CET 2021
On Tue, 30 Nov 2021, Markus Demleitner wrote:
> Dear Semantics, dear Apps,
>
> Norman and I are currently preparing a maintenance release of VOUnits,
> <https://github.com/ivoa-std/VOUnits>.
>
> As part of that, we figured we should finally tackle the question of
> null-ish units. There are at least two aspects to the problem:
>
> (a) The unit attribute of dimensionless quantities
>
> That's what I'd call values that conceptually would have a unit, but
> that unit works out to one. Think of ratios, or a coefficient of
> performance, or many similar things. I think everyone uses unit=""
> for these right now, but that's a problem because the VOUnits grammar
> (like all other unit grammars discussed in the spec) does not admit
> the empty string. Hence, if we think that is what should be done, we
> should probably fix the grammar accordingly.
I don't think it's a problem that the VOUnits syntax does not
permit the empty string. In most contexts (e.g. the TAP_SCHEMA.columns
unit column or the unit element in VOSI-tables) you can just omit
the unit (null value). IMHO reasonable software interpreting unit
strings will/should anyway treat an empty string as "don't try to make
sense of this as a unit" (without necessarily taking a stand on whether
it's dimensionless or unsuitable for units or the author hasn't
thought about units) rather than attempting to parse it against
a given grammar.
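A minimal Python sketch of that behaviour, for illustration only. The
helper name unit_or_none and the caller-supplied parse function (standing
in for whatever VOUnits parser is in use) are hypothetical, and treating
whitespace-only strings like empty ones is an extra assumption:

```python
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def unit_or_none(raw: Optional[str], parse: Callable[[str], T]) -> Optional[T]:
    """Return a parsed unit, or None when no unit was supplied.

    Null and empty strings are treated alike: both mean "no unit
    information here" and are never handed to the unit grammar.
    Whitespace-only strings are treated the same way (an assumption).
    """
    if raw is None or raw.strip() == "":
        return None
    # Only non-empty strings are worth parsing against the grammar.
    return parse(raw.strip())
```

With this, unit_or_none(None, parse) and unit_or_none("", parse) both give
None without taking a stand on why the unit is absent, and only strings
with actual content reach the parser.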
> That's a special case, though, and when we're special-casing anyway,
> there's something to be said for making "1" that special case. There
> may be profound reasons to prefer that, but a very pragmatic one
> would be that unit="1" indicates that someone has thought about it
> and it's not just some kind of unfilled template or other oversight,
> and in particular that it's not something like what we have in (b).
I take the point about distinguishing dimensionless quantities
from "unit not applicable" and "author hasn't thought about units".
I don't think that using the empty string as distinct from null is
suitable for this (or anything else), since those two inevitably
get confused with each other (as above). Given that, allowing "1" in
the syntax for that purpose is probably reasonable for metadata authors
who want to make the point that a quantity really is dimensionless.
But I don't think it's reasonable to expect every dimensionless
value to supply an explicit indication of that; mostly it won't
serve any useful purpose and authors probably won't do it anyway.
It can also get in the way; one obvious thing that people do with
units is quote them for human consumption, e.g. label an axis.
In that context "count" is more readable than "count / 1".
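To illustrate the labelling point, a small Python sketch. The function
axis_label and the "name / unit" label layout are assumptions made for
this example, not anything prescribed by VOUnits:

```python
from typing import Optional


def axis_label(name: str, unit: Optional[str]) -> str:
    """Build an axis label, appending the unit only when it adds information."""
    # A missing unit, an empty string and an explicit dimensionless "1"
    # all give the bare quantity name.
    if unit is None or unit.strip() in ("", "1"):
        return name
    return f"{name} / {unit.strip()}"
```

Here axis_label("count", "1") gives "count" rather than "count / 1",
while axis_label("flux", "Jy") gives "flux / Jy". Software that quotes
units for humans has to special-case "1" like this if authors start
supplying it.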
> (b) Telling apart unitless values
>
> In contrast to dimensionless quantities, these (in my nomenclature;
> I've frankly not researched yet if there's an official terminology
> for this) are things that cannot acquire a unit even if you multiplied
> with a unit-carrying other value. Think of URIs, author names, or
> obscore's calibration levels.
>
> The first question is: do we need a distinction between this case and
> the dimensionless case? If so, what for?
>
> In principle, a client knowing that calib_level is really unitless
> could raise an error if someone tries to compute calib_level*(3*u.km)
> or so; but it's not clear to me whether that's enough of a benefit to
> even bother to introduce the terminology, let alone some feature(s).
>
> So: Would anyone champion the need for that distinction? And if so,
> would you be happy with saying "leave out any unit attribute for
> unitless, give an empty string/the 1 for dimensionless"?
My take: unit metadata is there so that data providers can communicate
to data consumers what units are attached to values in cases where
that is useful information. Where it's not, it's OK to leave it out.
It's not necessary to supply a unit metadata item for every supplied value.
So I wouldn't support these additional categories of things-without-units.
Mark
--
Mark Taylor Astronomical Programmer Physics, Bristol University, UK
m.b.taylor at bristol.ac.uk http://www.star.bristol.ac.uk/~mbt/