Unicode in VOTable
Mark Taylor
m.b.taylor at bristol.ac.uk
Fri Jun 13 10:21:38 CEST 2025
On Wed, 11 Jun 2025, Mark Taylor via apps wrote:
> The downside is that a FIELD with datatype="char" arraysize="8"
> can't store an 8-character string if those characters are emojis.
Following up on this point, I would say that I don't expect it to
affect very many tables/columns. In many cases a fixed-width
string will have some well-constrained format such as a sexagesimal
designation or ISO-8601 date with a fixed precision, a bibcode,
a UUID, a version string, .... For these examples and many similar
ones it is known that the content will be 7-bit ASCII.
I haven't attempted to gather evidence for this, but my guess would
be that the majority of fixed-width strings in e.g. TAP tables fall
into that category.
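As a rough illustration of the byte-versus-character distinction, here is a minimal Python sketch. It assumes, as the quoted remark implies but this message does not spell out, that arraysize counts UTF-8 bytes rather than characters. The helper name fits_fixed_width and the sample values are hypothetical and are not taken from any VOTable library.

```python
# Minimal sketch, ASSUMING arraysize is a count of UTF-8 bytes.
# fits_fixed_width is a hypothetical helper, not part of any VOTable API.
def fits_fixed_width(value: str, arraysize: int) -> bool:
    """Return True if value's UTF-8 encoding fits in arraysize bytes."""
    return len(value.encode("utf-8")) <= arraysize


# Illustrative values in well-constrained formats; all are 7-bit ASCII,
# so the byte count equals the character count.
samples = {
    "ISO-8601 date": "2025-06-13",
    "bibcode": "2013A&A...558A..33A",
    "UUID": "123e4567-e89b-12d3-a456-426614174000",
}
for label, text in samples.items():
    n_chars = len(text)
    n_bytes = len(text.encode("utf-8"))
    print(f"{label}: {n_chars} characters, {n_bytes} bytes, ascii={text.isascii()}")

# Eight emoji are eight characters, but each one needs several UTF-8 bytes.
emojis = "\U0001F600" * 8
print(len(emojis), len(emojis.encode("utf-8")), fits_fixed_width(emojis, 8))
```

Under that assumption, the ASCII samples need exactly as many bytes as they have characters, while the emoji string does not fit in an 8-byte field.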
String fields that might contain non-ASCII characters (names,
descriptions, comments) are more likely to be the sort of thing
for which a fixed-length value is not so appropriate anyway.
--
Mark Taylor Astronomical Programmer Physics, Bristol University, UK
m.b.taylor at bristol.ac.uk https://www.star.bristol.ac.uk/mbt/