Unicode in VOTable
Walter Landry
wlandry at caltech.edu
Fri Mar 7 11:00:39 PST 2014
Hello Everyone,
I tried sending this to votable at ivoa.net, but that mailing list seems
unattended and the message never went through. In any case, the
VOTable Format Definition Version 1.3 contains the following statement:
VOTables support two kinds of characters: ASCII 1-byte characters
and Unicode (UCS-2) 2-byte characters. Unicode is a way to
represent characters that is an alternative to ASCII. It uses two
bytes per character instead of one, it is strongly supported by XML
tools, and it can handle a large variety of international
alphabets.
This is not actually true. Unicode code points range up to U+10FFFF, so
a fixed-width encoding such as UTF-32 requires 4 bytes per character.
There are encodings, such as UTF-16, which often only require 2 bytes,
but even UTF-16 sometimes requires more than 2 bytes (a surrogate pair,
4 bytes in total) to express a character.
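
To make the byte counts concrete, here is a minimal Python sketch. It
is an illustration only; the sample characters and the helper name
encoded_lengths are my own choices and come from neither the VOTable
document nor any VOTable library.

```python
# Compare how many bytes one character needs in each Unicode encoding.
# The big-endian codec variants are used so that no byte-order mark
# is added to the count.
ENCODINGS = ("utf-8", "utf-16-be", "utf-32-be")

def encoded_lengths(ch):
    """Return {encoding name: byte count} for a single character."""
    return {enc: len(ch.encode(enc)) for enc in ENCODINGS}

# Hypothetical sample characters chosen for illustration.
samples = [
    "A",           # U+0041, plain ASCII
    "\u00e9",      # U+00E9, LATIN SMALL LETTER E WITH ACUTE
    "\u20ac",      # U+20AC, EURO SIGN
    "\U0001D11E",  # U+1D11E, MUSICAL SYMBOL G CLEF (above U+FFFF)
]

for ch in samples:
    print(f"U+{ord(ch):04X}", encoded_lengths(ch))
```

The last sample lies above U+FFFF, so UTF-16 needs two 16-bit units
(4 bytes) for it, and a strictly 2-byte representation cannot hold it.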
So, how would I express a generic Unicode character in a VOTable? Do
I encode it as UTF-8 and disguise it as ASCII?
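
As a rough sketch of what that workaround would involve (this is only
my illustration of the idea, not something the VOTable document
describes, and the variable names are made up):

```python
# Encode one character above U+FFFF as UTF-8 and inspect the bytes.
text = "\U0001D11E"  # MUSICAL SYMBOL G CLEF

utf8_bytes = text.encode("utf-8")

# Each byte would occupy one 1-byte "char" slot in this scheme.
print(len(utf8_bytes), utf8_bytes.hex())

# True ASCII only covers 0x00-0x7F, so check whether the bytes fit.
is_ascii = all(b < 0x80 for b in utf8_bytes)
print("all bytes are 7-bit ASCII:", is_ascii)

# A reader that knows about the convention can still recover the text.
round_trip = utf8_bytes.decode("utf-8")
print("round trip ok:", round_trip == text)
```

The bytes all have the high bit set, so they are not ASCII in the
strict sense, and a reader would have to know about the convention to
decode them.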
Thanks,
Walter Landry