Unicode in VOTable
Mark Taylor
m.b.taylor at bristol.ac.uk
Thu Aug 14 09:35:21 PDT 2014
On Thu, 14 Aug 2014, Markus Demleitner wrote:
> Now, if we go this way: Why have a new type at all? I'd maintain no
> existing valid VOTable would break if we just said something essentially
> like:
>
> VOTable considers char as byte streams that can be decoded from UTF-8
> for presentation purposes. TABLEDATA encoding is presentation.
> arraysize refers to the length of the byte stream always, never to
> the length of any Unicode code point sequence decodable from the byte
> stream.
Yes, I think that would work. "TABLEDATA encoding is presentation"
seems like a rather radical statement in terms of the way one
usually thinks about VOTable, but I can't think of any actual
negative consequences.
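
To make the byte-versus-character distinction in the quoted proposal
concrete, here is a minimal Python sketch. It is not taken from any
VOTable library; the helper names utf8_length and fits_arraysize are
hypothetical, and the check simply assumes the proposed reading in which
arraysize bounds the UTF-8 byte count rather than the code point count.

```python
def utf8_length(text: str) -> int:
    """Return the number of bytes in the UTF-8 encoding of text."""
    return len(text.encode("utf-8"))


def fits_arraysize(text: str, arraysize: int) -> bool:
    """Hypothetical check: under the proposed reading, arraysize
    limits the encoded byte stream, not the decoded characters."""
    return utf8_length(text) <= arraysize


# Two of the characters below (U+00C5 and U+00F6) need two bytes each
# in UTF-8, so the byte length exceeds the code point count.
name = "\u00c5ngstr\u00f6m"
print(len(name), utf8_length(name))
print(fits_arraysize(name, 8), fits_arraysize(name, 10))
```

A writer that sized its arraysize from the character count would
therefore under-declare the field for any non-ASCII content.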
Note though that this change does lose you something: the possibility
to store in a VOTable text data that is known and declared to be
7-bit ASCII. If you're in FITS'n'FORTRAN land such things can
be useful. However, I don't know how many people are really relying
on that in practice at present.
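
If the declared type no longer guarantees ASCII, a consumer that needs
it would presumably have to inspect the bytes itself. A minimal Python
sketch of such a check follows; is_7bit_ascii is a hypothetical helper,
not part of VOTable or any existing library.

```python
def is_7bit_ascii(data: bytes) -> bool:
    """Return True if every byte has its high bit clear (value < 0x80)."""
    return all(b < 0x80 for b in data)


# Pure ASCII text encodes to the same bytes in ASCII and in UTF-8.
print(is_7bit_ascii("NGC 1068".encode("utf-8")))
# Any non-ASCII character produces bytes >= 0x80 in UTF-8.
print(is_7bit_ascii("\u00c5".encode("utf-8")))
```

Since ASCII text is byte-for-byte identical in UTF-8, existing
ASCII-only tables would pass such a check unchanged; what is lost is
only the up-front declaration.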
> And then we'd have to go on to the ghastly array considerations ("To
> decode multidimensional arrays coming from tabledata serialised
> tables, first create a bytestream by encoding as canonical UTF-8 and
> then...").
Agreed, something like that should go in, but it's a clarification of
the scheme implied by the earlier text, not an additional complication.
Mark
--
Mark Taylor Astronomical Programmer Physics, Bristol University, UK
m.b.taylor at bris.ac.uk +44-117-9288776 http://www.star.bris.ac.uk/~mbt/