Unicode in VOTable
Mark Taylor
M.B.Taylor at bristol.ac.uk
Mon Aug 18 09:55:03 PDT 2014
On Fri, 15 Aug 2014, Walter Landry wrote:
> Mark Taylor <m.b.taylor at bristol.ac.uk> wrote:
> > On Thu, 14 Aug 2014, Markus Demleitner wrote:
> >
> >> Now, if we go this way: Why have a new type at all? I'd maintain no
> >> existing valid VOTable would break if we just said something essentially
> >> like:
> >>
> >> VOTable considers char as byte streams that can be decoded from utf-8
> >> for presentation purposes. TABLEDATA encoding is presentation.
> >> arraysize refers to the length of the bytestream always, never to
> >> the length of any unicode code sequence decodeable from the byte
> >> stream.
> >
> > Yes, I think that would work. "TABLEDATA encoding is presentation"
> > seems like a rather radical statement in terms of the way one
> > usually thinks about VOTable, but I can't think of any actual
> > negative consequences.
>
> This sounds a lot like what I proposed back in March, so I like it
> too ;) It would be good if we could do the same thing for unicodeChar
> and UTF-16.
Maybe. UCS-2, though it's archaic (obsolete?), does retain the
assurance that the number of characters can be determined from
the arraysize. If you can do UTF-8 in char then it could be
worth retaining what's currently unicodeChar for that purpose,
especially since it's not likely to be used for any other reason
when there's a UTF-8 alternative.
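
To make the distinction concrete, here is a minimal Python sketch. It assumes the reading proposed above, in which arraysize for char counts UTF-8 bytes, while arraysize for unicodeChar counts 16-bit UCS-2 units. The function names are hypothetical and are not part of any VOTable library.

```python
def char_arraysize(text: str) -> int:
    """Bytes needed to store text as UTF-8 (assumed reading of datatype="char")."""
    return len(text.encode("utf-8"))


def unicodechar_arraysize(text: str) -> int:
    """16-bit units needed to store text as UCS-2 (datatype="unicodeChar").

    UCS-2 covers only the Basic Multilingual Plane, so anything above
    U+FFFF is rejected rather than silently written as a surrogate pair.
    """
    if any(ord(c) > 0xFFFF for c in text):
        raise ValueError("character outside the BMP is not representable in UCS-2")
    # For BMP-only text, UTF-16 (no BOM) uses exactly one 2-byte unit per character.
    return len(text.encode("utf-16-be")) // 2


if __name__ == "__main__":
    samples = ("Angstrom", "\u00c5ngstr\u00f6m", "\u03b1\u03b2\u03b3")
    for s in samples:
        # Columns: character count, UTF-8 byte count, UCS-2 unit count
        print(len(s), char_arraysize(s), unicodechar_arraysize(s))
```

For the second sample (eight characters, two of them non-ASCII) the UTF-8 length is 10 bytes while the UCS-2 length is 8 units. Under these assumptions the character count can be read directly off a unicodeChar arraysize, but not off a char arraysize.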
I've updated the VOTableIssues13 wiki page a little bit in view
of this thread. Anybody else feel free to edit away too.
Mark
--
Mark Taylor Astronomical Programmer Physics, Bristol University, UK
m.b.taylor at bris.ac.uk +44-117-9288776 http://www.star.bris.ac.uk/~mbt/