String character range

Doug Tody dtody at nrao.edu
Mon Aug 4 09:04:08 PDT 2008


Hi Mark -

Sure, I agree that the range of allowable chars should be restricted
as you suggest.  My suggestion is to specify UTF-8, with the 7-bit
range restricted as has been discussed, but allowing multi-byte
UTF-8 encoded chars to pass through.  That would seem to do it, and we
still have simple ASCII virtually all of the time, so I don't think
this will break legacy code.  If at some point full Unicode is needed
(e.g. 16-bit chars), that should be a different data type.
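
A minimal sketch of what such a check might look like, in Python.  It
assumes the 7-bit restriction under discussion is the same as the XML 1.0
Char production (tab, LF and CR allowed, other control chars excluded);
the function names are hypothetical and not part of any SAMP library.

```python
def is_xml_char(cp: int) -> bool:
    """Return True if code point cp matches the XML 1.0 Char production."""
    return (
        cp in (0x09, 0x0A, 0x0D)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def is_allowed_string(data: bytes) -> bool:
    """Hypothetical check: well-formed UTF-8 whose code points are all XML Chars.

    Assumption: the restriction agreed for 7-bit chars is the XML Char one.
    """
    try:
        text = data.decode("utf-8")  # strict mode rejects malformed sequences
    except UnicodeDecodeError:
        return False
    return all(is_xml_char(ord(ch)) for ch in text)
```

Plain ASCII and multi-byte UTF-8 both pass, while a byte such as 0x1f
or a malformed sequence is rejected.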

	- Doug


On Mon, 4 Aug 2008, Mark Taylor wrote:

> On Fri, 1 Aug 2008, Doug Tody wrote:
> 
> > Hey Mark -
> > 
> > I agree with your sentiment that string data which we want to
> > manipulate in any language or environment should be simple; if
> > necessary a separate datatype could be declared for representing
> > e.g. general Unicode encoded text.
> > 
> > What about UTF-8 though?  This is backwards compatible with ASCII
> > but allows any Unicode character to be represented using multi-byte
> > sequences - if there are no funny characters it is the same as ASCII.
> > This is much like your escape sequence proposal, but is a widely used
> > standard.  XML has mandatory support for UTF-8 (almost any XML document
> > one sees is UTF-8 encoded) so there should be no problems there.
> 
> Hi Doug,
> 
> you're right, UTF-8 does look like a better solution than the \uxxxx
> escaping mechanism (borrowed from Java) that I suggested as far as
> transmitting things like accented letters and characters from non-Latin
> alphabets.  However, it doesn't solve the problem which started this
> thread off, since you still won't be able to include characters in
> the ranges excluded by the XML Char definition; those are simply not permitted
> in an XML document, regardless of encoding (and in any
> case the UTF-8 encoding of 0x1f is the single byte 0x1f).
> 
> Mark
> 
> -- 
> Mark Taylor   Astronomical Programmer   Physics, Bristol University, UK
> m.b.taylor at bris.ac.uk +44-117-928-8776 http://www.star.bris.ac.uk/~mbt/
> 
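
The two encoding points in the quoted exchange can be checked with a
short Python sketch.  It uses only the standard library; the helper
name parses_as_xml is made up for illustration, and the behaviour shown
for control chars is that of Python's bundled expat parser.

```python
import xml.etree.ElementTree as ET

# Pure ASCII text is byte-for-byte identical when encoded as UTF-8.
ascii_same = "SAMP".encode("utf-8") == "SAMP".encode("ascii")

# An accented letter (U+00E9) becomes a two-byte sequence.
e_acute = "\u00e9".encode("utf-8")

# The control char U+001F is still the single byte 0x1f in UTF-8,
# so choosing UTF-8 does nothing to hide it from an XML parser.
unit_sep = "\x1f".encode("utf-8")


def parses_as_xml(document: str) -> bool:
    """Hypothetical helper: True if the stdlib parser accepts the document."""
    try:
        ET.fromstring(document)
    except ET.ParseError:
        return False
    return True
```

With this, parses_as_xml("<s>caf\u00e9</s>") succeeds, while both the
raw form "<s>\x1f</s>" and the character reference "<s>&#x1f;</s>" are
rejected, which matches the point that such chars are excluded
regardless of encoding.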


