String character range

Mark Taylor m.b.taylor at bristol.ac.uk
Fri Aug 1 08:15:21 PDT 2008


On Fri, 1 Aug 2008, Carlos Rodrigo Blanco wrote:

> Hi
>
> I'm sorry that I don't know much about unicode encoding and I feel quite 
> ashamed of showing this ignorance, but I wonder what happens with latin 
> characters and so.
>
> If I have to write, for instance, some author name in a xml document that 
> includes some latin character (like ñ), is that allowed?

Writing it in an XML document - no problem.  XML, and Unicode on which
it is based, are capable of representing almost any character
from almost any language you can think of (and a lot more).
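
To make that concrete, here is a minimal Python sketch using the
standard-library xml.etree.ElementTree module.  The element name
"author" and the name "Muñoz" are made up for illustration.  It
writes the same text once as UTF-8 and once as pure 7-bit ASCII, where
the serializer falls back to a numeric character reference, and checks
that both forms parse back to the original string.

```python
import xml.etree.ElementTree as ET

# Hypothetical element and author name, used only for illustration.
author = ET.Element("author")
author.text = "Mu\u00f1oz"  # U+00F1 is LATIN SMALL LETTER N WITH TILDE

# Serialise the element twice: as UTF-8 bytes, and as 7-bit ASCII bytes.
# In the ASCII case the serializer cannot emit the character directly,
# so it writes a numeric character reference instead.
utf8_bytes = ET.tostring(author, encoding="utf-8")
ascii_bytes = ET.tostring(author, encoding="us-ascii")

# Both serialisations parse back to the same text.
text_from_utf8 = ET.fromstring(utf8_bytes).text
text_from_ascii = ET.fromstring(ascii_bytes).text
```

Either form is well-formed XML; the choice only affects the bytes on
the wire, not the text a parser hands back.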

As far as SAMP goes: that character looks to me like code point 0xf1, 
from the Latin-1 Supplement code block.  So you could not send it 
using either the existing definition for a SAMP string or the 
proposal (4) that I am suggesting.  If we used a variant of my 
suggestion (3):

   3. Define some escaping convention for un-XML characters, e.g. \u001f
      for character 31.

with the intention that this escaping mechanism could be used for
any 8-bit character, it would be possible to transmit this kind of 
non-7-bit Latin character.  However, characters with the 8th bit 
set might cause problems for certain other transports and language 
environments.  I must admit that, apart from RFC-822 mail-type
contexts, I can't think of what these might be, but I'd be inclined
to steer clear of non-7-bit characters just in case.  However, if
others (e.g. with fewer Anglo-Saxon prejudices) think that it's an
important requirement to permit transmission of characters like this
within SAMP, we could take that on board.  We could even in principle say 
that this escaping mechanism could be used to specify any Unicode 
character - but I think that would definitely be a bad idea as it 
would effectively restrict use of the protocol to languages with 
Unicode support, which excludes quite a lot.
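
As a rough illustration of the 8-bit variant of suggestion (3), the
following Python sketch escapes characters to \uXXXX form and back.
Nothing here is part of SAMP: the function names, the choice to leave
tab, LF, CR and 0x20-0x7e unescaped, and the doubling of literal
backslashes are all assumptions made for the example.

```python
import re

def escape_8bit(text: str) -> str:
    """Escape text to 7-bit ASCII using \\uXXXX (hypothetical convention)."""
    out = []
    for ch in text:
        cp = ord(ch)
        if ch == "\\":
            out.append("\\\\")           # assumed: literal backslash is doubled
        elif ch in "\t\n\r" or 0x20 <= cp <= 0x7e:
            out.append(ch)               # assumed: these pass through unchanged
        elif cp <= 0xff:
            out.append("\\u%04x" % cp)   # e.g. character 31 becomes \u001f
        else:
            # The 8-bit variant deliberately stops at 0xff.
            raise ValueError("code point above 0xff: %r" % ch)
    return "".join(out)

# Matches either a doubled backslash or a \uXXXX escape.
_ESCAPE_RE = re.compile(r"\\(\\|u([0-9a-fA-F]{4}))")

def unescape_8bit(text: str) -> str:
    """Reverse escape_8bit; unrecognised backslash sequences are left alone."""
    def replace(match):
        if match.group(2) is None:
            return "\\"
        return chr(int(match.group(2), 16))
    return _ESCAPE_RE.sub(replace, text)

escaped = escape_8bit("Mu\u00f1oz")      # contains only 7-bit characters
restored = unescape_8bit(escaped)        # back to the original string
```

The escaped form travels as plain 7-bit text, so the receiving side
needs no Unicode support beyond mapping a number to a byte value.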

Mark

-- 
Mark Taylor   Astronomical Programmer   Physics, Bristol University, UK
m.b.taylor at bris.ac.uk +44-117-928-8776 http://www.star.bris.ac.uk/~mbt/

