String character range

Mark Taylor m.b.taylor at bristol.ac.uk
Thu Aug 28 08:41:43 PDT 2008


On Thu, 28 Aug 2008, Luigi Paioro wrote:

> However, Doug's problem with that VOTable has suggested to me a possible 
> scenario that could require UTF-8 support (with the XML constraints), maybe 
> by introducing an additional data type. This is the scenario: using a 
> VO-enabled application, I get from an SSA (TAP, SIA, whatever) service a 
> VOTable which contains UTF-8 chars (no matter which); after some 
> processing I broadcast it asynchronously to one or more other applications 
> using SAMP. This simple operation can be done in two ways:
>
> i) by reference: the VOTable is written to a file (local or remote) and the 
> reference to that file is sent with a proper MType as a simple ASCII string 
> (e.g. "file:///tmp/myvotab.vot", "ivo://my.vospace.address/myvotab.vot", 
> etc.)
>
> ii) by value: the content of the VOTable is sent as a byte stream, still 
> using a proper MType. This byte stream can simply be a UTF-8 encoded string.
>
> Case i) requires only ASCII charset support, while ii) requires support for 
> UTF-8 (or at least letting it pass through) or an additional general data 
> type for byte streams (which I suspect could be useful for other purposes 
> too).
>
> If only the ASCII charset (with the said limits) is allowed in a SAMP message, 
> then only case i) is possible. If someone wanted the possibility of passing 
> data by value (case ii), then I think the discussion would still be long...

Hi Luigi,

the SAMP string type is consciously restricted in what it can contain,
because of the difficulties it might present to transports if it were
allowed to contain an unrestricted sequence of bytes.
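
To make the restriction concrete, here is a minimal sketch of a check
that a sender might run before putting a value into a message. The set
of permitted code points (tab, LF, CR and 0x20-0x7f) is an assumption
based on my reading of the restricted string definition and should be
checked against the standard; the function name is hypothetical.

```python
# Assumed set of permitted code points for a restricted string:
# tab, line feed, carriage return, and 0x20 through 0x7f.
# This range is an assumption; check it against the SAMP document.
ALLOWED_CODE_POINTS = {0x09, 0x0A, 0x0D} | set(range(0x20, 0x80))


def is_restricted_string(text: str) -> bool:
    """Return True if every character lies in the assumed permitted set."""
    return all(ord(ch) in ALLOWED_CODE_POINTS for ch in text)
```

A file reference such as "file:///tmp/myvotab.vot" passes this check,
while a string containing an accented character or a NUL does not.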

Allowing UTF-8 would, as you say, allow inline transmission of 
VOTables and other XML documents, but it wouldn't solve the more 
general problem of transmitting unrestricted binary data inline 
(e.g. FITS files).

So if we want to solve the general problem of transmitting bulk data 
inline rather than by reference we are going to need some more general
way of doing it - maybe base-64 encoding within a string type.
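
As a rough illustration of the base-64 idea, the sketch below wraps
arbitrary bytes (a VOTable, a FITS file) in a plain ASCII string and
unwraps them again. It uses only the Python standard library; the
function names are hypothetical, and nothing here is part of SAMP
itself.

```python
import base64


def encode_payload(data: bytes) -> str:
    """Encode arbitrary bytes as a base-64 string of ASCII characters."""
    return base64.b64encode(data).decode("ascii")


def decode_payload(text: str) -> bytes:
    """Recover the original bytes; rejects characters outside the alphabet."""
    return base64.b64decode(text, validate=True)
```

The encoded string uses only letters, digits, "+", "/" and "=", so it
stays within a restricted ASCII string type, at the cost of making the
payload about one third larger.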

The question is whether XML/VOTable is a sufficiently useful special
case that it's worth making specific accommodation for it in the
SAMP data types.  My feeling is that it's not, in view of the 
additional conceptual complication it introduces to the standard
(though as I've already admitted, I wouldn't expect it to cause
many things to break in practice).  But others might disagree.

Mark

-- 
Mark Taylor   Astronomical Programmer   Physics, Bristol University, UK
m.b.taylor at bris.ac.uk +44-117-928-8776 http://www.star.bris.ac.uk/~mbt/


