String character range

Mark Taylor m.b.taylor at bristol.ac.uk
Thu Aug 21 01:28:36 PDT 2008


On Wed, 20 Aug 2008, Doug Tody wrote:

> Well this is amusing.  In the process of testing some SSA code I
> generated a VOTable which could not be read in either Topcat or in
> a Microsoft XML viewer I have on Windows, although it was processed
> fine in some other programs.
>
> The offending text turned out to be the following:
>
>    <PARAM ID="ContactName" datatype="char"
>    name="ContactName" ucd="meta.bib.author;meta.curation"
>    utype="spec:Spectrum.Curation.Contact.Name" value="László Dobos"
>    arraysize="*">
>
> The problem of course is that UTF-8 extensions were used in "László".
> So, this obviously can cause problems with existing software; in this
> case though it was legal XML.

Ho.  I am quite surprised, since TOPCAT (and presumably the MS viewer) 
use off-the-shelf XML parsing components which I'd expect to handle 
all this sort of thing correctly; so my suspicion would be that there's 
something subtly wrong with the XML.  If that's not the case I'd like
to understand what went wrong anyway.  Could you pass me the XML file
so I can take a look?
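
As an illustration of the kind of subtle problem I have in mind (a guess
only, not a diagnosis of the actual file): if the accented characters were
written as ISO-8859-1 bytes with no encoding declaration, a conforming
parser would assume UTF-8 and reject the document, while more lenient
tools might read it anyway.  The sketch below uses Python's standard
xml.etree.ElementTree; the helper name `parses` and the cut-down PARAM
element are made up for the example.

```python
import xml.etree.ElementTree as ET

# Cut-down version of the PARAM element; the escapes spell "László Dobos".
PARAM = ('<PARAM ID="ContactName" datatype="char" name="ContactName" '
         'value="L\u00e1szl\u00f3 Dobos" arraysize="*"/>')


def parses(data: bytes) -> bool:
    """Return True if the bytes form well-formed XML, False otherwise."""
    try:
        ET.fromstring(data)
        return True
    except ET.ParseError:
        return False


# Case 1: genuine UTF-8 bytes, no declaration (UTF-8 is the XML default).
utf8_ok = parses(PARAM.encode("utf-8"))

# Case 2: ISO-8859-1 bytes with no declaration; the parser assumes UTF-8
# and meets byte sequences that are not valid UTF-8.
latin1_undeclared_ok = parses(PARAM.encode("latin-1"))

# Case 3: the same ISO-8859-1 bytes, but with the encoding declared.
declaration = b'<?xml version="1.0" encoding="ISO-8859-1"?>'
latin1_declared_ok = parses(declaration + PARAM.encode("latin-1"))

print(utf8_ok, latin1_undeclared_ok, latin1_declared_ok)
```

If the real file falls into the second case, the fix would be either to
write the bytes as UTF-8 or to declare the encoding actually used.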

In any case, as you suggest, this does underline the point that keeping
it simple leaves less to go wrong.

Mark

-- 
Mark Taylor   Astronomical Programmer   Physics, Bristol University, UK
m.b.taylor at bris.ac.uk +44-117-928-8776 http://www.star.bris.ac.uk/~mbt/

