Element Ordering in XML (was: Is VOTable terribly broken?)

Arnold Rots arots at head.cfa.harvard.edu
Wed Apr 29 10:36:29 PDT 2009


Roy put the question we discussed yesterday, whether the order of
repeated elements is guaranteed in XML, to the VOTable mailing list.
I pass on the combined response from Doug Tody and Norman Gray.
The consensus appears to be that the order IS significant and is
preserved.
We may need to revisit this.

  - Arnold

----- Forwarded message from Norman Gray -----

X-Original-To: votable at pat.hq.eso.org
Delivered-To: votable at pat.hq.eso.org
From: Norman Gray <norman at astro.gla.ac.uk>
To: VOTable List <votable at ivoa.net>
Subject: Re: Is VOTable terribly broken?
Date: Wed, 29 Apr 2009 17:15:07 +0100


Greetings,

I'm adding chapter and verse to what Doug said, just so this nonsense  
can be stomped on promptly the next time it rears its head.

On 2009 Apr 29, at 15:00, Douglas Tody wrote:
...
>
> Before we close this off, I just want to confirm what Mike said,
> after looking at this a bit more this morning.  While it appears that
> technically XML does not guarantee the order of elements

Section 3.1 of the XML spec [1] says "Note that the order of attribute  
specifications in a start-tag or empty-element tag is not  
significant", and so by implication, the order of children within an  
element is significant.

The spec states (Section 1, para 3) that:

> This specification describes the required behavior of an XML  
> processor in terms of how it must read XML data and the information  
> it must provide to the application.


As far as I can find, the spec doesn't say explicitly that an XML  
processor must report elements in document order, largely, I suppose,  
because it's so obviously true that it needn't be made explicit.  The  
SGML spec doesn't state this explicitly either.  In any case, this is  
largely the responsibility of the infoset (as Doug mentions):

> higher level
> constructs like SAX, the XML infoset

Section 2.2 of the Infoset document [2] states that the information  
associated with a parsed element includes:

> 	4. [children] An ordered list of child information items, in  
> document order. [...]
> 	5. [attributes] An unordered set of attribute information items,  
> one for each of the attributes (specified or defaulted from the DTD)  
> of this element.
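
As a rough illustration of those two properties, here is a minimal
sketch using Python's standard xml.etree.ElementTree module.  The
element and attribute names are made up for the example and are not
taken from any VOTable schema; the point is only that the child list
comes back in document order, while swapping attribute order makes no
difference to the parsed result.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragments: identical except for the order of the
# attributes in the start-tag.
DOC_A = '<TR a="1" b="2"><TD>first</TD><TD>second</TD><TD>third</TD></TR>'
DOC_B = '<TR b="2" a="1"><TD>first</TD><TD>second</TD><TD>third</TD></TR>'

row_a = ET.fromstring(DOC_A)
row_b = ET.fromstring(DOC_B)

# Iterating over an Element yields its children in document order.
cells_a = [td.text for td in row_a]
cells_b = [td.text for td in row_b]

# Attributes are exposed as a mapping, so comparing them ignores the
# order in which they were written.
attrs_equal = row_a.attrib == row_b.attrib

print(cells_a)
print(cells_b)
print(attrs_equal)
```

An application that depends on the position of a repeated element (a
table cell within a row, say) can rely on the first property; one that
depends on attribute order cannot rely on anything.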

The Infoset is intended to be informative, in formalising the  
terminology used by other XML-related specifications:

> This specification defines an abstract data set called the XML  
> Information Set (Infoset). Its purpose is to provide a consistent  
> set of definitions for use in other specifications that need to  
> refer to the information in a well-formed XML document [[2] section  
> 1, para 1]

It therefore doesn't constrain anything, in the sense of rendering  
something non-conformant if it garbles the order of child elements.   
However, if an XML parser talks about 'children', then it must be  
presumed to be using the term in the sense defined in the Infoset  
document, which implies that child elements are reported in  
document order.
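
The same can be checked at the event level.  The following is a small
sketch, assuming Python's standard xml.sax module as the SAX
implementation; the OrderRecorder class and the element names in the
sample document are hypothetical, chosen only to have a repeated
element separated by a different one.

```python
import xml.sax


class OrderRecorder(xml.sax.ContentHandler):
    """Record element names in the order the parser reports them."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def startElement(self, name, attrs):
        self.seen.append(name)


# Hypothetical document with a repeated child element.
DOC = b"<root><FIELD/><PARAM/><FIELD/><DATA/></root>"

handler = OrderRecorder()
xml.sax.parseString(DOC, handler)

# Start-element events arrive in document order.
print(handler.seen)
```

A streaming parser has little choice in the matter: it reports each
start-tag as it reads it, so the event order is the document order.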

Thus this repeated claim that 'some XML parsers report elements in a  
different order' is just superstition.  To write such a parser would  
be perverse, and to use it in an application would be foolish.  I have  
difficulty believing that there are any XML parsers which actually do  
this (possibly excluding some specialised ultra-minimal ones which  
wouldn't be used in any general-purpose context).

All the best,

Norman



[1] http://www.w3.org/TR/REC-xml/
[2] http://www.w3.org/TR/xml-infoset/

-- 
Norman Gray  :  http://nxg.me.uk
Dept Physics and Astronomy, University of Leicester

----- End of forwarded message from Norman Gray -----
--------------------------------------------------------------------------
Arnold H. Rots                                Chandra X-ray Science Center
Smithsonian Astrophysical Observatory                tel:  +1 617 496 7701
60 Garden Street, MS 67                              fax:  +1 617 495 7356
Cambridge, MA 02138                             arots at head.cfa.harvard.edu
USA                                     http://hea-www.harvard.edu/~arots/
--------------------------------------------------------------------------


