STC in VOResource records

Arnold Rots arots at head.cfa.harvard.edu
Fri Dec 15 06:50:29 PST 2006


Paul,

With all due respect, I disagree, as I have argued before.
I whole-heartedly agree that it is perfectly possible to create
nonsense associations and that we need to rely on other means to guard
against that.
And I agree that changing from ID/IDREF allows validation against the
schema.
But it does not address the underlying (and in my opinion far more
important) problem: we need a mechanism that allows us to specify
unambiguous associations - and I don't believe that problem is limited
to STC.  The issue is that if we allow identical association tags
(whether they be IDs or strings) in a (concatenated) document, the
associations become ambiguous.  What we need, therefore, is a
mechanism or a convention that ensures the creation of unique tags;
and once that is in place, it is immaterial whether they are IDs or
strings.
Put differently: the validation problem arises from the datatypes,
agreed, but if you solve that by changing the datatypes you have
introduced a more serious problem: ambiguous associations; and I'm
sure unambiguous associations are not only needed in STC.
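
To make the ambiguity concrete, here is a minimal Python sketch.  The
record names (rec-a, rec-b), the tuples standing in for VOResource
records, and the "record.tag" qualifying convention are all
assumptions made for illustration; this is one possible convention,
not a description of any agreed scheme.

```python
from collections import defaultdict

# Hypothetical harvest: (record name, association tag, definition).
# Both records reuse the same human-readable tag.
records = [
    ("rec-a", "UTC-FK5-TOPO", "system defined in record a"),
    ("rec-b", "UTC-FK5-TOPO", "system defined in record b"),
]

def resolve(tag, entries):
    """Return every definition a tag could refer to in the combined document."""
    index = defaultdict(list)
    for entry_tag, definition in entries:
        index[entry_tag].append(definition)
    return index[tag]

# Plain concatenation: the tag now matches two definitions (ambiguous).
plain = [(tag, definition) for _, tag, definition in records]
ambiguous = resolve("UTC-FK5-TOPO", plain)

# Assumed convention: qualify each tag with its record name.
qualified = [(f"{rec}.{tag}", definition) for rec, tag, definition in records]
unique = resolve("rec-a.UTC-FK5-TOPO", qualified)

print(len(ambiguous))  # two candidates for one tag
print(unique)          # exactly one candidate
```

The qualified tags in this sketch are still legal XML names, so the
same convention would work whether the attributes are IDs or strings.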

Hence the proposal that Jonathan and I made, yesterday.

  - Arnold


Paul Harrison wrote:
> I have said some of this in private emails - but I am resummarizing  
> for the list
> 
> 
> On 14.12.2006, at 20:27, Arnold Rots wrote:
> 
> > Let's assume, for the sake of argument, that we are using ID/IDREF
> > pairs, though that is not essential (as I said before, the issue is
> > that the association needs to be unambiguous, not what the particular
> > datatype is).
> 
> The problem *only* arises because the <AstroCoordSystem> id
> attribute and the <AstroCoordArea> coord_system_id attribute are of
> ID and IDREF type. The XML parser requires IDs to be globally unique
> in a document and every IDREF to point to an ID, and that is what
> breaks the XML validity of a harvest document: each VOResource record
> was using "human readable" IDs, e.g. UTC-FK5-TOPO, that are fine if
> each VOResource is a document on its own, but become a problem for a
> harvest document of many such VOResource elements. However, if these
> two attributes were typed as strings then the XML parser would not
> try to enforce the uniqueness and referential constraints - it would
> be up to an external system to ensure the "STC validity" of a
> document.
> 
> Having the id and coord_system_id attributes as ID/IDREF does not
> guarantee the "STC validity" of a document anyway, as all that
> the XML parser checks is that the IDREF points at an ID somewhere
> globally in the document. There is no guarantee that the
> coord_system_id actually points at the id attribute of an
> AstroCoordSystem - it can point to the id of *any* element - so the
> following document is XML valid, but is obviously nonsense STC, as
> all of the IDREFs point at the ObsDataLocation id.
> 
> 
> <?xml version="1.0" encoding="UTF-8"?>
> <p:ObsDataLocation id="idvalue0" idref="idvalue0" ucd=""
>    xmlns:p="http://www.ivoa.net/xml/STC/stc-v1.30.xsd"
>    xmlns:xlink="http://www.w3.org/1999/xlink"
>    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
>    <p:ObservatoryLocation id="idvalue1">
>    <p:AstroCoordSystem id="idvalue3"></p:AstroCoordSystem>
>    <p:AstroCoords coord_system_id="idvalue0"></p:AstroCoords>
>    </p:ObservatoryLocation>
>    <p:ObservationLocation id="idvalue2" idref="idvalue0" >
>    </p:ObservationLocation>
> </p:ObsDataLocation>
> 
> I had earlier argued for using the xs:unique and xs:keyref schema
> constructs, which could potentially define the exact scope of these
> references. That would require some thought, though, as the scope
> always has to be within one of the global elements of STC, which
> might end up restricting the use of STC itself in other schemas. In
> short, this is not a quick solution, but would require careful
> consideration.
> 
> In conclusion, *not* using ID/IDREF (and making the attributes
> xs:string or xs:anyURI) is IMHO the quickest and simplest solution to
> the immediate problem. It allows all current uses of STC to keep
> working, allows the registry harvest document to be valid (with no
> changes) and gives breathing space to come up with a referencing
> scheme that is checked not directly by the XML parser, but by a
> yet-to-be-written STC validator.
> 
> Paul Harrison
> ESO Garching
> www.eso.org
> 
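
As a rough illustration of what an external check of "STC validity"
might look like once the attributes are plain strings, here is a
minimal Python sketch.  It is not an actual STC validator: the
<harvest> and <Resource> wrapper elements, the omission of namespaces,
and the rule that references are scoped to a single record are all
assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Simplified, un-namespaced stand-in for a harvest document (assumed layout).
HARVEST = """
<harvest>
  <Resource>
    <AstroCoordSystem id="UTC-FK5-TOPO"/>
    <AstroCoords coord_system_id="UTC-FK5-TOPO"/>
  </Resource>
  <Resource>
    <ObservatoryLocation id="obs1"/>
    <AstroCoordSystem id="UTC-FK5-TOPO"/>
    <AstroCoords coord_system_id="obs1"/>
  </Resource>
</harvest>
"""

def check_record(record):
    """Return error messages for one record; references are scoped to it."""
    # Only ids carried by AstroCoordSystem elements are legal targets.
    systems = {e.get("id") for e in record.iter("AstroCoordSystem")
               if e.get("id") is not None}
    errors = []
    for elem in record.iter():
        ref = elem.get("coord_system_id")
        if ref is not None and ref not in systems:
            errors.append(
                f"{elem.tag}: coord_system_id '{ref}' "
                "is not an AstroCoordSystem id"
            )
    return errors

root = ET.fromstring(HARVEST)
report = [check_record(r) for r in root.findall("Resource")]
for number, errors in enumerate(report, start=1):
    print(number, errors or "ok")
```

Because each record is checked on its own, the repeated tag
UTC-FK5-TOPO causes no clash here, while the reference to obs1 is
rejected even though an ID/IDREF check alone would accept it.
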
--------------------------------------------------------------------------
Arnold H. Rots                                Chandra X-ray Science Center
Smithsonian Astrophysical Observatory                tel:  +1 617 496 7701
60 Garden Street, MS 67                              fax:  +1 617 495 7356
Cambridge, MA 02138                             arots at head.cfa.harvard.edu
USA                                     http://hea-www.harvard.edu/~arots/
--------------------------------------------------------------------------


