The IVOA in 2006: Assessment and Future Roadmap - Registries

Ed Shaya eshaya at umd.edu
Fri Jun 9 07:40:21 PDT 2006


Brian's description here may be difficult for the uninitiated to 
understand.  He is saying that if one has data in a relational DB (or 
data that clearly belongs in one), it is easy to incorporate pointers to 
the columns into XML documents and wrap them with full metadata 
descriptions.  It is then possible to create a virtual XML document in 
which all of the data appears to be properly nested, and hence one 
can use just XQuery.  The XQuery on the virtual document is transformed 
into a combination of XQuery on the metadata and SQL for the real DB, and 
the output is user-specified XML.
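
A minimal sketch of that idea in Python, not Brian's actual 
implementation: the metadata lives in a small XML document, the 
orml:value element is only a pointer to a table and column, and a helper 
resolves that pointer with SQL so the result looks properly nested.  The 
in-memory SQLite database, the helper names make_db and resolve_field, 
and the greater_than parameter are assumptions for illustration; the 
element names and sample values are taken from Brian's example.

```python
import sqlite3
import xml.etree.ElementTree as ET

ORML_NS = "urn:ormlmappedvalues"  # namespace URI as in Brian's second query

# Metadata document; the orml:value element is only a pointer to a DB column.
DOC = """<mydoc xmlns:orml="urn:ormlmappedvalues">
  <elementWithRDBMSvalues name="thisField">
    <metaDataNode>meta-data value</metaDataNode>
    <orml:value dbId="myDb" dbTable="myTable" dbCol="myColumn"/>
  </elementWithRDBMSvalues>
</mydoc>"""


def make_db():
    """Stand-in for the real RDBMS (hypothetical table and values)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE myTable (myColumn INTEGER)")
    conn.executemany("INSERT INTO myTable VALUES (?)",
                     [(v,) for v in (10, 32, 43, 44, 1882)])
    return conn


def resolve_field(doc_text, conn, field_name, greater_than=None):
    """Return the named field element with its DB values spliced in."""
    root = ET.fromstring(doc_text)
    # Step 1: the metadata part of the query runs against the XML itself.
    field = next(e for e in root.iter("elementWithRDBMSvalues")
                 if e.get("name") == field_name)
    pointer = field.find(f"{{{ORML_NS}}}value")
    table, column = pointer.get("dbTable"), pointer.get("dbCol")
    # Identifiers cannot be bound as SQL parameters, so check them first.
    if not (table.isidentifier() and column.isidentifier()):
        raise ValueError("unexpected table or column name")
    # Step 2: the value part of the query becomes SQL on the real DB.
    sql = f'SELECT "{column}" FROM "{table}"'
    params = ()
    if greater_than is not None:
        sql += f' WHERE "{column}" > ?'
        params = (greater_than,)
    sql += f' ORDER BY "{column}"'
    values = [row[0] for row in conn.execute(sql, params)]
    # Step 3: splice the values into the node so the result looks nested.
    pointer.text = " ".join(str(v) for v in values)
    return field
```

Calling resolve_field(DOC, make_db(), "thisField") fills the value node 
with every value in the column, while passing greater_than=30.0 plays 
the role of the select-like restriction.  The caller only ever sees XML; 
the SQL stays an internal detail of the resolver.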

He is not saying that one can go the other way.  You simply would not 
want to take ARBITRARY XML data and transform it into PURELY relational 
data, because there can be tremendous inefficiencies in trying to make 
XQuery-type queries using SQL only.  When an arbitrary tree structure 
is transformed into relational structures, it could require long chains 
of tiny tables with associations to other tiny tables, with unbounded 
recursion permitted, and very complex sets of foreign keys.  Also, 
SQL/relational databases would have trouble with queries like "all A 
elements that are the 3rd sibling and 2 layers down from B elements that 
are the 2nd B at that level."
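
To make the contrast concrete, here is a rough Python sketch under 
stated assumptions.  The sample document, the single "node" edge table 
(one row per element, with parent id and sibling position), and the 
function names are all hypothetical.  I read "3rd sibling" as "third 
child of its parent" and "2 layers down" as "grandchild".  On the tree 
the query is a short walk; on the shredded table the same question needs 
two self-joins plus a correlated subquery just to find the 2nd B among 
its siblings.

```python
import sqlite3
import xml.etree.ElementTree as ET

DOC = """<root>
  <B id="b1"><x><A id="a1"/><A id="a2"/><A id="a3"/></x></B>
  <B id="b2"><x><C/><C/><A id="hit"/></x><y><A id="a4"/></y></B>
</root>"""


def tree_query(root):
    """Walk the tree directly: 2nd B at a level, two layers down, 3rd child."""
    hits = []
    for parent in root.iter():
        b_children = [c for c in parent if c.tag == "B"]
        if len(b_children) < 2:
            continue
        second_b = b_children[1]
        for middle in second_b:            # one layer down from B
            children = list(middle)        # two layers down from B
            if len(children) >= 3 and children[2].tag == "A":
                hits.append(children[2].get("id"))
    return hits


def shred(root, conn):
    """Flatten the tree into one edge table (a generic shredding scheme)."""
    conn.execute("CREATE TABLE node (id INTEGER PRIMARY KEY, parent INTEGER,"
                 " tag TEXT, pos INTEGER, label TEXT)")
    next_id = 0
    stack = [(root, None, 1)]
    while stack:
        elem, parent_id, pos = stack.pop()
        next_id += 1
        conn.execute("INSERT INTO node VALUES (?, ?, ?, ?, ?)",
                     (next_id, parent_id, elem.tag, pos, elem.get("id")))
        for i, child in enumerate(elem, start=1):
            stack.append((child, next_id, i))


SQL = """
SELECT a.label
FROM node a
JOIN node middle ON a.parent = middle.id
JOIN node b ON middle.parent = b.id
WHERE a.tag = 'A' AND a.pos = 3 AND b.tag = 'B'
  AND (SELECT COUNT(*) FROM node s
       WHERE s.parent = b.parent AND s.tag = 'B' AND s.pos <= b.pos) = 2
ORDER BY a.id
"""


def sql_query(conn):
    """Same question asked of the shredded edge table."""
    return [row[0] for row in conn.execute(SQL)]
```

Both functions select the same element on this small document, but the 
SQL form grows by one more self-join for every additional layer in the 
path, which is the inefficiency described above.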

To answer Roy's original question: a tree structure does not map into a 
row-column structure, period.  Relational DBs permit associations between 
tables that allow for a limited tree structure.  But if the XML is very 
tree-like, it is insane to transform it into a relational structure.  In 
practice, though, many XML documents are not terribly tree-like (often 
because they are generated from a set of tables), so it depends on the 
situation.

Ed


Brian Thomas wrote:

>On Thursday 08 June 2006 14:30, Tony Linde wrote:
>  
>
>>And the prime difficulty is not mapping from the xml schema to the
>>relational schema but in mapping an XQuery statement to some underlying
>>database language: SQL would not be enough, you'd need the stored procedure
>>language as well and even that may not provide everything XQuery does, even
>>if you could code the mapping.
>>    
>>
>
>	The mapping is not that bad, and it's simple actually... it basically means 
>	mapping column values into a special node (say 'value') in the XML document
>	which may be treated specially and is logically thought of as holding lists of
>	values. 
>
>	To explain with an example, consider that you have some meta-data stored in the 
>	XML document itself, and some (meta-) data stored in the RDBMS, as needed. 
>	The XML looks like:
>
>	<mydoc>
>	    <elementWithRDBMSvalues name="thisField">
>                       <metaDataNode>meta-data value</metaDataNode>
>	         <orml:value dbId="myDb" dbTable="myTable" dbCol="myColumn"/>
>	    </elementWithRDBMSvalues>
>	    ....
>	</mydoc>
>
>	Now, the XQuery to pull back (all of ) the data in that node :
>
>declare namespace a="urn:mynamespace";
>
><result> {
>    for $field in /mydoc//elementWithRDBMSvalues
>     return <field> { $field } </field>
>} </result>
>
>	returns all the possible values in the RDBMS, namely "10 32 43 44 1882" and 
>	would look like:
>	
>	<field name="thisField">
>                     <metaDataNode>meta-data value</metaDataNode>
>	       <a:value>10 32 43 44 1882</a:value>
>	</field>
>	
>	By adding a "select" function to XQuery, one can get select-like functionality
>	on the XQuery nodes which hold the relational values, e.g.
>
>
>declare namespace a="urn:mynamespace";
>declare namespace orml="urn:ormlmappedvalues";
><result> {
>for $field in //a:field
>  where $field[@name="thisField"]
>  return select { $field where //a:field/orml:value > 30.0 }
>} </result>
>
>	Then returns:
>	
>	<field name="thisField">
>                     <metaDataNode>meta-data value</metaDataNode>
>	       <a:value>32 43 44 1882</a:value>
>	</field>
>
>	
>	Having a native XQuery is nice, but doesn't do the full job: you either have to store
>	your data in terms of the object model of the VO, something which most of us would
>	rather avoid (re-mapping your DB to VO model, yuk) or you have to remap the XQuery
>	in terms of your local schema (double yuk). The better solution is to lay an object
>	layer on top of the RDBMS which implements a mapping from a shared, community object
>	model on top of the local relational schema.
>
>	=brian
>
>  
>


