The IVOA in 2006: Assessment and Future Roadmap - Registries
Ed Shaya
eshaya at umd.edu
Fri Jun 9 07:40:21 PDT 2006
Brian's description here may be difficult for the uninitiated to
understand. He is saying that if one has data in a relational DB (or
data that clearly belongs in one), it is easy to incorporate pointers to
the columns into XML documents and wrap them with full metadata
descriptions. It is then possible to create a virtual XML document in
which it appears that all of the data is properly nested, and hence one
can use just XQuery. The XQuery on the virtual document is transformed
into a combination of XQuery on the metadata and SQL for the real DB, and
the output is user-specified XML.
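
A rough sketch of that idea in Python, to make it concrete. This is not
Brian's implementation. It assumes an in-memory SQLite database stands in
for the real DB, reuses the orml:value pointer element and the table and
column names from his example quoted below, and uses a hypothetical
helper, materialize(), to do the "SQL for the real DB" half. The dbId
attribute is ignored here.

```python
import sqlite3
import xml.etree.ElementTree as ET

ORML = "urn:ormlmappedvalues"  # namespace URI taken from the quoted message

# Metadata lives in the XML; the values live in the database.
DOC = """
<mydoc xmlns:orml="urn:ormlmappedvalues">
  <elementWithRDBMSvalues name="thisField">
    <metaDataNode>meta-data value</metaDataNode>
    <orml:value dbId="myDb" dbTable="myTable" dbCol="myColumn"/>
  </elementWithRDBMSvalues>
</mydoc>
"""


def materialize(doc_text, conn, min_value=None):
    """Hypothetical helper: fill every orml:value node from the database."""
    root = ET.fromstring(doc_text)
    for node in root.iter(f"{{{ORML}}}value"):
        table, col = node.get("dbTable"), node.get("dbCol")
        # Identifiers cannot be bound as SQL parameters, so check them first.
        if not (table.isidentifier() and col.isidentifier()):
            raise ValueError("unexpected table or column name")
        sql = f'SELECT "{col}" FROM "{table}"'
        params = ()
        if min_value is not None:
            # The value constraint is pushed down to SQL.
            sql += f' WHERE "{col}" > ?'
            params = (min_value,)
        sql += f' ORDER BY "{col}"'
        rows = conn.execute(sql, params).fetchall()
        node.text = " ".join(str(row[0]) for row in rows)
    return root


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE myTable (myColumn INTEGER)")
    conn.executemany(
        "INSERT INTO myTable VALUES (?)",
        [(v,) for v in (10, 32, 43, 44, 1882)],
    )
    path = f".//{{{ORML}}}value"
    full = materialize(DOC, conn).find(path).text
    filtered = materialize(DOC, conn, min_value=30).find(path).text
    return full, filtered
```

The document that comes back looks fully nested to the caller, even
though the values never lived in the XML.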
He is not saying that one can go the other way. You simply would not
want to take ARBITRARY XML data and transform it into PURELY relational
data because there can be tremendous inefficiencies in trying to make
XQuery-type queries by using SQL only. When an arbitrary tree structure
is transformed to relational structures, it could require large chains
of tiny tables with associations to other tiny tables, with infinite
recursion permitted and very complex sets of foreign keys. Also,
SQL/Relational databases would have trouble with queries like "all A
elements that are the 3rd sibling and 2 layers down from B elements that
are 2nd B at that level."
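
To illustrate the last point, here is a small Python sketch, under
several assumptions of mine: the tree is shredded into one generic edge
table (the node table and its columns are made up for this example),
SQLite stands in for the relational DB, and the query is read as "A
elements that are the 3rd child of their parent, two levels below a B
that is the 2nd B among its siblings". On the tree this is a short walk.
On the shredded table it needs two self-joins plus a correlated subquery,
and that is with the depth fixed at two.

```python
import itertools
import sqlite3
import xml.etree.ElementTree as ET

DOC = """
<root>
  <B><x><C/><C/><A name="wrong-B"/></x></B>
  <B>
    <x><C/><C/><A name="hit"/></x>
    <y><A name="first-child"/><C/><C/></y>
  </B>
</root>
"""


def query_tree(root):
    """Walk the tree directly: position and depth are implicit in it."""
    hits = []
    for parent in root.iter():
        bs = [child for child in parent if child.tag == "B"]
        if len(bs) < 2:
            continue
        for mid in bs[1]:            # one level below the 2nd B
            kids = list(mid)         # two levels below the 2nd B
            if len(kids) >= 3 and kids[2].tag == "A":
                hits.append(kids[2].get("name"))
    return hits


def shred(root, conn):
    """Store every element as a row: id, parent id, sibling position, tag."""
    conn.execute(
        "CREATE TABLE node (id INTEGER PRIMARY KEY, parent INTEGER, "
        "pos INTEGER, tag TEXT, name TEXT)"
    )
    counter = itertools.count(1)

    def walk(elem, parent_id, pos):
        node_id = next(counter)
        conn.execute(
            "INSERT INTO node VALUES (?, ?, ?, ?, ?)",
            (node_id, parent_id, pos, elem.tag, elem.get("name")),
        )
        for i, child in enumerate(elem, start=1):
            walk(child, node_id, i)

    walk(root, None, 1)


# Same question against the edge table: one self-join per level, and a
# correlated subquery to work out which B is "the 2nd B at that level".
SQL = """
SELECT a.name
FROM node AS a
JOIN node AS mid ON a.parent = mid.id
JOIN node AS b ON mid.parent = b.id
WHERE a.tag = 'A' AND a.pos = 3
  AND b.tag = 'B'
  AND (SELECT COUNT(*) FROM node AS s
       WHERE s.parent = b.parent AND s.tag = 'B' AND s.pos <= b.pos) = 2
"""


def demo():
    root = ET.fromstring(DOC)
    conn = sqlite3.connect(":memory:")
    shred(root, conn)
    sql_hits = [row[0] for row in conn.execute(SQL)]
    return query_tree(root), sql_hits
```

Both routes find the same element here, so the point is not that SQL
cannot do it. Every extra level adds another self-join, and "any number
of levels down" would need a recursive query.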
To answer Roy's original question: a tree structure does not map into a
row-column structure, period. Relational DBs permit associations between
tables that allow for a limited tree structure. But if the XML is very
tree-like, it is insane to transform it into a relational structure. In
practice though, many XML documents are not terribly tree-like (often
because they are generated from a set of tables), so it depends on the
situation.
Ed
Brian Thomas wrote:
>On Thursday 08 June 2006 14:30, Tony Linde wrote:
>
>
>>And the prime difficulty is not mapping from the xml schema to the
>>relational schema but in mapping an XQuery statement to some underlying
>>database language: SQL would not be enough, you'd need the stored procedure
>>language as well and even that may not provide everything XQuery does, even
>>if you could code the mapping.
>>
>>
>
> The mapping is not that bad, and it's simple actually... it basically means
> mapping column values into a special node (say 'value') in the XML document
> which may be treated specially and is logically thought of as holding lists of
> values.
>
> To explain with an example, consider that you have some meta-data stored in the
> XML document itself, and some (meta-) data stored in the RDBMS, as needed.
> The XML looks like:
>
> <mydoc>
> <elementWithRDBMSvalues name="thisField">
> <metaDataNode>meta-data value</metaDataNode>
> <orml:value dbId="myDb" dbTable="myTable" dbCol="myColumn"/>
> </elementWithRDBMSvalues>
> ....
> </mydoc>
>
> Now, the XQuery to pull back (all of) the data in that node:
>
>declare namespace a="urn:mynamespace";
>
><result> {
> for $field in /a:document//elementWithRDBMSvalues
> return <field> { $field } </field>
>} </result>
>
> returns all the possible values in the RDBMS, namely "10 32 43 44 1882" and
> would look like:
>
> <field name="thisField">
> <metaDataNode>meta-data value</metaDataNode>
> <a:value>10 32 43 44 1882</a:value>
> </field>
>
> By adding a "select" function to XQuery, one can get select-like functionality
> on the XQuery nodes which hold the relational values, e.g.
>
>
>declare namespace a="urn:mynamespace";
>declare namespace orml="urn:ormlmappedvalues";
><result> {
>for $field in //a:field
> where $field[@name="thisField"]
> return select { $field where //a:field/orml:value > 30.0 }
>} </result>
>
> Then returns:
>
> <field name="thisField">
> <metaDataNode>meta-data value</metaDataNode>
> <a:value>32 43 44 1882</a:value>
> </field>
>
>
> Having a native XQuery is nice, but doesn't do the full job: you either have to store
> your data in terms of the object model of the VO, something which most of us would
> rather avoid (re-mapping your DB to VO model, yuk) or you have to remap the XQuery
> in terms of your local schema (double yuk). The better solution is to lay an object
> layer on top of the RDBMS which implements a mapping from a shared, community object
> model on top of the local relational schema.
>
> =brian
>
>
>