The IVOA in 2006: Assessment and Future Roadmap - Registries

Gretchen Greene greene at stsci.edu
Thu Jun 8 10:52:42 PDT 2006


To expand a little in the direction of XML mapping to relational DBs:

For the RDBMS at STScI, we will be upgrading the registry backend with
the schema changes.  There is advanced XML support, including XQuery.
As long as the XML structures are well-formed, the columns can be
mapped to XML types (untyped XML also works, but that's not as good
for query performance).  Just as RDBMSs adopted OO types in a hybrid
fashion years ago, XML trends are now influencing the implementations.

While the XML schemas can be standardized, I'm not sure how much
overlap there will be in the table mappings for different DB
implementations.  We are planning to discuss the capabilities with
ESAC ...

So some of the issues with mapping and flattening will depart from
classical thinking and experience.  It's not simply a choice among XML
blobs, hierarchical storage, and flat tables; the features will be
combined, and the advantage, of course, is optimal query performance
and load times.

-Gretchen





-----Original Message-----
From: owner-registry at eso.org [mailto:owner-registry at eso.org] On Behalf
Of Ray Plante
Sent: Thursday, June 08, 2006 1:10 PM
To: registry at ivoa.net
Subject: Re: The IVOA in 2006: Assessment and Future Roadmap -
Registries


On Thu, 8 Jun 2006, Roy Williams wrote:
> > The registry data model (for better or worse) has always been
> > defined in terms of the XML Schema language, and there is a very  
> > natural candidate for a query language for XML, namely XQuery
> 
> I wonder about relational versus XML schema. Are they really
> different -- and therefore the query languages should be different?  
> Or are they just different representations of the same thing?

They can be made to be the same thing under certain conditions.  

What we found in the development of RI is that the RDBMS-oriented
language (ADQL) as it is defined is functionally different from XQuery.
In particular, there are queries that we can do in XQuery that we
cannot do with ADQL *within the current requirements of RI*.

The difference comes down to the data model managed by the underlying
database.  With the XML database, the model is defined by the XML
schemas, which is known to the user.  With relational databases, it is
possible to create a table model that maps, under a set of rules,
exactly to the XML schema.  If we mandated the internal table model and
exposed it to clients, then one could form identical queries.

However, RI does not mandate how the XML should be mapped into the
internal RDBMS.  As anyone who maintains databases for real knows, a
strictly normalized model is not always the most efficient or the most
maintainable.  One usually tweaks/tunes the "ideal" model to optimize
for the needs of the users and the constraints of the administrators.
Performance is actually the primary reason we use an RDBMS in the STScI
registry.

To get around having to impose a very complex table model on RDBMSs, we
have defined an ADQL-based query interface that assumes a flat-like
(i.e. single table) view of the data being queried.  This is the
easiest model to explain to users (who need to know the model to form
their own complex queries).  The shortcoming is that relational
information is lost as a result.
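
To illustrate the kind of relational information a single-table view
can lose, here is a minimal Python sketch.  The record, its element
names, and the helper functions are all hypothetical; they are not
taken from the registry schemas or from RI.  The flat view tests each
column on its own, so it cannot tell whether a name and an email belong
to the same contact.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified resource record (illustrative element names).
RECORD = """
<resource>
  <title>Example Archive</title>
  <contact><name>A. Smith</name><email>smith@example.org</email></contact>
  <contact><name>B. Jones</name><email>jones@example.org</email></contact>
</resource>
"""


def flatten(elem, prefix=""):
    """Collapse a record into one row: one column per leaf path,
    with repeated values gathered into a list."""
    row = {}
    path = f"{prefix}/{elem.tag}" if prefix else elem.tag
    if len(elem) == 0:  # leaf element: record its text under its path
        row.setdefault(path, []).append((elem.text or "").strip())
    for child in elem:
        for col, values in flatten(child, path).items():
            row.setdefault(col, []).extend(values)
    return row


def match_hierarchical(root, name, email):
    """True only if a single <contact> carries both the name and the email."""
    return any(
        c.findtext("name") == name and c.findtext("email") == email
        for c in root.findall("contact")
    )


def match_flat(row, name, email):
    """Each column is tested independently; the pairing is not checked."""
    return (name in row.get("resource/contact/name", [])
            and email in row.get("resource/contact/email", []))


root = ET.fromstring(RECORD)
row = flatten(root)

# No contact is named "A. Smith" with the address jones@example.org ...
print(match_hierarchical(root, "A. Smith", "jones@example.org"))  # False
# ... but the flat view reports a match, because the pairing was lost.
print(match_flat(row, "A. Smith", "jones@example.org"))           # True
```

The hierarchical test stands in for what a path-based query can express;
the flat test stands in for independent constraints on two columns of a
single-table view.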

> I know there is formal mathematics about relational databases (Codd
> and Date). Is there substantial formal theory on translating XML to  
> and from RDBMS?

Yes.  However, the results are not always optimal.

> In other words, can I translate from XML to relational schema in an
> automatic way?
> Can I translate back (automatically) and get the same as the XML  
> schema I started with?

You can certainly do this with a proper underlying DB model.  You
cannot do this with the model view that is assumed for querying
purposes.
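
As a rough sketch of such a round trip, the Python example below stores
a small XML record in a generic parent/child table in SQLite and
rebuilds it.  This is only one possible mapping, chosen for brevity; it
is not the table model used by any registry, the record and the table
and function names are hypothetical, and attributes, namespaces, and
mixed content are ignored.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical record, written without whitespace between elements so the
# rebuilt string can be compared with the original directly.
RECORD = (
    "<resource><title>Example Archive</title>"
    "<contact><name>A. Smith</name><email>smith@example.org</email></contact>"
    "<contact><name>B. Jones</name><email>jones@example.org</email></contact>"
    "</resource>"
)


def shred(conn, elem, parent_id=None, position=0):
    """Store one element as a row, then recurse into its children."""
    cur = conn.execute(
        "INSERT INTO node (parent_id, position, tag, text) VALUES (?, ?, ?, ?)",
        (parent_id, position, elem.tag, elem.text),
    )
    node_id = cur.lastrowid
    for i, child in enumerate(elem):
        shred(conn, child, node_id, i)
    return node_id


def rebuild(conn, node_id):
    """Recreate the element for node_id, with children in stored order."""
    tag, text = conn.execute(
        "SELECT tag, text FROM node WHERE id = ?", (node_id,)
    ).fetchone()
    elem = ET.Element(tag)
    elem.text = text
    children = conn.execute(
        "SELECT id FROM node WHERE parent_id = ? ORDER BY position", (node_id,)
    ).fetchall()
    for (child_id,) in children:
        elem.append(rebuild(conn, child_id))
    return elem


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE node (id INTEGER PRIMARY KEY, parent_id INTEGER,"
    " position INTEGER, tag TEXT, text TEXT)"
)
root_id = shred(conn, ET.fromstring(RECORD))
restored = ET.tostring(rebuild(conn, root_id), encoding="unicode")
print(restored == RECORD)  # True: parent links and ordering were preserved
```

The round trip works here because the table keeps the parent link and
the sibling position of every element, which is exactly the information
a single-table query view discards.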

cheers,
Ray




