The IVOA in 2006: Assessment and Future Roadmap - Registries

Ray Plante rplante at ncsa.uiuc.edu
Thu Jun 8 10:09:49 PDT 2006


On Thu, 8 Jun 2006, Roy Williams wrote:
> > The registry data model (for better or worse) has always been  
> > defined in terms of the XML Schema language, and there is a very  
> > natural candidate for a query language for XML, namely XQuery
> 
> I wonder about relational versus XML schema. Are they really  
> different -- and therefore the query languages should be different?  
> Or are they just different representations of the same thing?

They can be made to be the same thing under certain conditions.  

What we found in the development of RI is that the RDBMS-oriented language
(ADQL) as it is defined is functionally different from XQuery.  In
particular, there are queries that we can do in XQuery that we cannot do
with ADQL *within the current requirements of RI*.

The difference comes down to the data model managed by the underlying 
database.  With the XML database, the model is defined by the XML schemas, 
which are known to the user.  With relational databases, it is possible to 
create a table model that maps, under a set of rules, exactly to the XML 
schema.  If we mandated the internal table model and exposed it to 
clients, then one could form identical queries.  

However, RI does not mandate how the XML should be mapped into the 
internal RDBMS.  As anyone who maintains databases for real knows, a 
strictly normalized model is not always the most efficient or the most 
maintainable.  One usually tunes the "ideal" model to fit the needs of 
the users and the constraints of the administrators.  Performance is 
actually the primary reason we use an RDBMS in the STScI registry.  

To get around having to impose a very complex table model on RDBMSs, we 
have defined an ADQL-based query interface that assumes a flat 
(i.e. single-table) view of the data being queried.  This is the easiest 
model to explain to users (who need to know the model to form their own 
complex queries).  The shortcoming is that relational information, such 
as which values belong to the same sub-element, is lost as a result.  
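
To make the loss concrete, here is a minimal sketch in Python.  The 
record and its element names are made up for illustration (they are not 
the actual VOResource schema), and flatten() is only one assumed way of 
collapsing a tree into a single row; RI does not prescribe it.  The 
point is that a constraint tying two values to the same sub-element can 
be tested against the tree but not against the flat view:

```python
import xml.etree.ElementTree as ET

# Hypothetical record; element names are illustrative only and are not
# taken from the real registry schema.
RECORD = """
<resource>
  <title>Example Archive</title>
  <capability>
    <type>ConeSearch</type>
    <accessURL>http://example.org/cone</accessURL>
  </capability>
  <capability>
    <type>SIA</type>
    <accessURL>http://example.org/sia</accessURL>
  </capability>
</resource>
"""


def flatten(elem, prefix=""):
    """Collapse a tree into one 'row': path -> set of values seen at that path."""
    row = {}
    for child in elem:
        path = f"{prefix}{child.tag}"
        if len(child):  # element has sub-elements, so recurse into it
            for key, values in flatten(child, path + "/").items():
                row.setdefault(key, set()).update(values)
        else:
            row.setdefault(path, set()).add((child.text or "").strip())
    return row


def tree_match(root, ctype, url):
    """Structured query: both constraints must hold within the SAME capability."""
    return any(
        cap.findtext("type") == ctype and cap.findtext("accessURL") == url
        for cap in root.findall("capability")
    )


def flat_match(row, ctype, url):
    """Flat query: each constraint is tested against its own column."""
    return ctype in row["capability/type"] and url in row["capability/accessURL"]


root = ET.fromstring(RECORD)
row = flatten(root)

# Mismatched pair: the SIA capability does not carry the cone URL.
print(tree_match(root, "SIA", "http://example.org/cone"))  # False
print(flat_match(row, "SIA", "http://example.org/cone"))   # True (false positive)
```

The flat view still knows that the record has an SIA capability and 
that it has the cone URL, but no longer knows that they belong to 
different capabilities.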

> I know there is formal mathematics about relational databases (Codd  
> and Date). Is there substantial formal theory on translating XML to  
> and from RDBMS?

Yes.  However, the results are not always optimal.

> In other words, can I translate from XML to relational schema in an  
> automatic way?
> Can I translate back (automatically) and get the same as the XML  
> schema I started with?

You can certainly do this with a proper underlying DB model.  You 
cannot do this with the model view that is assumed for querying purposes.  
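
As a rough illustration of the first case, the sketch below shreds a 
small XML record into a single keyed table and rebuilds it.  It works on 
instance data rather than on the schema itself, and the "node" table 
with parent and position columns is an assumed generic mapping, not 
something RI or any particular registry uses.  It also ignores 
attributes, namespaces and mixed content.  What matters is that parent 
and ordering information is kept, which the flat query view drops:

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical record; element names are illustrative only.
RECORD = (
    "<resource><title>Example Archive</title>"
    "<capability><type>ConeSearch</type>"
    "<accessURL>http://example.org/cone</accessURL></capability>"
    "<capability><type>SIA</type>"
    "<accessURL>http://example.org/sia</accessURL></capability>"
    "</resource>"
)


def shred(conn, elem, parent_id=None, pos=0):
    """Store one element as a row that records its parent and sibling order."""
    cur = conn.execute(
        "INSERT INTO node (parent_id, pos, tag, text) VALUES (?, ?, ?, ?)",
        (parent_id, pos, elem.tag, elem.text),
    )
    node_id = cur.lastrowid
    for i, child in enumerate(elem):
        shred(conn, child, node_id, i)
    return node_id


def rebuild(conn, node_id):
    """Recreate the element tree by following parent keys in sibling order."""
    tag, text = conn.execute(
        "SELECT tag, text FROM node WHERE id = ?", (node_id,)
    ).fetchone()
    elem = ET.Element(tag)
    elem.text = text
    children = conn.execute(
        "SELECT id FROM node WHERE parent_id = ? ORDER BY pos", (node_id,)
    ).fetchall()
    for (child_id,) in children:
        elem.append(rebuild(conn, child_id))
    return elem


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE node ("
    "id INTEGER PRIMARY KEY, parent_id INTEGER, pos INTEGER, tag TEXT, text TEXT)"
)
root_id = shred(conn, ET.fromstring(RECORD))

original = ET.tostring(ET.fromstring(RECORD), encoding="unicode")
restored = ET.tostring(rebuild(conn, root_id), encoding="unicode")
print(restored == original)
```

A table like this is faithful but awkward to query by hand, which is 
the trade-off described above.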

cheers,
Ray



