Is XPATH the way to search a data model?

Ed Shaya Edward.J.Shaya.1 at gsfc.nasa.gov
Mon May 17 14:43:44 PDT 2004



David Berry wrote:

>Obviously, the ability to search our data models will be very important,
>but should we just assume that XPATH is the best way to do it? My question
>is, should we be optimising our data models specifically so that they can
>be searched using XPATH? This seems to be the general assumption, but I
>have two questions with this:
>
>1) Does not the fact that XPATH is a specifically XML thing not mean that
>it is more to do with data >formats< than data >models<? Fine if you
>serialise your data as canonical XML but what if you use (e.g.)
>stand-alone FITS files, or in some non-canonical XML format for which your
>XPATH expressions are not valid?
>  
>
To make FITS files searchable, one can extract the important header
keywords into an XML file that carries a URL pointing back to the
parent FITS file.
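
As a rough illustration of that idea, here is a minimal Python sketch
using only the standard library.  The element names (index, fitsRecord,
keyword), the example URL and the header values are all made up for the
example; a real index would read the headers from the FITS file itself
and follow whatever schema the data center has agreed on.

```python
import xml.etree.ElementTree as ET


def fits_headers_to_xml(fits_url, headers, keep):
    """Build one XML record holding selected FITS header keywords.

    The record carries an href attribute pointing at the parent FITS file.
    Element and attribute names are hypothetical, not a standard schema.
    """
    record = ET.Element("fitsRecord", {"href": fits_url})
    for key in keep:
        if key in headers:
            card = ET.SubElement(record, "keyword", {"name": key})
            card.text = str(headers[key])
    return record


def find_urls(index, name, value):
    """Return the URLs of all records whose keyword `name` equals `value`."""
    urls = []
    for rec in index.findall("./fitsRecord"):
        # ElementTree supports a subset of XPath, including [@attr='value'].
        kw = rec.find(f"keyword[@name='{name}']")
        if kw is not None and kw.text == value:
            urls.append(rec.get("href"))
    return urls


# Made-up header values standing in for a real FITS primary header.
headers = {"CTYPE1": "RA---TAN", "CTYPE2": "DEC--TAN", "RADESYS": "ICRS", "NAXIS": 2}

index = ET.Element("index")
index.append(
    fits_headers_to_xml(
        "http://example.org/data/image001.fits",
        headers,
        ["CTYPE1", "CTYPE2", "RADESYS"],
    )
)

print(find_urls(index, "RADESYS", "ICRS"))
```

The search only ever touches the small XML index; the FITS file is
fetched through the stored URL once a record matches.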

>2) Can it have astronomical knowledge built into it, or is it just a
>sort of dumb regexp system for structured text? What I mean is, if for
>instance you searched a StandardQuantity for a Frame (Frame "A") holding
>the 3 axes:
>
>(heliocentric radio velocity, ICRS RA, ICRS Dec)
>
>could XPATH do anything sensible if the StdQ did not contain this exact
>Frame, but instead contained a Frame (Frame "B") containing the 3 axes:
>
>(Galactic longitude, geocentric frequency, galactic latitude)
>
>?? Obviously Frame A and Frame B are not identical, but given a
>position in Frame B it is possible to convert it into Frame A without
>needing any extra information over and above that stored in the
>Frames. What we want is a search system which has enough astronomical
>knowledge to be able to do this. What you want from your search system is
>for it to say "no I cannot find Frame A but Frame B looks very similar and
>I can give you a Mapping which will convert positions in Frame B into the
>corresponding positions in Frame A". Such recoverable mis-matches between
>what the client wants and what the server can provide are bound to happen
>over and over again in the VO. So can XPATH be used to do this sort of
>searching? If not, then should we not drop XPATH in favour of building
>customised intelligent searching into our code which searches the data
>model itself rather than just some specific data format?
>
>  
>
The registry provides the queryable terms.  A translation service is
needed between query generation and transmission to each data center,
mapping the query onto that data center's registered terms.  XPATH, as
far as I am aware, is only used at the last stage, to extract info
from the internal meta-database.  So yes, we need a general model that
allows such translation services to be built.  But eventually you need
a mechanism for getting info out of metadata stored or transmitted as
XML, and that is always XPATH based (i.e. XSL, XQUERY, XQL, etc. are
based on XPATH).
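
To make the split between the two stages concrete, here is a small
Python sketch.  Everything in it is an assumption for illustration: the
data center names, the registry terms "ra" and "dec", the XML layouts
and the lookup table are invented.  It only shows a translation table
choosing a per-center XPATH expression, with XPATH itself used just for
the final extraction.

```python
import xml.etree.ElementTree as ET

# Hypothetical translation table: registry term -> XPath expression,
# one set per data center, since each center lays out its XML differently.
TERM_TO_XPATH = {
    "centerA": {"ra": "./coords/ra", "dec": "./coords/dec"},
    "centerB": {
        "ra": "./position[@axis='RA']",
        "dec": "./position[@axis='Dec']",
    },
}


def extract(center, term, xml_text):
    """Translate a registry term for one center, then pull the value out.

    Returns the element text, or None if the path matches nothing.
    """
    path = TERM_TO_XPATH[center][term]      # translation stage
    node = ET.fromstring(xml_text).find(path)  # XPath extraction stage
    return None if node is None else node.text


# Two invented metadata documents describing the same position.
doc_a = "<obs><coords><ra>10.5</ra><dec>-3.2</dec></coords></obs>"
doc_b = (
    "<obs><position axis='RA'>10.5</position>"
    "<position axis='Dec'>-3.2</position></obs>"
)

print(extract("centerA", "ra", doc_a))
print(extract("centerB", "ra", doc_b))
```

Any astronomical knowledge, such as recognising that one frame can be
converted into another, would live in the translation stage; the XPATH
expressions stay simple lookups.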

>My questions may have revealed that I have little experience with
>XPATH, but more experience with creating customised intelligent searching
>code such as that outlined above. Hence the questions...
>
>David
>
>  
>


