VODataService

Patrick Dowler patrick.dowler at nrc-cnrc.gc.ca
Wed Dec 10 10:42:37 PST 2008


I ran into an odd table/column metadata case when messing with our TAP 
prototype. In the database I have a column with a URI in it. The URI is 
meaningless outside our own software as it is a private identifier, but it 
can be turned into a URL by some custom code (the scheme handler for the 
subsystem that "owns" the URI). 

As a result, and to be useful to users, I want to express that there is a 
column with a URL in it and that users will be able to download data via this 
URL. I can easily (enough) convert the URI to URL while writing the VOTable 
output. The catch is that users cannot sensibly use the column in the WHERE 
clause. For example, if someone did

   SELECT * FROM some_table

they would see in the result a column called "download" (eg) and values like:

   http://www.example.com/foo?bar=123
   ftp://ftp.example.com/something.fits

However, if they then tried something like:

   SELECT * FROM some_table AS t 
   WHERE t.download LIKE 'http://%' 

to only get the http downloads, they would get no results (because the table 
contains URIs that get turned into http URLs on output). It would be quite 
complex to convert that LIKE predicate into something that returned 
the "correct" result (it would require knowledge of all possible URI->URL 
conversions).
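
To make the mismatch concrete, here is a minimal Python sketch of the 
situation. The scheme names ("archive", "ftpstore"), the handler table, and 
the stored values are all made up for illustration; they are not our actual 
URI schemes or code. It only shows that the predicate is evaluated against 
the stored URIs, while the user sees the converted URLs.

```python
# Hypothetical scheme handlers: each turns the scheme-specific part of a
# private URI into a public download URL. Names and URL layouts are assumed.
SCHEME_HANDLERS = {
    "archive": lambda path: f"http://www.example.com/foo?bar={path}",
    "ftpstore": lambda path: f"ftp://ftp.example.com/{path}",
}


def uri_to_url(uri):
    """Convert a private URI to a URL using the handler for its scheme."""
    scheme, _, path = uri.partition(":")
    return SCHEME_HANDLERS[scheme](path)


# What the database column actually contains (private identifiers).
stored = ["archive:123", "ftpstore:something.fits"]

# What the user sees in the "download" column of the VOTable output.
output = [uri_to_url(u) for u in stored]

# Rough equivalent of: WHERE t.download LIKE 'http://%'
# The database applies the predicate to the stored values, not the output.
matches = [u for u in stored if u.startswith("http://")]
```

Here `output` holds the two URLs shown earlier, but `matches` is empty, 
because none of the stored values start with "http://".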

I can see several ways to proceed:

1. Don't do that and actually store URLs in the db: very brittle and doesn't 
allow one to direct downloads (eg like the AccessRef in DAL protocols), most 
likely requires DB modification as people would not likely put the full URL 
in the DB

2. Do not expose the URL column at all: for an observation DB, that would mean 
people cannot actually download the data :-(

3. User needs knowledge of how to convert the URI to URL: this requires either 
that all URI schemes be standardised, so that service providers cannot use 
custom ones (a highly useful and flexible feature that we use extensively), 
or that users have extra out-of-band knowledge to use the service fully

4. Have columns that users can select but not constrain: basically the service 
marks columns in its table metadata as "select only", which covers computed 
columns like above and also columns where searching is not feasible (e.g. 
columns with binary values or region columns in a db without a spatial 
querying implementation)
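
As a rough illustration of option 4, the following Python sketch shows how a 
service might carry a "select only" flag in its column metadata and reject 
queries that constrain such a column. The `ColumnMeta` class, the 
`select_only` attribute, and the naive token check are assumptions made for 
this example; they are not part of VODataService or TAP, and a real service 
would inspect the parsed query rather than scan the text.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class ColumnMeta:
    """Hypothetical per-column metadata with a 'select only' marker."""
    name: str
    select_only: bool = False


# Assumed metadata for the example table; "obs_id" is an invented column.
TABLE_META = {
    "some_table": [
        ColumnMeta("obs_id"),
        ColumnMeta("download", select_only=True),
    ],
}


def constrained_select_only_columns(where_clause, columns):
    """Return the select-only columns that a WHERE clause refers to.

    This is a naive check: it drops quoted string literals, then looks for
    column names among the remaining identifier-like tokens.
    """
    without_literals = re.sub(r"'(?:[^']|'')*'", "", where_clause)
    tokens = {t.lower() for t in re.findall(r"[A-Za-z_][A-Za-z0-9_]*", without_literals)}
    return [c.name for c in columns if c.select_only and c.name.lower() in tokens]


columns = TABLE_META["some_table"]
rejected = constrained_select_only_columns("t.download LIKE 'http://%'", columns)
accepted = constrained_select_only_columns("t.obs_id = 5", columns)
```

With this metadata, the LIKE query above would be reported as constraining 
"download" and could be refused with a clear error, while a constraint on 
another column passes the check. A client reading the same flag could also 
avoid offering such columns in a query builder.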

Obviously, since I posted this to registry (re: VODataService) I think #4 is 
the most viable option.

Thoughts?

-- 

Patrick Dowler
Tel/Tél: (250) 363-0044                  | fax/télécopieur: (250) 363-0045
Canadian Astronomy Data Centre   | Centre canadien de donnees astronomiques
National Research Council Canada | Conseil national de recherches Canada
Government of Canada                  | Gouvernement du Canada
5071 West Saanich Road               | 5071, chemin West Saanich
Victoria, BC                                  | Victoria (C.-B.)


