VODataService
Patrick Dowler
patrick.dowler at nrc-cnrc.gc.ca
Wed Dec 10 10:42:37 PST 2008
I ran into an odd table/column metadata case when messing with our TAP
prototype. In the database I have a column with a URI in it. The URI is
meaningless outside our own software as it is a private identifier, but it
can be turned into a URL by some custom code (the scheme handler for the
subsystem that "owns" the URI).
As a result, and to be useful to users, I want to express that there is a
column with a URL in it and that users will be able to download data via this
URL. I can easily (enough) convert the URI to URL while writing the VOTable
output. The catch is that users cannot sensibly use the column in the WHERE
clause. For example, if someone did
SELECT * FROM some_table
they would see in the result a column called "download" (eg) and values like:
http://www.example.com/foo?bar=123
ftp://ftp.example.com/something.fits
However, if they then tried to do something like:
SELECT * FROM some_table AS t
WHERE t.download LIKE 'http://%'
to only get the http downloads, they would get no results (because the table
contains URIs that get turned into http URLs on output). It would be quite
complex to convert that LIKE predicate into something that returned
the "correct" result (it would require knowledge of all possible URI->URL
conversions).
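A minimal sketch of the mismatch, using an in-memory SQLite table in place of
the real database. The "priv-http" and "priv-ftp" schemes, the SCHEME_HANDLERS
mapping and the uri_to_url function are all made up for illustration; they
stand in for whatever custom scheme handler a service actually uses.

```python
import sqlite3

# Hypothetical private URI schemes mapped to public URL prefixes.
SCHEME_HANDLERS = {
    "priv-http": "http://www.example.com/foo?bar=",
    "priv-ftp": "ftp://ftp.example.com/",
}


def uri_to_url(uri):
    """Turn a private URI into a public URL (illustrative handler only)."""
    scheme, _, rest = uri.partition(":")
    return SCHEME_HANDLERS[scheme] + rest


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE some_table (id INTEGER, download TEXT)")
conn.executemany(
    "INSERT INTO some_table VALUES (?, ?)",
    [(1, "priv-http:123"), (2, "priv-ftp:something.fits")],
)

# Selecting works: each stored URI is converted while writing the output.
selected = [
    uri_to_url(row[0])
    for row in conn.execute("SELECT download FROM some_table ORDER BY id")
]

# Constraining does not: the predicate is evaluated against the stored URIs,
# none of which start with 'http://'.
filtered = conn.execute(
    "SELECT download FROM some_table WHERE download LIKE 'http://%'"
).fetchall()

print(selected)
print(filtered)
```

The user sees http and ftp URLs in the output, yet the LIKE predicate matches
no rows because the database only ever holds the private URIs.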
I can see several ways to proceed:
1. Don't do that and actually store URLs in the db: very brittle and doesn't
allow one to direct downloads (eg like the AccessRef in DAL protocols), most
likely requires DB modification as people would not likely put the full URL
in the DB
2. Do not expose the URL column at all: for an observation DB, that would mean
people cannot actually download the data :-(
3. User needs knowledge of how to convert the URI to URL: this requires either
that all URI schemes be standardised, so service providers cannot use custom
ones (a highly useful and flexible feature that we use extensively), or that
users have extra out-of-band knowledge to use the service fully
4. Have columns that users can select but not constrain: basically the service
marks columns in its table metadata as "select only", which covers computed
columns like above and also columns where searching is not feasible (e.g.
columns with binary values or region columns in a db without a spatial
querying implementation)
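
As a rough sketch of how option 4 might look on the service side: the
select_only flag, the ColumnMeta class and the check function below are
hypothetical names, not part of VODataService or TAP. The sketch also assumes
the query parser has already extracted the column names used in the WHERE
clause.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ColumnMeta:
    name: str
    datatype: str
    select_only: bool = False  # hypothetical "select only" marker


# Illustrative table metadata for some_table.
TABLE_COLUMNS = {
    "id": ColumnMeta("id", "int"),
    "download": ColumnMeta("download", "char", select_only=True),
}


def select_only_violations(where_columns):
    """Return the select-only columns that a WHERE clause tries to constrain."""
    return [
        name for name in where_columns if TABLE_COLUMNS[name].select_only
    ]


# A query constraining t.download would be rejected up front with a clear
# error instead of silently returning no rows.
print(select_only_violations(["download"]))
print(select_only_violations(["id"]))
```

With the flag published in the table metadata, a client could also see in
advance which columns may appear only in the select list.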
Obviously, since I posted this to registry (re: VODataService) I think #4 is
the most viable option.
Thoughts?
--
Patrick Dowler
Tel/Tél: (250) 363-0044 | fax/télécopieur: (250) 363-0045
Canadian Astronomy Data Centre | Centre canadien de donnees astronomiques
National Research Council Canada | Conseil national de recherches Canada
Government of Canada | Gouvernement du Canada
5071 West Saanich Road | 5071, chemin West Saanich
Victoria, BC | Victoria (C.-B.)