VODataService
glemson
gerard.lemson at mpe.mpg.de
Wed Dec 10 22:33:53 PST 2008
Hi Pat
It seems to me there might be another possibility to solve your case,
implement a toURL(..) UDF on your database.
Users could now use the toURL(pats_uri) in the where clause, though it would
be very inefficient. But it would be clear to them from the fact that a UDF
is needed.
To me the big advantage is that it makes explicit to the user what is
happening, which I think is ALWAYS good. It seems to solve the issues with
1,2,3 and you do not need extra concepts such as select-only columns.
Of course I assume it is possible for you to implement a UDF with this
functionality.
Could you explain what is exactly the use case here?
Do you want/need to keep the URI column itself available to the user?
If you wish to hide the URI column, you could create a view that contains
the toURL(pats_uri) in the selec, not the pats_uri column itself, and only
publish the view. (I know you know this.). If your database allows you oculd
also use the UDF to define a derived column on the table.
For now I am against your solution 4.
Does the select-only concept have any precedent elsewhere, do you have
references to this? It will require pretty advanced support in ADQL parsers,
though it is not an ADQL concept (ADQL being non-semantic).
As far as I know, but I may be mistaken it goes beyond standard RDB
concepts.
And it seems not to be necessary.
Gerard
Patrick Dowler writes:
>
> I ran into an odd table/column metadata case when messing with our TAP
> prototype. In the database I have a column with a URI in it. The URI is
> meaningless outside our own software as it is a private identifier, but it
> can be turned into a URL by some custom code (the scheme handler for the
> subsystem that "owns" the URI).
>
> As a result, and to be useful to users, I want to express that there is a
> column with a URL in it and that users will be able to download data via this
> URL. I can easily (enough) convert the URI to URL while writing the VOTable
> output. The catch is that users cannot sensibly use the column in the WHERE
> clause. For example, if someone did
>
> SELECT * FROM some_table
>
> they would see in the result a column called "download" (eg) and values like:
>
> http://www.example.com/foo?bar=123
> ftp://ftp.example.com/something.fits
>
> However, if they then tried to something like:
>
> SELECT * FROM some_table AS t
> WHERE t.download LIKE 'http://%'
>
> to only get the http downloads, they would get no results (because the table
> contains URIs that get turned into http URLs on output). It would be quite
> complex to convert that LIKE predicate into something that returned
> the "correct" result (it would require knowledge of all possible URI->URL
> conversions).
>
> I can see several ways to proceed:
>
> 1. Don't do that and actually store URLs in the db: very brittle and doesn't
> allow one to direct downloads (eg like the AccessRef in DAL protocols), most
> likely requires DB modification as people would not likely put the full URL
> in the DB
>
> 2. Do not expose the URL column at all: for an observation DB, that would mean
> people cannot actually download the data :-(
>
> 3. User needs knowledge of how to convert the URI to URL: this requires that
> all URI schemes be standardised and service providers cannot use custom ones
> (which is a highly useful and flexible and one we use extensively) or that
> users need extra out-of-band knowledge to use the service fully
>
> 4. Have columns that users can select but not constrain: basically the service
> marks columns in its table metadata as "select only", which covers computed
> columns like above and also columns where searching is not feasible (e.g.
> columns with binary values or region columns in a db without a spatial
> querying implementation)
>
More information about the registry
mailing list