VODataService

Francois Ochsenbein francois at vizir.u-strasbg.fr
Fri Dec 12 10:57:20 PST 2008


Hi Pat, Gerard,

Just a short remark about this question of select-only columns:
there is a placeholder in VOTable since its version 1.0 for this
specification (currently defined in the Appendix B of the VOTable
document) with a dedicated 'type' attribute in the FIELD element.
It is called 'non-queryable field' in VOTable, but it matches
exactly your concerns: some columns can't be used for queries.

Therefore if the registry defines this peculiarity, it can
readily be conveyed in a VOTable.

Have a nice week-end !

Francois

>
>On 2008-12-10 22:33:53 glemson wrote:
>> Hi Pat
>>
>> It seems to me there might be another possibility to solve your case,
>> implement a toURL(..) UDF on your database.
>> Users could now use the toURL(pats_uri) in the where clause, though it
>> would be very inefficient. But it would be clear to them from the fact that
>> a UDF is needed.
>>
>> To me the big advantage is that it makes explicit to the user what is
>> happening, which I think is ALWAYS good. It seems to solve the issues with
>> 1,2,3 and you do not need extra concepts such as select-only columns.
>> Of course I assume it is possible for you to implement a UDF with this
>> functionality.
>
>I agree that with ADQL it is important that things be explicit and I am not 
>100% happy about any of the ideas. However, the assumption that one can 
>implement a UDF is problematic at best. The problem is that when one uses 
>URIs a common practice (and one we have made good use of recently) is that 
>different subsystems provide code or services to do that conversion and they 
>register their scheme and code/service. This is all implementation detail but 
>the point is that it may not be technically feasible to actually run that 
>conversion inside the DB in a UDF. Secondarily, I think any solution that 
>requires a UDF is not implementable by some people, for example when there is 
>an existing DB and TAP access is to be provided but the TAP implementors 
>cannot modify the DB in any way (e.g. they have to do everything in the 
>service code). 
>
>
>> Could you explain what is exactly the use case here?
>> Do you want/need to keep the URI column itself available to the user?
>
>The URI is the thing actually stored in the db. In our case, it is a real URI, 
>but I could image in other cases (existing DBs) that there are several 
>columns that together tell one enough to construct a URL. This would be the 
>case in any DB that contains metadata for data files that users would want to 
>download. 
>
>I don't think there is value in exposing the URIs directly (or the multiple 
>columns that are used to generate the URL), so the table metadata would say 
>there is a column with URLs in it.
>
>> If you wish to hide the URI column, you could create a view that contains
>> the toURL(pats_uri) in the selec, not the pats_uri column itself, and only
>> publish the view. (I know you know this.). If your database allows you
>> oculd also use the UDF to define a derived column on the table.
>
>Sure, once you have a UDF you can perform some indirection to expose something 
>different than your underlying tables.
>
>> For now I am against your solution 4.
>> Does the select-only concept have any precedent elsewhere, do you have
>> references to this? It will require pretty advanced support in ADQL
>> parsers, though it is not an ADQL concept (ADQL being non-semantic).
>> As far as I know, but I may be mistaken it goes beyond standard RDB
>> concepts.
>
>Yes, it is non-standard RDBMS and SQL. It has precedent in various DAL 
>services where the output can (does) contain things that the caller cannot 
>constrain in the request. Specifically, the AccessRef in DAL services is a 
>URL to the data but you cannot constraint it and I very much doubt that 
>anyone actually stores the full URL in the database. And generally it does 
>not make much sense to constrain such a column, but in TAP and ADQL users can 
>do that if the column is visible.  
>
>> And it seems not to be necessary.
>
>Ah, but only if one accepts that implementing UDFs in the DB is necessary, and 
>I think a lot of implementors will be unwilling or unable to do that, for a  
>variety of technical reasons.
>
>> Gerard
>>
>> Patrick Dowler writes:
>> > I ran into an odd table/column metadata case when messing with our TAP
>> > prototype. In the database I have a column with a URI in it. The URI is
>> > meaningless outside our own software as it is a private identifier, but
>> > it can be turned into a URL by some custom code (the scheme handler for
>> > the subsystem that "owns" the URI).
>> >
>> > As a result, and to be useful to users, I want to express that there is a
>> > column with a URL in it and that users will be able to download data via
>> > this URL. I can easily (enough) convert the URI to URL while writing the
>> > VOTable output. The catch is that users cannot sensibly use the column in
>> > the WHERE clause. For example, if someone did
>> >
>> >    SELECT * FROM some_table
>> >
>> > they would see in the result a column called "download" (eg) and values
>> > like:
>> >
>> >    http://www.example.com/foo?bar=123
>> >    ftp://ftp.example.com/something.fits
>> >
>> > However, if they then tried to something like:
>> >
>> >    SELECT * FROM some_table AS t
>> >    WHERE t.download LIKE 'http://%'
>> >
>> > to only get the http downloads, they would get no results (because the
>> > table contains URIs that get turned into http URLs on output). It would
>> > be quite complex to convert that LIKE predicate into something that
>> > returned the "correct" result (it would require knowledge of all possible
>> > URI->URL conversions).
>> >
>> > I can see several ways to proceed:
>> >
>> > 1. Don't do that and actually store URLs in the db: very brittle and
>> > doesn't allow one to direct downloads (eg like the AccessRef in DAL
>> > protocols), most likely requires DB modification as people would not
>> > likely put the full URL in the DB
>> >
>> > 2. Do not expose the URL column at all: for an observation DB, that would
>> > mean people cannot actually download the data :-(
>> >
>> > 3. User needs knowledge of how to convert the URI to URL: this requires
>> > that all URI schemes be standardised and service providers cannot use
>> > custom ones (which is a highly useful and flexible and one we use
>> > extensively) or that users need extra out-of-band knowledge to use the
>> > service fully
>> >
>> > 4. Have columns that users can select but not constrain: basically the
>> > service marks columns in its table metadata as "select only", which
>> > covers computed columns like above and also columns where searching is
>> > not feasible (e.g. columns with binary values or region columns in a db
>> > without a spatial querying implementation)
>
>
>
>-- 
>
>Patrick Dowler
>Tel/Tél: (250) 363-0044                  | fax/télécopieur: (250) 363-0045
>Canadian Astronomy Data Centre   | Centre canadien de donnees astronomiques
>National Research Council Canada | Conseil national de recherches Canada
>Government of Canada                  | Gouvernement du Canada
>5071 West Saanich Road               | 5071, chemin West Saanich
>Victoria, BC                                  | Victoria (C.-B.)
================================================================================
Francois Ochsenbein       ------       Observatoire Astronomique de Strasbourg
   11, rue de l'Universite F-67000 STRASBOURG       Phone: +33-(0)390 24 24 29
Email: francois at astro.u-strasbg.fr   (France)         Fax: +33-(0)390 24 24 32
================================================================================



More information about the registry mailing list