Problems with RegTAP, ADQL, natural joins, and MS SQL Server

Menelaus Perdikeas mperdikeas at sciops.esa.int
Wed Feb 24 21:33:12 CET 2016


Hi Theresa, 

I am pretty set in considering NATURAL JOINS an anti-pattern (mostly due to being opaque) and far from being elegant. 
Personally, I always use aliases since they are very explicit, precise and reduce redundancy in typing as well. 

But I don't want to start a religious discussion. 

We too are using Gregory's parser and have encountered a couple of cases where the specific syntax of queries generated by TopCAT seems to not sit well with the library's ADQL parser or our native back-end's (PostgreSQL 9.2) capabilities, so when you contribute your changes back into the project maybe we should try to see if the changes / extensions solve any of our own issues as well. But our benchmark is not TopCAT but rather the RegTAP validation suite by Markus where we effectively score something like 59/61 (as of release r3726 of the suite). We also check that we can handle all the example queries as given in the RegTAP specification. 

For what is worth, I tried all your example queries (and their suggested translations) in our beta system (to be pushed to the production URL tomorrow according to the latest estimates): 

1nd. sample query (TAP Access URLs) 
* natural join works 
* non-natural join doesn't work and has to be re-written with aliases as: 

SELECT a.ivoid, b.access_url 
FROM rr.capability a, rr.interface b 
WHERE a.ivoid = b.ivoid 
AND a.standard_id like 'ivo://ivoa.net/std/tap%' 
AND b.intf_type='vs:paramhttp' 

Or, alternatively and much closer to your own version, as: 

SELECT rr.capability.ivoid, access_url 
FROM rr.capability, rr.interface 
WHERE rr.capability.ivoid = rr.interface.ivoid 
AND standard_id like 'ivo://ivoa.net/std/tap%' 
AND intf_type='vs:paramhttp' 

2nd. sample query (SIA with spirals) 
* both versions work 

3rd sample query (Infrared SIA) 
* both version work (if one accounts for some obvious typos that the non-natural version has) 

4th sample query (Cone Search with Red Shifts) 
* both versions work 

5th sample query (records from registry) 
* natural query works 
! I am not sure the non-natural version of the query corresponds to the same query (semantically) - maybe a copy-paste mistake? 

6th sample query (locate RegTAP services) 
* both version work but the non-natural join returns 5 records (instead of the 2 that the natural join version does) 

7th sample query (TAP with physics) 
* natural JOIN version works 
* non-natural JOIN returns many more records 

8th sample query (Theoretical SSA) 
* natural JOIN version works 
* non-natural version returns 5 rows instead of two. 

9th sample query (Find Contact Persons) 
* both versions work 

Cheers, 
Menelaus. 


From: "Theresa Dower" <dower at stsci.edu> 
To: registry at ivoa.net 
Sent: Wednesday, February 24, 2016 9:11:36 PM 
Subject: Problems with RegTAP, ADQL, natural joins, and MS SQL Server 



Hello, 



As some of you know from prior discussions and database work of your own, there is an issue with the ADQL standard and the capabilities of MS SQL Server, namely that by design it does not support the NATURAL JOIN syntax. The design decision was based on the ambiguity in how the natural join finds keys in generic schemas. Even with known schemas such as RegTAP’s, parsing and rewriting generic queries with natural joins is far from trivial. 



At this time, TAP services at NAVO institutions and the Royal Observatory Edinburgh are using Microsoft SQL Server as a back end to TAP services. STScI and ROE are extending Gregory Mantele’s Java ADQL parser for this work and will share it back to his project. STScI have put together a RegTAP service which will be operational before the Spring 2016 interop. (An ObsTAP service with basic spatial query support comes next.) The problem is that with our available resources we cannot support natural joins, yet they are used extensively by the example RegTAP queries in TOPCAT, the most commonly used TAP client. 



For our testing, Sarah Weissman at ST created translations of example TOPCAT queries avoiding the natural join syntax. They are a longer and uglier, which is the general argument for the elegance of natural joins in the first place, but they do work. I’ve proposed to Mark Taylor that we could provide these in TOPCAT and I agreed with his suggestion that while we can certainly use them, we should bring this to the mailing list so you are all aware of the issue. 



How do we want to balance using the more elegant query syntax in our specs yet support major architectures that don’t allow them? Should we avoid using them in documentation examples and/or provide sample translations? What do we add in terms of notes about the issue in future specs? Should we help out clients with an agreed-upon error note for unsupported functionality, given some TAP providers have a similar issue with unsupported geometry already? I think we should do all of this and I welcome thoughts on the issue. 



For the morbidly curious, all of the example queries requiring translation for SQL Server and our substitutions follow. They are quite simple, just numerous. 



--Theresa Dower 



---------------------------------------------- 



TAP Access URLS: 



SELECT ivoid, access_url 

FROM rr.capability 

NATURAL JOIN rr.interface 

WHERE standard_id like 'ivo://ivoa.net/std/tap%' 

AND intf_type='vs:paramhttp' 



SELECT ivoid, access_url 
FROM rr.capability, rr.interface 
WHERE rr.capability.ivoid = rr.interface.ivoid 
AND standard_id like 'ivo://ivoa.net/std/tap%' 
AND intf_type='vs:paramhttp' 



SIA with spirals: 



SELECT ivoid, access_url 

FROM rr.capability 

NATURAL JOIN rr.resource 

NATURAL JOIN rr.interface 

NATURAL JOIN rr.res_subject 

WHERE standard_id='ivo://ivoa.net/std/sia' 

AND intf_type='vs:paramhttp' 

AND ( 

1=ivo_nocasematch(res_subject, '%spiral%') 

OR 1=ivo_hasword(res_description, 'spiral') 

OR 1=ivo_hasword(res_title, 'spiral')) 



SELECT rr.capability.ivoid, access_url 

FROM rr.capability, rr.resource, rr.interface, rr.res_subject 

where rr.capability.ivoid = rr.resource.ivoid 

AND rr.resource.ivoid = rr.interface.ivoid 

AND rr.interface.ivoid = rr.res_subject.ivoid 

AND standard_id='ivo://ivoa.net/std/sia' 

AND intf_type='vs:paramhttp' 

AND ( 

1=ivo_nocasematch(res_subject, '%spiral%') 

OR 1=ivo_hasword(res_description, 'spiral') 

OR 1=ivo_hasword(res_title, 'spiral')) 



Infrared SIA: 



SELECT ivoid, access_url 

FROM rr.capability 

NATURAL JOIN rr.resource 

NATURAL JOIN rr.interface 

WHERE standard_id='ivo://ivoa.net/std/sia' 

AND intf_type='vs:paramhttp' 

AND 1=ivo_hashlist_has('infrared', waveband) 



SELECT rr.capability.ivoid, access_url 
FROM rr.capability, rr.resource, rr.interface 
AND rr.capability.ivoid = rr.resource.ivoid 
AND rr.resource.ivoid = rr.interface.ivoid 
AND standard_id='ivo://ivoa.net/std/sia' 
AND intf_type='vs:paramhttp' 
AND 1=ivo_hashlist_has('infrared', waveband) 



ConeSearch with redshift: 



SELECT ivoid, access_url 

FROM rr.capability 

NATURAL JOIN rr.table_column 

NATURAL JOIN rr.interface 

WHERE standard_id='ivo://ivoa.net/std/conesearch' 

AND intf_type='vs:paramhttp' 

AND ucd='src.redshift' 



SELECT rr.capability.ivoid, access_url 
FROM rr.capability, rr.table_column, rr.interface 
where rr.capability.ivoid = rr.table_column.ivoid 
AND rr.table_column.ivoid = rr.interface.ivoid 
AND standard_id='ivo://ivoa.net/std/conesearch' 
AND intf_type='vs:paramhttp' 
AND ucd='src.redshift' 



Records from registry: 



SELECT access_url 

FROM rr.interface 

NATURAL JOIN rr.capability 

NATURAL JOIN rr.res_detail 

WHERE standard_id='ivo://ivoa.net/std/tap' 

AND intf_type='vs:paramhttp' 

AND detail_xpath='/capability/dataModel/@ivo-id' 

AND 1=ivo_nocasematch(detail_value, 

'ivo://ivoa.net/std/regtap#1.%') 



SELECT ivoid FROM rr.resource 
RIGHT OUTER JOIN ( 
SELECT 'ivo://' || detail_value || '%' AS pat 
FROM rr.res_detail 
WHERE detail_xpath='/managedAuthority' 
AND ivoid='ivo://cds.vizier/registry') 
AS authpatterns 
ON (resource.ivoid LIKE authpatterns.pat) 



Locate RegTAP services: 



SELECT access_url 

FROM rr.interface 

NATURAL JOIN rr.capability 

NATURAL JOIN rr.res_detail 

WHERE standard_id='ivo://ivoa.net/std/tap' 

AND intf_type='vs:paramhttp' 

AND detail_xpath='/capability/dataModel/@ivo-id' 

AND 1=ivo_nocasematch(detail_value, 

'ivo://ivoa.net/std/regtap#1.%') 



SELECT access_url 
FROM rr.interface, rr.capability, rr.res_detail 
where rr.interface.ivoid = rr.capability.ivoid 
AND rr.capability.ivoid = rr.res_detail.ivoid 
AND standard_id='ivo://ivoa.net/std/tap' 
AND intf_type='vs:paramhttp' 
AND detail_xpath='/capability/dataModel/@ivo-id' 
AND 1=ivo_nocasematch(detail_value, 
'ivo://ivoa.net/std/regtap#1.%') 



TAP with Physics: 



SELECT ivoid, access_url, name, ucd, column_description 

FROM rr.capability 

NATURAL JOIN rr.interface 

NATURAL JOIN rr.table_column 

NATURAL JOIN rr.res_table 

WHERE standard_id='ivo://ivoa.net/std/tap' 

AND intf_type='vs:paramhttp' 

AND 1=ivo_hasword(table_description, 'quasar') 

AND ucd='phot.mag;em.opt.v' 



SELECT rr.capability.ivoid, access_url, name, ucd, column_description 
FROM rr.capability, rr.interface, rr.table_column, rr.res_table 
where rr.capability.ivoid = rr.interface.ivoid 
and rr.interface.ivoid = rr.table_column.ivoid 
and rr.table_column.ivoid = rr.res_table.ivoid 
and standard_id='ivo://ivoa.net/std/tap' 
AND intf_type='vs:paramhttp' 
AND 1=ivo_hasword(table_description, 'quasar') 
AND ucd='phot.mag;em.opt.v' 



Theoretical SSA: 



SELECT access_url 

FROM rr.res_detail 

NATURAL JOIN rr.capability 

NATURAL JOIN rr.interface 

WHERE detail_xpath='/capability/dataSource' 

AND intf_type='vs:paramhttp' 

AND standard_id='ivo://ivoa.net/std/ssa' 

AND detail_value='theory' 



SELECT access_url 

FROM rr.res_detail, rr.capability, rr.interface 

where rr.res_detail.ivoid = rr.capability.ivoid 

and rr.capability.ivoid = rr.interface.ivoid 

and detail_xpath='/capability/dataSource' 

AND intf_type='vs:paramhttp' 

AND standard_id='ivo://ivoa.net/std/ssa' 

AND detail_value='theory' 



Find Contact Persons: 



SELECT DISTINCT base_role, role_name, email 

FROM rr.res_role 

NATURAL JOIN rr.interface 

WHERE access_url='http://dc.zah.uni-heidelberg.de/tap' 



SELECT DISTINCT access_url, base_role, role_name, email 
FROM rr.res_role, rr.interface 
where rr.res_role.ivoid = rr.interface.ivoid 
and access_url='http://dc.zah.uni-heidelberg.de/tap' 






This message and any attachments are intended for the use of the addressee or addressees only.
The unauthorised disclosure, use, dissemination or copying (either in whole or in part) of its
content is not permitted.
If you received this message in error, please notify the sender and delete it from your system.
Emails can be altered and their integrity cannot be guaranteed by the sender.

Please consider the environment before printing this email.

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.ivoa.net/pipermail/registry/attachments/20160224/e8917d12/attachment-0001.html>


More information about the registry mailing list