Problems with RegTAP, ADQL, natural joins, and MS SQL Server
Theresa Dower
dower at stsci.edu
Wed Feb 24 22:01:07 CET 2016
Menelaus,
Thank you for going through these and noting where there are additional cross-compatibility issues. Your changes to the first query (TAP Access URLS) also work through our SQL Server, so either version would be acceptable as far as we’re concerned. And whoops, yes, you caught a paste mistake on the Records from Registry query. I swear I proofread that several times.
As for Markus’ TAP validator, we’re working toward complete validation with that as well, but had started with basic functionality through a tolerant client and thus ran into this issue early on. Congratulations on your upcoming service!
The translation we’re using for the fifth, Records from Registry query is actually:
SELECT ivoid FROM rr.resource
RIGHT OUTER JOIN (
SELECT 'ivo://' || detail_value || '%' AS pat
FROM rr.res_detail
WHERE detail_xpath='/managedAuthority'
AND ivoid='ivo://cds.vizier/registry')
AS authpatterns
ON (resource.ivoid LIKE authpatterns.pat)
To:
SELECT ivoid FROM rr.resource
RIGHT OUTER JOIN (
SELECT 'ivo://' || detail_value || '%' AS pat
FROM rr.res_detail
WHERE detail_xpath='/managedAuthority'
AND ivoid='ivo://cds.vizier/registry')
AS authpatterns
ON (resource.ivoid LIKE authpatterns.pat)
I currently still have a bug in our || handling, so I can’t test it at the moment, but it should work.
--Theresa
From: Menelaus Perdikeas [mailto:mperdikeas at sciops.esa.int]
Sent: Wednesday, February 24, 2016 3:33 PM
To: Theresa Dower
Cc: registry at ivoa.net
Subject: Re: Problems with RegTAP, ADQL, natural joins, and MS SQL Server
Hi Theresa,
I am pretty set in considering NATURAL JOINS an anti-pattern (mostly due to being opaque) and far from being elegant.
Personally, I always use aliases since they are very explicit, precise and reduce redundancy in typing as well.
But I don't want to start a religious discussion.
We too are using Gregory's parser and have encountered a couple of cases where the specific syntax of queries generated by TopCAT seems to not sit well with the library's ADQL parser or our native back-end's (PostgreSQL 9.2) capabilities, so when you contribute your changes back into the project maybe we should try to see if the changes / extensions solve any of our own issues as well. But our benchmark is not TopCAT but rather the RegTAP validation suite by Markus where we effectively score something like 59/61 (as of release r3726 of the suite). We also check that we can handle all the example queries as given in the RegTAP specification.
For what is worth, I tried all your example queries (and their suggested translations) in our beta system <http://registry2.euro-vo.org/eurovo/> (to be pushed to the production URL tomorrow according to the latest estimates):
1nd. sample query (TAP Access URLs)
* natural join works
* non-natural join doesn't work and has to be re-written with aliases as:
SELECT a.ivoid, b.access_url
FROM rr.capability a, rr.interface b
WHERE a.ivoid = b.ivoid
AND a.standard_id like 'ivo://ivoa.net/std/tap%'
AND b.intf_type='vs:paramhttp'
Or, alternatively and much closer to your own version, as:
SELECT rr.capability.ivoid, access_url
FROM rr.capability, rr.interface
WHERE rr.capability.ivoid = rr.interface.ivoid
AND standard_id like 'ivo://ivoa.net/std/tap%'
AND intf_type='vs:paramhttp'
2nd. sample query (SIA with spirals)
* both versions work
3rd sample query (Infrared SIA)
* both version work (if one accounts for some obvious typos that the non-natural version has)
4th sample query (Cone Search with Red Shifts)
* both versions work
5th sample query (records from registry)
* natural query works
! I am not sure the non-natural version of the query corresponds to the same query (semantically) - maybe a copy-paste mistake?
6th sample query (locate RegTAP services)
* both version work but the non-natural join returns 5 records (instead of the 2 that the natural join version does)
7th sample query (TAP with physics)
* natural JOIN version works
* non-natural JOIN returns many more records
8th sample query (Theoretical SSA)
* natural JOIN version works
* non-natural version returns 5 rows instead of two.
9th sample query (Find Contact Persons)
* both versions work
Cheers,
Menelaus.
________________________________
From: "Theresa Dower" <dower at stsci.edu<mailto:dower at stsci.edu>>
To: registry at ivoa.net<mailto:registry at ivoa.net>
Sent: Wednesday, February 24, 2016 9:11:36 PM
Subject: Problems with RegTAP, ADQL, natural joins, and MS SQL Server
Hello,
As some of you know from prior discussions and database work of your own, there is an issue with the ADQL standard and the capabilities of MS SQL Server, namely that by design it does not support the NATURAL JOIN syntax. The design decision was based on the ambiguity in how the natural join finds keys in generic schemas. Even with known schemas such as RegTAP’s, parsing and rewriting generic queries with natural joins is far from trivial.
At this time, TAP services at NAVO institutions and the Royal Observatory Edinburgh are using Microsoft SQL Server as a back end to TAP services. STScI and ROE are extending Gregory Mantele’s Java ADQL parser for this work and will share it back to his project. STScI have put together a RegTAP service which will be operational before the Spring 2016 interop. (An ObsTAP service with basic spatial query support comes next.) The problem is that with our available resources we cannot support natural joins, yet they are used extensively by the example RegTAP queries in TOPCAT, the most commonly used TAP client.
For our testing, Sarah Weissman at ST created translations of example TOPCAT queries avoiding the natural join syntax. They are a longer and uglier, which is the general argument for the elegance of natural joins in the first place, but they do work. I’ve proposed to Mark Taylor that we could provide these in TOPCAT and I agreed with his suggestion that while we can certainly use them, we should bring this to the mailing list so you are all aware of the issue.
How do we want to balance using the more elegant query syntax in our specs yet support major architectures that don’t allow them? Should we avoid using them in documentation examples and/or provide sample translations? What do we add in terms of notes about the issue in future specs? Should we help out clients with an agreed-upon error note for unsupported functionality, given some TAP providers have a similar issue with unsupported geometry already? I think we should do all of this and I welcome thoughts on the issue.
For the morbidly curious, all of the example queries requiring translation for SQL Server and our substitutions follow. They are quite simple, just numerous.
--Theresa Dower
----------------------------------------------
TAP Access URLS:
SELECT ivoid, access_url
FROM rr.capability
NATURAL JOIN rr.interface
WHERE standard_id like 'ivo://ivoa.net/std/tap%'
AND intf_type='vs:paramhttp'
SELECT ivoid, access_url
FROM rr.capability, rr.interface
WHERE rr.capability.ivoid = rr.interface.ivoid
AND standard_id like 'ivo://ivoa.net/std/tap%'
AND intf_type='vs:paramhttp'
SIA with spirals:
SELECT ivoid, access_url
FROM rr.capability
NATURAL JOIN rr.resource
NATURAL JOIN rr.interface
NATURAL JOIN rr.res_subject
WHERE standard_id='ivo://ivoa.net/std/sia'
AND intf_type='vs:paramhttp'
AND (
1=ivo_nocasematch(res_subject, '%spiral%')
OR 1=ivo_hasword(res_description, 'spiral')
OR 1=ivo_hasword(res_title, 'spiral'))
SELECT rr.capability.ivoid, access_url
FROM rr.capability, rr.resource, rr.interface, rr.res_subject
where rr.capability.ivoid = rr.resource.ivoid
AND rr.resource.ivoid = rr.interface.ivoid
AND rr.interface.ivoid = rr.res_subject.ivoid
AND standard_id='ivo://ivoa.net/std/sia'
AND intf_type='vs:paramhttp'
AND (
1=ivo_nocasematch(res_subject, '%spiral%')
OR 1=ivo_hasword(res_description, 'spiral')
OR 1=ivo_hasword(res_title, 'spiral'))
Infrared SIA:
SELECT ivoid, access_url
FROM rr.capability
NATURAL JOIN rr.resource
NATURAL JOIN rr.interface
WHERE standard_id='ivo://ivoa.net/std/sia'
AND intf_type='vs:paramhttp'
AND 1=ivo_hashlist_has('infrared', waveband)
SELECT rr.capability.ivoid, access_url
FROM rr.capability, rr.resource, rr.interface
AND rr.capability.ivoid = rr.resource.ivoid
AND rr.resource.ivoid = rr.interface.ivoid
AND standard_id='ivo://ivoa.net/std/sia'
AND intf_type='vs:paramhttp'
AND 1=ivo_hashlist_has('infrared', waveband)
ConeSearch with redshift:
SELECT ivoid, access_url
FROM rr.capability
NATURAL JOIN rr.table_column
NATURAL JOIN rr.interface
WHERE standard_id='ivo://ivoa.net/std/conesearch'
AND intf_type='vs:paramhttp'
AND ucd='src.redshift'
SELECT rr.capability.ivoid, access_url
FROM rr.capability, rr.table_column, rr.interface
where rr.capability.ivoid = rr.table_column.ivoid
AND rr.table_column.ivoid = rr.interface.ivoid
AND standard_id='ivo://ivoa.net/std/conesearch'
AND intf_type='vs:paramhttp'
AND ucd='src.redshift'
Records from registry:
SELECT access_url
FROM rr.interface
NATURAL JOIN rr.capability
NATURAL JOIN rr.res_detail
WHERE standard_id='ivo://ivoa.net/std/tap'
AND intf_type='vs:paramhttp'
AND detail_xpath='/capability/dataModel/@ivo-id'
AND 1=ivo_nocasematch(detail_value,
'ivo://ivoa.net/std/regtap#1.%')
SELECT ivoid FROM rr.resource
RIGHT OUTER JOIN (
SELECT 'ivo://' || detail_value || '%' AS pat
FROM rr.res_detail
WHERE detail_xpath='/managedAuthority'
AND ivoid='ivo://cds.vizier/registry')
AS authpatterns
ON (resource.ivoid LIKE authpatterns.pat)
Locate RegTAP services:
SELECT access_url
FROM rr.interface
NATURAL JOIN rr.capability
NATURAL JOIN rr.res_detail
WHERE standard_id='ivo://ivoa.net/std/tap'
AND intf_type='vs:paramhttp'
AND detail_xpath='/capability/dataModel/@ivo-id'
AND 1=ivo_nocasematch(detail_value,
'ivo://ivoa.net/std/regtap#1.%')
SELECT access_url
FROM rr.interface, rr.capability, rr.res_detail
where rr.interface.ivoid = rr.capability.ivoid
AND rr.capability.ivoid = rr.res_detail.ivoid
AND standard_id='ivo://ivoa.net/std/tap'
AND intf_type='vs:paramhttp'
AND detail_xpath='/capability/dataModel/@ivo-id'
AND 1=ivo_nocasematch(detail_value,
'ivo://ivoa.net/std/regtap#1.%')
TAP with Physics:
SELECT ivoid, access_url, name, ucd, column_description
FROM rr.capability
NATURAL JOIN rr.interface
NATURAL JOIN rr.table_column
NATURAL JOIN rr.res_table
WHERE standard_id='ivo://ivoa.net/std/tap'
AND intf_type='vs:paramhttp'
AND 1=ivo_hasword(table_description, 'quasar')
AND ucd='phot.mag;em.opt.v'
SELECT rr.capability.ivoid, access_url, name, ucd, column_description
FROM rr.capability, rr.interface, rr.table_column, rr.res_table
where rr.capability.ivoid = rr.interface.ivoid
and rr.interface.ivoid = rr.table_column.ivoid
and rr.table_column.ivoid = rr.res_table.ivoid
and standard_id='ivo://ivoa.net/std/tap'
AND intf_type='vs:paramhttp'
AND 1=ivo_hasword(table_description, 'quasar')
AND ucd='phot.mag;em.opt.v'
Theoretical SSA:
SELECT access_url
FROM rr.res_detail
NATURAL JOIN rr.capability
NATURAL JOIN rr.interface
WHERE detail_xpath='/capability/dataSource'
AND intf_type='vs:paramhttp'
AND standard_id='ivo://ivoa.net/std/ssa'
AND detail_value='theory'
SELECT access_url
FROM rr.res_detail, rr.capability, rr.interface
where rr.res_detail.ivoid = rr.capability.ivoid
and rr.capability.ivoid = rr.interface.ivoid
and detail_xpath='/capability/dataSource'
AND intf_type='vs:paramhttp'
AND standard_id='ivo://ivoa.net/std/ssa'
AND detail_value='theory'
Find Contact Persons:
SELECT DISTINCT base_role, role_name, email
FROM rr.res_role
NATURAL JOIN rr.interface
WHERE access_url='http://dc.zah.uni-heidelberg.de/tap'
SELECT DISTINCT access_url, base_role, role_name, email
FROM rr.res_role, rr.interface
where rr.res_role.ivoid = rr.interface.ivoid
and access_url='http://dc.zah.uni-heidelberg.de/tap'
This message and any attachments are intended for the use of the addressee or addressees only.
The unauthorised disclosure, use, dissemination or copying (either in whole or in part) of its
content is not permitted.
If you received this message in error, please notify the sender and delete it from your system.
Emails can be altered and their integrity cannot be guaranteed by the sender.
Please consider the environment before printing this email.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.ivoa.net/pipermail/registry/attachments/20160224/e7cee219/attachment-0001.html>
More information about the registry
mailing list