<div dir="ltr"><div class="gmail_default" style="font-size:small">I agree that table_name cannot feasibly be modified from whatever the provider specified and shouldn't have been in that list.</div><div class="gmail_default" style="font-size:small">It's a bug ... erratum.</div><div class="gmail_default" style="font-size:small"><br></div><div class="gmail_default" style="font-size:small">You might even note/warn that if the table_name (and column_name(s) elsewhere?) are delimited the RegTAP implementor has to take care to preserve the delimiters as well, if that's not stated already.</div><div class="gmail_default" style="font-size:small"><br clear="all"></div><div><div dir="ltr" class="gmail_signature" data-smartmail="gmail_signature"><div dir="ltr"><div><div dir="ltr"><div><div>--<br></div><div>Patrick Dowler<br></div>Canadian Astronomy Data Centre<br></div>Victoria, BC, Canada<br></div></div></div></div></div><br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, 5 Aug 2021 at 05:15, Markus Demleitner <<a href="mailto:msdemlei@ari.uni-heidelberg.de">msdemlei@ari.uni-heidelberg.de</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Dear Registry,<br>
<br>
ADQL delimited identifiers are a pain.<br>
<br>
My old mantra just became underscored right now while I was adding a<br>
get_tables method to pyVO's registry interface. That pulls table<br>
metadata from rr.res_table and rr.table_column. And breaks hard on<br>
VizieR.<br>
<br>
You see, RegTAP in sect. 8.6 says:<br>
<br>
The following columns MUST be lowercased during ingestion: ivoid,<br>
table_name, table_type, table_utype.<br>
<br>
In the presence of SQL delimited identifiers, blindly lowercasing<br>
table_name is wrong and actively breaks things. For instance, VizieR<br>
has a table J/A+A/437/789/table2; once that's lowercased, it requires<br>
a lot of VizieR knowledge to recover that form.<br>
<br>
But to write a query you need that exact form because to get away<br>
with that alphabet soup as a table name, you have to use delimited<br>
identifiers in ADQL queries, and delimited identifiers prevent the<br>
(about as unfortunate) case folding SQL has almost everywhere else.<br>
<br>
Hence, rr.res_table is close to useless with the largest TAP service<br>
on the planet. The only reason nobody noticed is that most everyone<br>
is turning to GloTS for this kind of thing. But as you know I've<br>
been trying to wean people off the non-standard GloTS for ages, at<br>
one of these days it'll have to happen.<br>
<br>
I can't quite say any more why I would have required case<br>
normalisation in table_name; I probably figured "it's machine<br>
readable, and SQLs folds case, so let's be consistent with almost<br>
everything else machine-readable in RegTAP".<br>
<br>
I suppose if I wrote this again I'd say "lowercase regular<br>
identifiers, leave SQL delimited identifiers alone". However, that<br>
still wouldn't fix things right now because VizieR doesn't include<br>
quotes in their VODataService table names just yet.<br>
<br>
But I'd like to have a fix about now in order to make my get_tables()<br>
method useful when the pyvo PR goes in. And the baroque<br>
normalisation rules of SQL are a bit too much for an Erratum anyway<br>
IMHO.<br>
<br>
Also, we are now a lot more explicit that the content of table/name<br>
must be exactly as it should be used in TAP, and so there's really no<br>
reason for any more normalisation. Hence... How would people feel if<br>
I wrote a RegTAP erratum to the effect:<br>
<br>
In section 8.6, from the sentence: “The following columns MUST be<br>
lowercased during ingestion: ivoid, table_name, table_type,<br>
table_utype.” remove “table_name”. That is, with this erratum,<br>
table_name is ingested literally.<br>
<br>
I think that won't break anything operationally; if anyone had relied<br>
on these table names, they'd have noticed it's broken with many<br>
VizieR tables, I suppose. And less case folding is always a gain.<br>
Of course, I'll provide lots of rationale and impact assessment, but<br>
before going into the trouble, I'd be grateful for encouraging and/or<br>
dissenting voices.<br>
<br>
Thanks,<br>
<br>
Markus<br>
<br>
PS: I'm taking the liberty of trying this out on <a href="http://dc.g-vo.org" rel="noreferrer" target="_blank">dc.g-vo.org</a> and then<br>
<a href="http://reg.g-vo.org" rel="noreferrer" target="_blank">reg.g-vo.org</a> starting in an hour or so; that's how confident I am<br>
that this is a bona fide improvement with very little risk of<br>
breaking things.<br>
</blockquote></div>