table_name syntax
Mark Taylor
m.b.taylor at bristol.ac.uk
Thu May 14 11:34:42 CEST 2015
On Thu, 30 Apr 2015, Mark Taylor wrote:
> On Thu, 30 Apr 2015, Markus Demleitner wrote:
>
> > Then, there was Mark responding to Pat (Tue, 28 Apr 2015 17:36:06
> > +0100):
> >
> > >> PPS- tap-lint does a test where it quotes a column name it finds in
> > >> a table in the tap_schema. I noted 2-4 query failures per day in
> > >> our logs caused by this test. It broke because in postgresql it
> > >> subverts the way case-insensitive table and column names are
> > >> implemented (PG just converts everything to lower unless you quote
> > >> them!). My "fix" was simply to strip the quotes in that case
> > >> because I Know they can only break it. While the test is probably
> > >> technically correct, it is really just asking for trouble :-)
> > >
> > > hmm - my emerging understanding from this discussion is that
> > > that taplint test is probably technically *incorrrect*,
> > > and needs changing. Can we have a ruling from Markus
> > > (as our resident delimited-identifier-gotcha expert) on that?
> >
> > I've not actually looked at what the test really does, but the way
> > Pat describes it, TOPCAT probably shouldn't be doing it. If you go
> > back to the artwork of delimited identifier prose I sent around
> > yesterday, you'll see that no normalization takes place in delimited
> > identifiers. So, "FooBar" will indeed be distinct from "Foobar".
> >
> > Regular identifiers, however, are normalised. Postgres in spirit
> > does the right thing by just having them all lower-cased, so, even if
> > one puts FooBar (undelimited) into TAP_SCHEMA, you'd have to use
> > "foobar" to match it with a delimited identifier.
> >
> > That, however, will fail with standard SQL1992, which instead
> > stipulates that "every letter that is a lower-case letter [has to be]
> > replaced by the equivalent upper-case letter or letters" on comparing
> > regular and delimited identifiers. So, while for aestethic reasons I
> > fully support Postgres' decision to lower-case, if we followed
> > SQL1992 here, ADQL translators to Postgres would have a whole lot of
> > fun. So, I propose leaving the entire question of casiness in
> > comparisons between regular and delimited identifier open (which
> > right now it kinda is; you could say by default to sql1992 it's not,
> > in which case we're all doing it wrong).
> >
> > Hence I think STILTS would be prudent not to fuzz with quotes.
> > Services that mess up delimited identifiers in their TAP_SCHEMAS will
> > then get validation errors, which serves them right for using them in
> > the first place. With the above... shall we say, clarifications,
> > that's actually correct (if necessary, I'll ask my lawyer friends to
> > come up with a reason why they're just clarifications).
>
> Thanks. Without necessarily claiming I've followed the details here,
> it looks like it would indeed be best for taplint to take table and
> column names just as they are supplied by the TAP_SCHEMA or VOSI
> metadata, and never add or strip quotes. I will make that change
> for future releases.
Done. A pre-release taplint is available:
ftp://andromeda.star.bris.ac.uk/pub/star/stilts/pre/stilts.jar
that no longer messes around with column names by adding quotes.
It also has new tests to ensure that table and column names
from TAP_SCHEMA and the /tables endpoint are (regular or delimited)
ADQL identifers.
Future topcat releases will also work under similar assumptions
about how TAP services report their table metadata.
Mark
--
Mark Taylor Astronomical Programmer Physics, Bristol University, UK
m.b.taylor at bris.ac.uk +44-117-9288776 http://www.star.bris.ac.uk/~mbt/
More information about the dal
mailing list