table_name syntax

Mark Taylor m.b.taylor at bristol.ac.uk
Thu May 14 11:34:42 CEST 2015


On Thu, 30 Apr 2015, Mark Taylor wrote:

> On Thu, 30 Apr 2015, Markus Demleitner wrote:
> 
> > Then, there was Mark responding to Pat (Tue, 28 Apr 2015 17:36:06
> > +0100):
> > 
> > >> PPS- tap-lint does a test where it quotes a column name it finds in
> > >> a table in the tap_schema. I noted 2-4 query failures per day in
> > >> our logs caused by this test. It broke because in postgresql it
> > >> subverts the way case-insensitive table and column names are
> > >> implemented (PG just converts everything to lower unless you quote
> > >> them!). My "fix" was simply to strip the quotes in that case
> > >> because I Know they can only break it. While the test is probably
> > >> technically correct, it is really just asking for trouble :-)
> > >
> > > hmm - my emerging understanding from this discussion is that
> > > that taplint test is probably technically *incorrrect*,
> > > and needs changing.  Can we have a ruling from Markus
> > > (as our resident delimited-identifier-gotcha expert) on that?
> > 
> > I've not actually looked at what the test really does, but the way
> > Pat describes it, TOPCAT probably shouldn't be doing it.  If you go
> > back to the artwork of delimited identifier prose I sent around
> > yesterday, you'll see that no normalization takes place in delimited
> > identifiers.  So, "FooBar" will indeed be distinct from "Foobar".
> > 
> > Regular identifiers, however, are normalised.  Postgres in spirit
> > does the right thing by just having them all lower-cased, so, even if
> > one puts FooBar (undelimited) into TAP_SCHEMA, you'd have to use
> > "foobar" to match it with a delimited identifier.
> > 
> > That, however, will fail with standard SQL1992, which instead
> > stipulates that "every letter that is a lower-case letter [has to be]
> > replaced by the equivalent upper-case letter or letters" on comparing
> > regular and delimited identifiers.  So, while for aestethic reasons I
> > fully support Postgres' decision to lower-case, if we followed
> > SQL1992 here, ADQL translators to Postgres would have a whole lot of
> > fun.  So, I propose leaving the entire question of casiness in
> > comparisons between regular and delimited identifier open (which
> > right now it kinda is; you could say by default to sql1992 it's not,
> > in which case we're all doing it wrong).
> > 
> > Hence I think STILTS would be prudent not to fuzz with quotes.
> > Services that mess up delimited identifiers in their TAP_SCHEMAS will
> > then get validation errors, which serves them right for using them in
> > the first place.  With the above... shall we say, clarifications,
> > that's actually correct (if necessary, I'll ask my lawyer friends to
> > come up with a reason why they're just clarifications).
> 
> Thanks.  Without necessarily claiming I've followed the details here,
> it looks like it would indeed be best for taplint to take table and
> column names just as they are supplied by the TAP_SCHEMA or VOSI
> metadata, and never add or strip quotes.  I will make that change
> for future releases.

Done.  A pre-release taplint is available:

   ftp://andromeda.star.bris.ac.uk/pub/star/stilts/pre/stilts.jar

that no longer messes around with column names by adding quotes.
It also has new tests to ensure that table and column names
from TAP_SCHEMA and the /tables endpoint are (regular or delimited)
ADQL identifers.

Future topcat releases will also work under similar assumptions
about how TAP services report their table metadata.

Mark

--
Mark Taylor   Astronomical Programmer   Physics, Bristol University, UK
m.b.taylor at bris.ac.uk +44-117-9288776  http://www.star.bris.ac.uk/~mbt/


More information about the dal mailing list