table_name syntax
Dave Morris
dave.morris at metagrid.co.uk
Tue Apr 28 17:26:19 CEST 2015
Hi Markus,
Comments inline ..
On 2015-04-28 08:14, Markus Demleitner wrote:
>
> [.. Lots of permuatations of putting quotes in various places ...]
>
>> Which makes it hard to search for table names that start with a
>> specific
>> pattern,
>>
>> SELECT .. WHERE schema_name = 'twomass' AND table_name LIKE 'ps%'
>>
>> would only match one of the possible combinations.
>
> True. And IMHO a strong suggestions we should discourage the use of
> delimited identifiers in general and in table names in particular.
> And then essentially say: "If you absolutely must have them, this is
> what to do". Which would include: Always use regular identifiers in
> lower case (i.e., in particular: no quotes in the literals) unless
> you absolutely have no choice, in which chase put quotes around all
> literals, including quoting inner quotes.
>
"The following section describes aspects of the language that the
authors of this document would prefer you not to use" - !?
One of our biggest data providers has a huge number of tables and
columns, almost all of which will need quoted identifiers.
So however much we would prefer not to, we need to find a way to solve
this issue in a reliable and robust way.
The problem with searching for table name starting with 'foo%' is not
just due to quoted/unquoted, it is also due to qualified/unqualified
names.
Likewise for a client that needs to build the schema->table->column
tree, the combination of quoted/unquoted and qualified/unqualified in
the table_name makes it hard to process.
Providing the raw values in separate columns makes searching for
matching names simple.
SELECT .. WHERE raw_schema_name = 'twomass' AND raw_table_name LIKE
'ps%'
SELECT .. WHERE raw_schema_name = 'viz4' AND raw_table_name LIKE
'B/avo.rad%'
No need to discourage anything.
The existing table_name column provides the 'ready for use'
quoted/unquoted, qualified/unqualified names.
The raw columns provide the values needed for text searches and for
building the tree of schema->table->column.
The raw columns would also enable an external verification service to
check that the 'ready for use' values have been correctly formatted.
>
> -1 from me. I think it's fine if we mandate regular identifiers
> without quotes and delimited identifiers with quotes. People doing
> discovery probably lavishly add wildcards anyway, and if we, as I
> think we should, regard delimited identifiers as desperate emergency
> measures, we should not complicate our data structures just to
> accomodate them.
>
As a client trying to analyse the data and build the tree.
Please, just provide the raw, unqualified, unquoted names in separate
columns.
>
>> Note - the wording of the current TAP specification for
>> TAP_SCHEMA.schemas,
>> TAP_SCHEMA.tables, and TAP_SCHEMA.columns says 'must include the
>> following
>> columns'. It does not exclude the tables from having additional
>> columns. So
>> adding new columns to the new version of the specification will not
>> break
>> existing implementations.
>
> Right -- so, for VizieR it may make sense to add raw_* columns. For
> other services, clients couldn't count on them anyway, so I don't
> think they'd be helped a lot by this addition.
>
If the raw name columns are added to the TAP-1.1 specification, and a
service advertises a TAP-1.1 capability, the client should be able to
rely on the raw names being there.
The 'ready to use' qualified/unqualified, quoted/unquoted values in
table_name are ideal for a client that just wants to get a simple string
that they can use in a query without further processing. Which is what I
think Mark is looking for.
In answer to Mark's questions - yes, the 'ready to use' values should be
quoted both to enclose special characters and to avoid clashes with ADQL
reserved words.
However, the resulting permutations and combinations makes it hard for a
client that needs the raw names to recover them from the complex string.
Our service needs to build the tree of schema->table->column, which
means we need the raw names.
Reverse engineering the raw names from the mixed qualified/unqualified,
quoted/unquoted string is messy.
> I could see a regulation "If you use delimited identifiers a lot,
> consider adding these columns to TAP_SCHEMA," so particularly
> diligent clients would know where to look (TAP_SCHEMA is described in
> TAP_SCHEMA...).
>
https://www.youtube.com/watch?v=pXw7LYWNi5E
Please, just provide the raw, unqualified, unquoted names in separate
columns.
However rare they are, however much we try to discourage their use, if
the specification allows the different combinations of quoted/unquoted
qualified/unqualified, we still end up having to write code that
untangles them to get at the original raw values.
The world would be a much simpler place if TAP_SCHEMA just included the
raw values as well.
Thanks,
Dave
--------
Dave Morris
Software Developer
Wide Field Astronomy Unit
Institute for Astronomy
University of Edinburgh
--------
More information about the dal
mailing list