ADQL and ORDER BY
Grégory Mantelet
gregory.mantelet at astro.unistra.fr
Thu Feb 25 10:45:46 CET 2021
Hi Pat,
ADQL does not specify anything about collation.
According to my experience, DBMS support several collations so that the
DB manager can choose which one is needed on specific tables.
SQL-92 does not seem to enforce one in particular ; it seems to say that
the default collation (collating sequence) is implementation dependent.
Based on that, I don't think ADQL should define a default collation.
Besides, generally the databases published in TAP (or any other ADQL
services) already exist, and generally either with a default collation
or with a specific collation required for a precise usage in the DB. So,
we certainly do not want to make implementer change their tables'
collation, which generally means rebuild the table and indexes....quite
an important task if the table is large.
However, maybe a collation should be recommended on some standardized
tables/schemas like 'obscore' and 'registry' so that ensuring having a
common behavior when querying them ; so, for a better interoperability.
But that would be up to the corresponding IVOA standards/documents, I think.
Tell me if I missed something.
Cheers,
Grégory
On 24/02/2021 22:33, Patrick Dowler wrote:
>
> I have just discovered (to my horror) that some of our databases sort
> in different order than others depending on the LC_COLLATE value...
> specifically thus is noticeable with underscore which comes before
> letters with C but after letters with en_CA or en_US... ouch.
>
> Does ADQL specify this? Which locale (?) is assumed to be correct for
> ADQL?
>
>
> gory details:
>
> in java, underscore comes before letters:
> groovy:000> "FOO_bar".compareTo("FOObar");
> ===> -3
> groovy:000> "FOObar".compareTo("FOO_bar");
> ===> 3
> so java is equiv to C even when system locale is en_CA
>
> In PG server 1:
> cvomaqdb=> show lc_collate;
> lc_collate
> -------------
> en_US.UTF-8
>
> cvomaqdb=> select 'FOO_bar' < 'FOObar';
> f
>
> cvomaqdb=> select 'FOObar' < 'FOO_bar';
> t
>
> In PG server 2:
> content=> show lc_collate;
> lc_collate
> ------------
> C
>
> content=> select 'FOO_bar' < 'FOObar';
> t
>
> content=> select 'FOObar' < 'FOO_bar';
> f
>
>
> --
> Patrick Dowler
> Canadian Astronomy Data Centre
> Victoria, BC, Canada
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.ivoa.net/pipermail/dal/attachments/20210225/d7c0a0fc/attachment.html>
More information about the dal
mailing list