ADQL and ORDER BY

Grégory Mantelet gregory.mantelet at astro.unistra.fr
Thu Feb 25 10:45:46 CET 2021


Hi Pat,

ADQL does not specify anything about collation.

According to my experience, DBMS support several collations so that the 
DB manager can choose which one is needed on specific tables.

SQL-92 does not seem to enforce one in particular ; it seems to say that 
the default collation (collating sequence) is implementation dependent. 
Based on that, I don't think ADQL should define a default collation.

Besides, generally the databases published in TAP (or any other ADQL 
services) already exist, and generally either with a default collation 
or with a specific collation required for a precise usage in the DB. So, 
we certainly do not want to make implementer change their tables' 
collation, which generally means rebuild the table and indexes....quite 
an important task if the table is large.

However, maybe a collation should be recommended on some standardized 
tables/schemas like 'obscore' and 'registry' so that ensuring having a 
common behavior when querying them ; so, for a better interoperability. 
But that would be up to the corresponding IVOA standards/documents, I think.

Tell me if I missed something.

Cheers,
Grégory


On 24/02/2021 22:33, Patrick Dowler wrote:
>
> I have just discovered (to my horror) that some of our databases sort 
> in different order than others depending on the LC_COLLATE value...  
> specifically thus is noticeable with underscore which comes before 
> letters with C but after letters with en_CA or en_US... ouch.
>
> Does ADQL specify this? Which locale (?) is assumed to be correct for 
> ADQL?
>
>
> gory details:
>
> in java, underscore comes before letters:
> groovy:000> "FOO_bar".compareTo("FOObar");
> ===> -3
> groovy:000> "FOObar".compareTo("FOO_bar");
> ===> 3
> so java is equiv to C even when system locale is en_CA
>
> In PG server 1:
> cvomaqdb=> show lc_collate;
>  lc_collate
> -------------
>  en_US.UTF-8
>
> cvomaqdb=> select 'FOO_bar' < 'FOObar';
>  f
>
> cvomaqdb=> select 'FOObar' < 'FOO_bar';
>  t
>
> In PG server 2:
> content=> show lc_collate;
>  lc_collate
> ------------
>  C
>
> content=> select 'FOO_bar' < 'FOObar';
>  t
>
> content=> select 'FOObar' < 'FOO_bar';
>  f
>
>
> --
> Patrick Dowler
> Canadian Astronomy Data Centre
> Victoria, BC, Canada

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.ivoa.net/pipermail/dal/attachments/20210225/d7c0a0fc/attachment.html>


More information about the dal mailing list