<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
  </head>
  <body>
    <div class="moz-cite-prefix">Hi Pat,<br>
      <br>
      ADQL does not specify anything about collation.<br>
      <br>
      According to my experience, DBMS support several collations so
      that the DB manager can choose which one is needed on specific
      tables. <br>
      <br>
      SQL-92 does not seem to enforce one in particular ; it seems to
      say that the default collation (collating sequence) is
      implementation dependent. Based on that, I don't think ADQL should
      define a default collation.<br>
      <br>
      Besides, generally the databases published in TAP (or any other
      ADQL services) already exist, and generally either with a default
      collation or with a specific collation required for a precise
      usage in the DB. So, we certainly do not want to make implementer
      change their tables' collation, which generally means rebuild the
      table and indexes....quite an important task if the table is
      large.<br>
      <br>
      However, maybe a collation should be recommended on some
      standardized tables/schemas like 'obscore' and 'registry' so that
      ensuring having a common behavior when querying them ; so, for a
      better interoperability. But that would be up to the corresponding
      IVOA standards/documents, I think.<br>
      <br>
      Tell me if I missed something.<br>
      <br>
      Cheers,<br>
      Grégory<br>
      <br>
      <br>
      On 24/02/2021 22:33, Patrick Dowler wrote:<br>
    </div>
    <blockquote type="cite"
cite="mid:CAFK8nrpyoXXXSS7DBVA4p=zs7Uvp95AdEyPQwmoMvP3r24PHmw@mail.gmail.com">
      <meta http-equiv="content-type" content="text/html; charset=UTF-8">
      <div dir="ltr">
        <div class="gmail_default" style="font-size:small"><br>
        </div>
        <div class="gmail_default" style="font-size:small">I have just
          discovered (to my horror) that some of our databases sort in
          different order than others depending on the LC_COLLATE
          value...  specifically thus is noticeable with underscore
          which comes before letters with C but after letters with en_CA
          or en_US... ouch.</div>
        <div class="gmail_default" style="font-size:small"><br>
        </div>
        <div class="gmail_default" style="font-size:small">Does ADQL
          specify this? Which locale (?) is assumed to be correct for
          ADQL? <br>
        </div>
        <div class="gmail_default" style="font-size:small"><br>
        </div>
        <div class="gmail_default" style="font-size:small"><br>
        </div>
        <div class="gmail_default" style="font-size:small">gory details:</div>
        <div class="gmail_default" style="font-size:small"><br>
        </div>
        <div class="gmail_default" style="font-size:small">in java,
          underscore comes before letters:<br>
        </div>
        <div class="gmail_default" style="font-size:small">groovy:000&gt;
          "FOO_bar".compareTo("FOObar");<br>
          ===&gt; -3</div>
        <div class="gmail_default" style="font-size:small">groovy:000&gt;
          "FOObar".compareTo("FOO_bar");<br>
          ===&gt; 3</div>
        <div class="gmail_default" style="font-size:small">so java is
          equiv to C even when system locale is en_CA<br>
        </div>
        <div class="gmail_default" style="font-size:small"><br>
        </div>
        <div class="gmail_default" style="font-size:small">In PG server
          1:</div>
        <div class="gmail_default" style="font-size:small">cvomaqdb=&gt;
          show lc_collate;<br>
           lc_collate  <br>
          -------------<br>
           en_US.UTF-8<br>
          <br>
          cvomaqdb=&gt; select 'FOO_bar' &lt; 'FOObar';<br>
           f<br>
          <br>
          cvomaqdb=&gt; select 'FOObar' &lt; 'FOO_bar';<br>
           t</div>
        <div>
          <div dir="ltr" class="gmail_signature"
            data-smartmail="gmail_signature">
            <div dir="ltr">
              <div>
                <div dir="ltr">
                  <div>
                    <div><br>
                    </div>
                    <div>
                      <div style="font-size:small" class="gmail_default">In
                        PG server 2:</div>
                      <div style="font-size:small" class="gmail_default">content=&gt;
                        show lc_collate;<br>
                         lc_collate <br>
                        ------------<br>
                         C<br>
                        <br>
                        content=&gt; select 'FOO_bar' &lt; 'FOObar';<br>
                         t<br>
                        <br>
                        content=&gt; select 'FOObar' &lt; 'FOO_bar';<br>
                         f<br>
                      </div>
                      <br>
                    </div>
                    <div><br>
                    </div>
                    <div>--<br>
                    </div>
                    <div>Patrick Dowler<br>
                    </div>
                    Canadian Astronomy Data Centre<br>
                  </div>
                  Victoria, BC, Canada<br>
                </div>
              </div>
            </div>
          </div>
        </div>
      </div>
    </blockquote>
    <br>
  </body>
</html>