ADQL-2.1 internal draft

Thu Jun 11 10:46:13 CEST 2015

Hi all,

2015-06-11 0:05 GMT+02:00 Walter Landry <wlandry at caltech.edu>:
> Marco Molinaro <molinaro at oats.inaf.it> wrote:
>> 2015-06-10 1:53 GMT+02:00 Walter Landry <wlandry at caltech.edu>:
>>> Marco Molinaro <molinaro at oats.inaf.it> wrote:
>>>    The functions have different arity, so it is easy to distinguish
>>>    between them in the parser.  In general, empty strings feel odd to me.
>>
>> It may be odd, but this solution was adopted because of
>> backward-compatibility constraints.
>> That's why from ADQL-2.1 on, until a major revision, the first string,
>> if not empty, can be ignored by servers,
>> and clients are encouraged to pass an empty one. I.e. that parameter is
>> deprecated from revision 2.1 on.
>
> I am not suggesting removing the 2.0 version with a coordinate system.
> I am suggesting adding an overload and not having an implicit meaning
> for an empty string.  That would retain the property of the current
> proposal that all 2.0 queries are valid in 2.1, but not all 2.1
> queries are valid in 2.0.

But while function overloading is supported by PostgreSQL, it is not by
MySQL, for example.
Now, it seems everyone is parsing and rewriting ADQL queries anyway, so this
can be seen as a weak reason to avoid it.
However, I don't consider introducing the overloading concept in an ADQL
minor revision a light addition.
Again, I feel this point would be easier to defend in a major revision
(even if at that point we could simply discard the current function
signature).
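
To make the "everyone is rewriting anyway" argument concrete, here is a minimal
sketch of how a translation layer could resolve the two forms by arity before
the query reaches the database. It assumes POINT as the example geometry
function; rewrite_point and backend_point are hypothetical names, not part of
any existing ADQL library.

```python
# Hypothetical sketch: resolve a geometry call by arity in the ADQL
# translation layer, so the backend never needs function overloading.
# rewrite_point and backend_point are illustrative names only.

def rewrite_point(args):
    """Return a backend call string for a POINT call given its argument list."""
    if len(args) == 3:
        # ADQL-2.0 form: the leading coordinate-system string is dropped,
        # since the 2.1 draft allows servers to ignore it.
        _coosys, lon, lat = args
    elif len(args) == 2:
        # Overloaded form suggested in this thread: no coordinate system.
        lon, lat = args
    else:
        raise ValueError("POINT expects 2 or 3 arguments, got %d" % len(args))
    # The backend only ever sees one fixed-arity function.
    return "backend_point(%s, %s)" % (lon, lat)
```

With this kind of rewriting, rewrite_point(["'ICRS'", "ra", "dec"]) and
rewrite_point(["ra", "dec"]) produce the same backend call, so the missing
overload support in a given database would not be visible to it.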

> <snip>
>
>> Pasting here also the other two points you made, i.e. LOWER/UPPER and ILIKE.
>> Probably there was not full discussion on them.
>>
>> LOWER/UPPER Initially they were set as optional for this revision, but
>> there was also the point made that it would be better to have them
>> mandatory...and also to have only one of them to help with tables
>> indexing.
>> Probably this is something to discuss.
>
> If anything we should be normalizing to upper case.  There are some
> letters that do not round trip properly through lower case.

<cut>

The same can be true the other way around (scharfes S, e.g., even if
Unicode has an uppercase letter for it), but again I don't think the
intent in adding these functions was to support all encodings and
character sets; the aim was to handle comparisons correctly for things
like UCDs and Utypes.
That's why ASCII was considered to be enough.
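
As an illustration of the round-trip issue and of why an ASCII-only fold
sidesteps it, here is a small Python sketch. It relies on Python's built-in
Unicode case mappings; ascii_lower and the two round-trip helpers are
hypothetical names, and the UCD-like strings are just sample inputs.

```python
import string

# Translation table touching only A-Z; every other character is left alone.
_ASCII_LOWER = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)

def ascii_lower(text):
    """Lower-case ASCII letters only, as would suffice for UCDs and Utypes."""
    return text.translate(_ASCII_LOWER)

def round_trips_via_upper(text):
    """True if upper-casing and then lower-casing gives the text back."""
    return text.upper().lower() == text

def round_trips_via_lower(text):
    """True if lower-casing and then upper-casing gives the text back."""
    return text.lower().upper() == text
```

With full Unicode mappings, round_trips_via_upper("\u00df") (scharfes S) is
false, and so is round_trips_via_lower("\u1e9e") (its uppercase form), so
neither direction is safe in general. For plain ASCII input such as
"meta.id;meta.main" both checks hold, and ascii_lower leaves non-ASCII
characters untouched.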

>
> This still does not specify whether it is UTF-8, UTF-16, or UCS-32.  I
> think we should just choose one, with my vote being UTF-8 since ASCII
> is unchanged.

There was already a discussion on VOTable regarding UTF-8 and encodings.
I don't have a specific position here. I only have a preference for
keeping things simple.
If you (all) say we had better specify an encoding and that it should be
UTF-8, we can probably put it in the spec.
If you feel these functions should be restricted in their usage, then
we have to express this clearly in the document.
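
For reference, the "ASCII is unchanged" property of UTF-8 can be checked with
a few lines of Python. This is only a sketch; encoded_sizes is a hypothetical
helper and the UCD-like string is a sample input.

```python
def encoded_sizes(text):
    """Return the byte length of text in UTF-8, UTF-16 and UTF-32."""
    return {
        "utf-8": len(text.encode("utf-8")),
        # The -le variants are used so no byte-order mark is prepended.
        "utf-16": len(text.encode("utf-16-le")),
        "utf-32": len(text.encode("utf-32-le")),
    }

def utf8_equals_ascii(text):
    """True if text is pure ASCII and its UTF-8 bytes match its ASCII bytes."""
    try:
        return text.encode("utf-8") == text.encode("ascii")
    except UnicodeEncodeError:
        # Non-ASCII characters cannot be encoded as ASCII at all.
        return False
```

For a pure ASCII string such as "pos.eq.ra;meta.main", utf8_equals_ascii is
true and the UTF-8 size equals the character count, while UTF-16 and UTF-32
take two and four bytes per character respectively.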

Personally, I don't think the specific examples above (or similar ones)
apply to the comparisons these newly introduced functions
were meant for.

Cheers,
     Marco