ADQL-2.1 internal draft

Markus Demleitner msdemlei at ari.uni-heidelberg.de
Thu Jun 11 09:40:00 CEST 2015


Dear DAL,

On Wed, Jun 10, 2015 at 01:59:00PM -0700, Walter Landry wrote:
> Markus Demleitner <msdemlei at ari.uni-heidelberg.de> wrote:
> > On Wed, Jun 10, 2015 at 01:39:34AM -0700, Walter Landry wrote:
> >>   IN_UNIT(M,"solMass") < 10
> > 
> > That would work if M is a mass.  The translation layer would turn
> > this into 
> > 
> > M
> > 
> > if it worked out that M is in solar masses
> 
> Isn't this undecidable when you use a function like pow(x,y)?  You
> have to actually run the query and look at the results before you can
> verify the units.

My original proposal was to *require* in_unit, but require sane
behaviour only on simple column references.  Working on full
expressions would not be required, but implementors would be free
to try it.  Implementations will have to refuse many
constructs anyway; for instance, stuff like in_unit(parallax+magV,
"mas") must be rejected (I admit the current error message in
http://dc.zah.uni-heidelberg.de/__system__/adql/query/form?__nevow_form__=genForm&query=select%20in_unit(pmra%2Bepde%2C%20'mas%2Fyr')%20from%20fk6.fk6join&_TIMEOUT=5&_FORMAT=HTML&submit=Go
needs work, but I guess you get the idea).
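
To make "sane behaviour only on simple column references" a bit more
concrete, here is a minimal sketch of what a translation layer might do.
Everything in it is an assumption made for illustration: the column
metadata table, the tiny conversion table, and the function name
rewrite_in_unit are hypothetical and not taken from any existing
implementation.

```python
# Hypothetical column metadata, as a service might keep it for VOTable output.
COLUMN_UNITS = {"m": "solMass", "parallax": "mas"}

# Hypothetical conversion table: unit -> (physical dimension, factor to base unit).
TO_BASE = {
    "solMass": ("mass", 1.0),
    "mas": ("angle", 1.0),
    "arcsec": ("angle", 1000.0),
}


def rewrite_in_unit(column, target_unit):
    """Return an SQL fragment for in_unit(column, target_unit).

    Only plain column references are handled; anything else is refused.
    """
    if not column.isidentifier() or column not in COLUMN_UNITS:
        raise ValueError(f"in_unit only supports known plain columns: {column!r}")
    source_unit = COLUMN_UNITS[column]
    if source_unit not in TO_BASE or target_unit not in TO_BASE:
        raise ValueError("unknown unit")
    src_dim, src_factor = TO_BASE[source_unit]
    dst_dim, dst_factor = TO_BASE[target_unit]
    if src_dim != dst_dim:
        raise ValueError(f"cannot convert {source_unit} to {target_unit}")
    if source_unit == target_unit:
        # Already in the requested unit: the call collapses to the column.
        return column
    return f"({column} * {src_factor / dst_factor!r})"
```

With this table, rewrite_in_unit("m", "solMass") yields just m, while
an expression such as "parallax+magV" is refused because it is not a
plain column reference.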

pow(x,y) is an example that simply doesn't allow unit annotation.  In
our data model (i.e., VOTable), units are given globally per column,
and thus that expression cannot be reliably given a unit.  That's not
a problem specific to in_unit; as TAP engines should give the units
in their VOTable output, they should have unit annotations anyway.
If they have that, in_unit isn't terribly expensive.

> >> >> 5) Why are we making new function names BIT_AND, BIT_OR, etc?  Why not
> >> >>    just use the operators?  It is what everyone but Oracle and
> >> >>    Informix implement, and they use different names anyway.
> >> > 
> >> > ...but operators are harder to parse (precedence!  Left-recursive
> >> > rules!), and it's easier for a machine to go from prefix notation to
> >> > infix than the other way round.  I'm all for functions.
> >> 
> >> We have to parse the operators anyway.  And it is really not
> >> that hard.
> > 
> > It's not hard, but it's a bunch of extra rules;
> 
> I am confused.  The spec says that we have to parse operators.  Are
> you suggesting modifying the proposal to get rid of the operators?
> That would be gratuitously incompatible with MS SQL, Postgres, MySQL,
> and SQLite.

But mapping functions to operators is trivial, whereas the other way
round requires actual parsing and grammar rules.
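
As a rough illustration of the "trivial" direction, the sketch below
renders function calls as fully parenthesised infix expressions.  It
assumes the calls are already available as nested tuples, which is what a
parser would hand over anyway.  BIT_AND and BIT_OR are the names under
discussion; BIT_XOR, BIT_NOT, and the operator spellings are assumptions,
and the XOR spelling in particular differs between back ends (# in
PostgreSQL, ^ in MySQL).

```python
# Assumed mapping from function names to back-end operators.
BINARY_OPS = {"BIT_AND": "&", "BIT_OR": "|", "BIT_XOR": "#"}
UNARY_OPS = {"BIT_NOT": "~"}


def to_infix(node):
    """Render a call tree such as ("BIT_AND", "a", "b") as infix SQL.

    Leaves are plain strings (column names or literals).  Every operator
    application is parenthesised, so precedence never matters.
    """
    if isinstance(node, str):
        return node
    name, *args = node
    rendered = [to_infix(arg) for arg in args]
    if name in BINARY_OPS and len(rendered) == 2:
        return f"({rendered[0]} {BINARY_OPS[name]} {rendered[1]})"
    if name in UNARY_OPS and len(rendered) == 1:
        return f"({UNARY_OPS[name]}{rendered[0]})"
    # Anything else stays an ordinary function call.
    return f"{name}({', '.join(rendered)})"
```

For example, to_infix(("BIT_AND", "flags", ("BIT_OR", "a", "b"))) gives
(flags & (a | b)).  Going from infix input to such a tree is the part
that needs precedence rules in the grammar.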

So, yes, I am suggesting we do away with the operators, and I suspect
you will do so, too, once you've started to fiddle the rules into
the BNF grammar.

Whatever we do, though, I agree there should not be *both* functions
and operators, and furthermore I suggest whatever we go for should be
mandatory.

I'm not at all enthusiastic about the slew of optional features
introduced here anyway -- they're a nightmare for clients, as
figuring out what "effective grammar" a given service supports is
going to become very hard for them.

I believe we should sit together and identify what we actually can
and want to require.  We cannot (or at least should not) forbid
services from accepting additional constructs, so extra features on
individual services are always possible; clients and their users,
however, would know what they can rely upon (or at least, when they
can complain to the service providers).

I admit it's nice to have standard names for certain extensions, so
there is some merit in enumerating them.  However, if we make
features with impact on the grammar optional, we're entering a
combinatorial catastrophe.  The grammar impact of features like the
set operators or common table expressions isn't isolated to one
"standard" rule needing a change and a set of extra rules otherwise
ignored; fairly typically several standard rules are impacted.  In the
end, even allowing quite a bit of after-parse logic, the current spec
induces at least eight nontrivially different grammars.

I strongly suspect that would be the end of TOPCAT's nice
syntax-sensitive ADQL input window, and I submit that's an indication
of a problem.

Cheers,

           Markus


