ADQL-2.1 PR now available

Thu Jan 25 11:22:34 CET 2018

Hi DAL,

I'm doing one reply for a couple of points here.  I suspect any
ongoing discussion should start different threads for the different
points.

On Wed, Jan 24, 2018 at 10:35:21PM +0000, Mark Taylor wrote:
> On Mon, 15 Jan 2018, Marco Molinaro wrote:
> > http://ivoa.net/Documents/ADQL/20180112/index.html
> sec 4.2.15:
>     Since the deprecated, optional coordsys parameter has been
>     excised from all the examples, the COORDSYS function is now
>     rather pointless, and in particular the example in this section:
> 
>         COORDSYS(
>             POINT(
>                 25.0,
>                -19.5
>                )
>            )
> 
>     looks a bit silly or incomprehensible.  I'd suggest using an example
>     with a non-empty coordsys argument to POINT (e.g. as in the v2.00
>     document), but note again that this usage is deprecated.

Well, yes.  But I'm unhappy with 4.2.15 (the section defining
COORDSYS) anyway.  First:

  "Details of the coordinate system for a database column are
  available as part of the service metadata, available via the
  TAP_SCHEMA tables defined in the TAP specification and the /tables
  webservice response defined in the VOSI specification."

That's not true -- neither currently has a means of communicating
machine-readable information on coordinate systems, and teaching it
to them would at least double the specification length for either of
them.  

I'd say (and DaCHS does this[1]) that coordinate system metadata
simply is part of the response.  This currently means that in VOTable
responses, people should include COOSYS, and there was STC-in-VOTable
(which was almost universally ignored), and perhaps some day there'll
be STC2 and the mapping spec. 
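
As an illustration of reading coordinate system metadata from the
response itself, here is a minimal Python sketch that maps each FIELD
to the COOSYS it references.  The sample VOTable is hand-written for
this example (it is not the output of any service), and the helper
names local_name and coosys_by_field are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical, hand-written VOTable fragment for illustration only.
SAMPLE_VOTABLE = """<VOTABLE version="1.3"
    xmlns="http://www.ivoa.net/xml/VOTable/v1.3">
  <RESOURCE type="results">
    <COOSYS ID="system" system="ICRS" epoch="J2000.0"/>
    <TABLE>
      <FIELD name="raj2000" datatype="double" unit="deg" ref="system"/>
      <FIELD name="dej2000" datatype="double" unit="deg" ref="system"/>
      <FIELD name="comment" datatype="char" arraysize="*"/>
    </TABLE>
  </RESOURCE>
</VOTABLE>"""


def local_name(tag):
    """Strip an XML namespace, turning '{uri}COOSYS' into 'COOSYS'."""
    return tag.rsplit("}", 1)[-1]


def coosys_by_field(votable_text):
    """Map each FIELD name to the 'system' of the COOSYS it references.

    FIELDs that do not reference a COOSYS map to None.
    """
    root = ET.fromstring(votable_text)

    # First pass: collect all COOSYS declarations by their ID.
    systems = {}
    for el in root.iter():
        if local_name(el.tag) == "COOSYS" and el.get("ID") is not None:
            systems[el.get("ID")] = el.get("system")

    # Second pass: resolve each FIELD's ref attribute, if any.
    result = {}
    for el in root.iter():
        if local_name(el.tag) == "FIELD":
            ref = el.get("ref")
            result[el.get("name")] = systems.get(ref) if ref else None
    return result


if __name__ == "__main__":
    print(coosys_by_field(SAMPLE_VOTABLE))
```

The sketch ignores the XML namespace on purpose so that it does not
depend on a particular VOTable version.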

Given the depressing state that STC in the VO still is in, I'd say
ADQL shouldn't add any further confusion and just say:

  4.2.15 COORDSYS

  COORDSYS was intended to let clients extract a coordinate system
  identifier from a geometry.  Implementors SHOULD, where available,
  still return the coordinate system identifier when the geometry in
  its argument was constructed with a meaningful one and NULL in all
  other cases.  Clients are advised to avoid the use of this
  function.  

  This function is marked as deprecated by this specification and may
  be removed in a future version.
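
The behaviour proposed in this text could be sketched in Python as
follows.  This is only an illustration: the Point class is a
hypothetical stand-in for an ADQL geometry value, and reading
"meaningful" as "a non-empty string" is an assumption, not something
the proposed text pins down.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Point:
    """Hypothetical stand-in for an ADQL POINT value."""
    lon: float
    lat: float
    # The deprecated coordsys argument, if the geometry was built with one.
    coordsys: Optional[str] = None


def coordsys(geometry):
    """Return the coordinate system identifier, or None for SQL NULL.

    Assumption: an absent, empty, or whitespace-only identifier counts
    as "not meaningful".
    """
    ident = geometry.coordsys
    if ident is None or not ident.strip():
        return None
    return ident


if __name__ == "__main__":
    print(coordsys(Point(25.0, -19.5)))          # built without coordsys
    print(coordsys(Point(25.0, -19.5, "ICRS")))  # built with coordsys
```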

> sec 4.3.1:
>     A reserved namespace is proposed:
> 
>        "The ivo prefix is reserved for functions that have been
>         defined in an IVOA specification."
> 
>     I think that's not a bad idea, but I believe that in
>     at least some cases current practice violates it; the DaCHS
>     service at http://dc.g-vo.org/tap defines e.g., ivo_healpix_center,
>     ivo_interval_overlaps, ivo_healpix_index, ivo_apply_pm.
>     I've heard Markus defend that practice elsewhere, so I guess
>     he should comment.

Well... As usual we don't really have a process for how to phase in
these things, and "IVOA specification" isn't a well defined term
either (WD?  PR?  REC?  Note?).

The way I'd like this to work is: If two or more implementors agree
on a function pattern (which was the case for the healpix functions,
but admittedly not for apply_pm), they'd use ivo_ and ideally put up
their proposal for discussion here.

In particular, I don't think it should be a validity criterion that a
service doesn't have any ivo_ UDFs that aren't in a REC.

Here's my rationale:  It's true that in an ideal world, a feature
would start its existence as gavo_healpix_center and (say)
cadc_healpix_center, would be tested and evaluated on that basis, and
eventually some common standard would develop.  Then everyone would
switch, the existing *_healpix_center functions would vanish, and
ivo_healpix_center would reign supreme.

But not even in the HTML/CSS community, many orders of magnitude larger
and better funded than we are, does this work terribly well.  I keep
being amazed how many people still use -moz- extensions in CSS rules
even though standard equivalents have been out for ages, and the
browser makers can't get rid of their custom properties.  Confusion
reigns.  Mozilla's take on this
(https://developer.mozilla.org/en-US/docs/Glossary/Vendor_Prefix):

  Browser vendors are working to stop using vendor prefixes for
  experimental features. Web developers have been using them on
  production Web sites, despite their experimental nature. This has
  made it more difficult for browser vendors to ensure compatibility
  and to work on new features; it's also been harmful to smaller
  browsers who wind up forced to add other browsers' prefixes in
  order to load popular web sites.

So, I'd say we should trust the operator community and informally say
"come on, if you want to have a standardised UDF, post something on
the DAL list and forge some rough agreement, then just go ahead and
implement your ivo_ UDF at as many places as possible.  Try to put up
at least a Note describing your consensus."

I think the current text is marginally compatible with that pragmatic
reading.  If we could provide a few more hints that that's what we'd
like to see, I'd not protest.

> sec 4.3.2:
>     This section is repeating text from TAPRegExt (would it be better
>     just to reference it?), but the context in this document raises

I'm all for having it in just one place.  With my TAPRegExt author
hat on, I'm officially offering to reference ADQL (where this kind of
thing arguably belongs, as it's a language feature *of ADQL*) from
there rather than the other way round.  

I don't care much, though, and I'm happy to keep it in TAPRegExt as
well.  There *is* one reason to keep it there I can see: The type of
the language feature is

  ivo://ivoa.net/std/TAPRegExt#features-udf

and we can't really change this string in a minor version.  Now,
there has been a tacit agreement that a StandardKey can only enter a
standard's record if it's mentioned in the standard itself.  This
would mean that ADQL has no business defining things in TAPRegExt's
registry record.

However, there's no standard that codifies that tacit agreement, and
I've always wanted to disagree.  Indeed, StandardsRegExt itself says:

  While keys can be defined as part of a vstd:Standard or
  vstd:ServiceStandard resource, the vstd:StandardKeyEnumeration
  allows a set of key definitions to stand as a resource on its own,
  regardless of whether it is part of a documented standard or not.

When I wrote TAPRegExt, I would have much preferred to have several
of the keys in the records of the standards they actually pertain to,
i.e., TAP or ADQL.  Back then, we didn't do that.  Had we, this would
be easier.

But I'd say let's learn from this experience and just let ADQL define
terms in TAPRegExt's standard record.  We could say something like:

  For historical reasons, the feature type of UDFs is defined in
  TAPRegExt's registry record.  They are still a feature of ADQL.
  Other SQL-based query languages are advised to adopt the feature
  type, as we have tried to design it to be generic enough for them,
  too.

in the ADQL spec.

>     a couple of points.
>     First:
> 
>         arglist ::= "(" <arg> { "," <arg> } ")"
> 
>     I think should (according to the notation defined in sec 2 of
>     this document) read:
> 
>         arglist ::= "(" <arg> { "," <arg> }... ")"
> 
>     That copied text is not *necessarily* a mistake in the TAPRegExt
>     document, since TAPRegExt doesn't define its BNF notation.

Yeah -- although I think the SQL92 flavour of EBNF, with curly braces
and the ellipsis is confusing (in most EBNFs the curly braces mean
"zero or more" by themselves), we can't really move away from SQL92's
customs in ADQL.  Let's do as Mark says.

>     Second:
> 
>         <form>match(pattern TEXT, string TEXT) -> INTEGER</form>
> 
>     having gone to the trouble of defining types in sec 3, shouldn't
>     they be used, at least by way of example, rather than the
>     undefined token "TEXT" here?  Though I'm not sure I've grasped
>     the purpose of the type system, so I may be missing the point.

+1 on this (even though I spit out TEXT there left and right).  I
promise to fix my definitions once ADQL defines what should really be
there.
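
To make the shape of such a form concrete, here is a minimal Python
sketch that splits a signature like the one quoted above into function
name, arguments, and return type.  The function name parse_signature
and the simplified identifier pattern are assumptions for this
example; type names are passed through uninterpreted, and types that
contain commas are not handled.

```python
import re

# Simplified regular identifier: a letter, then letters, digits, underscores.
_IDENT = r"[A-Za-z][A-Za-z0-9_]*"

# name "(" args ")" "->" return-type
_SIGNATURE = re.compile(
    rf"^\s*(?P<name>{_IDENT})\s*\((?P<args>.*)\)\s*->\s*(?P<ret>.+?)\s*$"
)

# A single argument: identifier, whitespace, type name.
_ARG = re.compile(rf"^\s*(?P<name>{_IDENT})\s+(?P<type>.+?)\s*$")


def parse_signature(form):
    """Split a UDF form into (name, [(arg_name, arg_type), ...], return_type).

    Raises ValueError if the form does not have the expected shape.
    At least one argument is required, as in the quoted arglist rule.
    """
    match = _SIGNATURE.match(form)
    if match is None:
        raise ValueError(f"not a UDF signature: {form!r}")

    args = []
    for raw in match.group("args").split(","):
        arg = _ARG.match(raw)
        if arg is None:
            raise ValueError(f"bad argument {raw!r} in {form!r}")
        args.append((arg.group("name"), arg.group("type")))

    return match.group("name"), args, match.group("ret")


if __name__ == "__main__":
    print(parse_signature("match(pattern TEXT, string TEXT) -> INTEGER"))
```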

> 
> sec 4.7.1:
>        "The CAST() function returns the value of the first argument
>         converted to the datatype specified by the second argument."
> 
>     I can't tell from this *how* you are supposed to specify the
>     datatype, e.g. is it quoted or unquoted?  CAST doesn't appear
>     in the BNF either, so I can't find out from that.

Uh-huu.  If CAST doesn't have a BNF rule, I'm against including it in
the standard.  Has anyone implemented that and can donate the
grammar?  If not, let's throw it out on grounds of lack of
implementation experience.

> sec 4.8.1:
>     The PDF includes the text:
> 
>        "... formatting defined in the VOUnits specification (Derriere
>         and Gray et al. (2014))."
> 
>     but in the HTML it looks like:
> 
>        "... formatting defined in the ."
> 
>     I haven't checked if there are other similar instances of
>     missing references in the HTML.

That's because TTH -- the thing ivoatex uses to generate HTML --
doesn't interpret \usepackage.  Dave depended on that in his
(relatively nifty) ivoa-cite trick, and so that hasn't worked when
generating HTML.

I've fixed the problem (yay! it's been far too long since I last used
TeX's expandafter!), but I've found that xspace (that guesses when
space-gobbling after control sequences is inconvenient) is a bit too
hard for TTH.  So, that's gone for now, and I've fixed the
(relatively few) places in the spec at which that mattered.  I'll still
try to make ivoatex xspace-ready, as I'd not veto having something
like ivoa-cite in ivoatex (if someone does the care, feeding, and
documentation for it).

Anyway: it's volute rev. 4708.

> Appendix A:
>     The <trig_function> production includes the COT function, which
>     is not mentioned in sec 2.3.
> 
>     CAST and IN_UNIT appear in the BNF only as reserved words,
>     no clue to the syntax - are they supposed to be there, or
>     are they omitted because they're optional?

I seem to remember an agreement that the BNF would include all
optional parts, and while I'm still unhappy about having anything
optional in the grammar in the first place, I think that's the
minimum we need to protect our users' sanities.

Here's the grammar I'm using for IN_UNIT:

inUnitFunction ::= "IN_UNIT" 
  '(' numericValueExpression
  ',' characterStringLiteral ')'

and the inUnitFunction literal is in numericValueFunction like this:

numericValueFunction ::= trigFunction
                       | mathFunction
                       | miscFunction
                       | inUnitFunction
                       | userDefinedFunction
                       | numericGeometryFunction

[poke me to make me put it in like this]
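
For readers who want to see what that rule accepts, here is a rough
Python recognizer for the inUnitFunction shape.  It is a sketch only
and a heavy simplification: the numeric value expression is narrowed
to a bare column reference or an unsigned numeric literal, and
parse_in_unit as well as the column name pmra in the usage lines are
hypothetical.

```python
import re

# IN_UNIT "(" numeric-expression "," character-string-literal ")"
# Assumption: the numeric expression is only a column reference or an
# unsigned numeric literal; a real parser would accept full expressions.
_IN_UNIT = re.compile(
    r"""^\s*IN_UNIT\s*\(\s*
        (?P<expr>[A-Za-z_][A-Za-z0-9_.]*|\d+(?:\.\d*)?)\s*,\s*
        '(?P<unit>(?:[^']|'')*)'\s*\)\s*$""",
    re.IGNORECASE | re.VERBOSE,
)


def parse_in_unit(text):
    """Return (expression, unit) for an IN_UNIT call, or None if no match.

    Doubled single quotes inside the string literal are unescaped.
    """
    match = _IN_UNIT.match(text)
    if match is None:
        return None
    return match.group("expr"), match.group("unit").replace("''", "'")


if __name__ == "__main__":
    print(parse_in_unit("IN_UNIT(pmra, 'deg/yr')"))
    print(parse_in_unit("IN_UNIT(pmra, deg)"))  # unit is not a string literal
```

The second argument has to be a quoted string literal, so an unquoted
unit is rejected.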

But since you mention it: ceterum censeo optional parts of the
grammar should be marked in the BNF in some way.

             -- Markus

[1] See, for instance,

http://dc.g-vo.org/tap/sync?QUERY=SELECT+TOP+1+*+FROM+ppmxl.main&LANG=ADQL

-- the only official thing in the STC metadata department in there is
COOSYS (and the references to it from the various FIELDs including PM
and such); there's still STC-in-VOTable stuff in there, which
structurally shouldn't be too far from what a future STC2+mapping STC
annotation might look like (although the details will be dramatically
different).  But I've essentially given up on this particular
representation, so don't bother doing anything with it any more.
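
A URL of the kind shown in this footnote can be assembled with the
Python standard library.  The sketch below only builds the URL and
does not contact the service; sync_query_url is a hypothetical helper
name, and it sends just the two parameters that appear in the URL
above.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def sync_query_url(base_url, adql):
    """Build a synchronous query URL with the QUERY and LANG parameters."""
    params = {"QUERY": adql, "LANG": "ADQL"}
    return base_url.rstrip("/") + "/sync?" + urlencode(params)


URL = sync_query_url("http://dc.g-vo.org/tap",
                     "SELECT TOP 1 * FROM ppmxl.main")

if __name__ == "__main__":
    print(URL)
    # Round trip: decode the query string back into its parameters.
    print(parse_qs(urlsplit(URL).query))
```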