ADQL polymorphic functions

Mark Taylor m.b.taylor at bristol.ac.uk
Thu Apr 22 18:44:59 CEST 2021


Markus,

I'm going to start by missing the point a bit:

On Thu, 22 Apr 2021, Markus Demleitner wrote:

> In the run-up to VODataService's RegTAP mapping (which will,
> according to my plan, feature spectral coverage in J), and also to
> make obscore more easily usable across the spectrum, I've implemented
> a user-defined function for converting between the various spectral
> units (cf. http://dc.g-vo.org/tap/capabilities, look for
> gavo_specconv).  
> 
> For instance, you can say
> 
>   SELECT access_url, gavo_specconv(em_min, 'keV') as en_min
>   FROM ivoa.obscore
>   WHERE 1=CONTAINS(s_region, CIRCLE(30, 20, 5))
>     AND gavo_specconv(1, 'keV', 'm') BETWEEN em_min and em_max
> 
> (this will run on http://dc.g-vo.org/tap). 
> 
> I'll make a sales pitch for this some other day, as I think this is
> an excellent candidate for the UDF catalogue.
> 
> This time, however, my problem is of a different nature: As you can
> see in the example, the UDF has two forms: a two-parameter one for
> when the machinery can figure out the unit of the argument itself,
> and a three-argument one when the user has to (or wants to) provide a
> unit literal.

A function with an optional argument *in between* two mandatory arguments
looks ... quite surprising to me.  It's a sufficiently strange/confusing
pattern that as a user I think I'd appreciate having two different
named UDFs for this rather than one name overloaded with two
different signatures.  I admit that I don't have great ideas
for non-stupidly-verbose names in this case, but maybe it would just
be better to have a single 3-parameter version and allow an empty
string for the middle parameter.
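
To make the single-signature idea concrete, here is a rough Python sketch. It is not how gavo_specconv is implemented: the name specconv, the inferred_unit argument (standing in for whatever the server would derive from column metadata) and the restriction to m and keV are all assumptions made up for illustration.

```python
# Hypothetical stand-in for a single three-parameter UDF; NOT the real
# gavo_specconv.  Only wavelength in m and energy in keV are handled.

# h*c in keV*m (approximate value, for illustration only)
HC_KEV_M = 1.23984198e-9


def specconv(value, expr_unit, dest_unit, inferred_unit="m"):
    """Convert value to dest_unit; an empty expr_unit means 'work it out'.

    inferred_unit is an assumption standing in for the unit a server
    would derive from the metadata of the expression.
    """
    src_unit = expr_unit or inferred_unit  # empty string falls back
    if src_unit == dest_unit:
        return value
    if {src_unit, dest_unit} == {"m", "keV"}:
        # E = h*c / lambda, and the same formula inverts it
        return HC_KEV_M / value
    raise ValueError(f"cannot convert {src_unit!r} to {dest_unit!r}")


explicit = specconv(1, "keV", "m")          # unit given by the caller
implicit = specconv(HC_KEV_M, "", "keV")    # unit left empty
```

The caller always passes three arguments, so there is only one signature to document; the cost is that the empty string becomes a magic value.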

> Current TAPRegExt doesn't really deal with this.
> 
> Of course, one could just define two UDFs, like this
> 
>   <feature>
>     <form>gavo_specconv(expr DOUBLE PRECISION, dest_unit TEXT) 
>         -&gt; DOUBLE PRECISION</form>
>       <description>returns the spectral value expr converted to dest_unit.
>         The unit of expr is derived from expr's components...
>       </description>
>   </feature>
> 
>   <feature>
>     <form>gavo_specconv(expr DOUBLE PRECISION, expr_unit TEXT, dest_unit TEXT) 
>         -&gt; DOUBLE PRECISION</form>
>       <description>returns the spectral value expr converted to dest_unit.
>         expr is assumed to be given in expr_unit...
>       </description>
>   </feature>
> 
> This has the advantage that it doesn't require any standards work.
> It has the disadvantage that it increases the number of UDFs users
> have to scan (in TOPCAT's service tab, say), and this might become
> really ugly when functions are highly polymorphous (think: three
> optional parameters each accepting three types would explode into
> nine feature elements with, presumably, almost identical content).

Is that 9-way overload really something you can imagine?
Given that the type system is not specified for UDF syntax
(so you could just use e.g. NUMBER as a type) I can't think of
functions that *should* exist in very many overloaded forms
(though I'm not saying it's beyond human ingenuity to construct some).
I'd be interested to see realistic examples.

Given those comments, the above option of listing the UDF multiple
times in multiple feature elements doesn't sound like it would be
too burdensome in practice.

But if there really is a problem here that needs solving:

> An interesting compromise would be to allow multiple form elements;
> this would retain the ease of parsing with not having to repeat
> possibly verbose descriptions, like this:
> 
>   <feature>
>     <form>gavo_specconv(expr DOUBLE PRECISION, dest_unit TEXT) 
>         -&gt; DOUBLE PRECISION</form>
>     <form>gavo_specconv(expr DOUBLE PRECISION, expr_unit TEXT, dest_unit TEXT) 
>         -&gt; DOUBLE PRECISION</form>
>       <description>returns the spectral value expr converted to dest_unit.
>         expr is assumed to be given in expr_unit...
>       </description>
>   </feature>
> 
> -- this would be my favourite, except it needs schema changes in
> TAPRegExt, and I'd not bet on how well existing capabilities parsers
> would cope with this.  Conversely, I think some of the other features
> could profit from allowing multiple form-s, too.  Hm.

This one seems quite reasonable.  You can also imagine listing multiple
UDFs with different names alongside each other in this way if it
was semantically convenient to combine their documentation together.
But, yes, it's a standards change.
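
On the question of how parsers would cope: a client that collects all form children rather than assuming exactly one would handle both the current and the proposed layout. A minimal Python sketch, assuming unqualified feature/form/description element names and using a made-up inline fragment (the multi-form feature is the proposal, not something current TAPRegExt allows; read_udfs is a hypothetical helper name):

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: one feature carrying two form elements.
# This illustrates the proposed change, not the current schema.
FRAGMENT = """
<languageFeatures type="ivo://ivoa.net/std/TAPRegExt#features-udf">
  <feature>
    <form>gavo_specconv(expr DOUBLE PRECISION, dest_unit TEXT)
        -&gt; DOUBLE PRECISION</form>
    <form>gavo_specconv(expr DOUBLE PRECISION, expr_unit TEXT, dest_unit TEXT)
        -&gt; DOUBLE PRECISION</form>
    <description>returns the spectral value expr converted to dest_unit.
    </description>
  </feature>
</languageFeatures>
"""


def read_udfs(xml_text):
    """Return one (forms, description) pair per feature element."""
    root = ET.fromstring(xml_text)
    udfs = []
    for feature in root.iter("feature"):
        # findall() works for a single form (current schema) or several
        forms = [" ".join(form.text.split())
                 for form in feature.findall("form") if form.text]
        description = " ".join(
            (feature.findtext("description") or "").split())
        udfs.append((forms, description))
    return udfs


udfs = read_udfs(FRAGMENT)
for forms, description in udfs:
    for form in forms:
        print(form)
    print("   ", description)
```

A parser written against the current schema that picks only the first form would silently drop the three-argument signature here, which is presumably the failure mode to watch for.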

Mark

--
Mark Taylor  Astronomical Programmer  Physics, Bristol University, UK
m.b.taylor at bristol.ac.uk          http://www.star.bristol.ac.uk/~mbt/

