[SIAv2] upload

Markus Demleitner msdemlei at ari.uni-heidelberg.de
Tue Mar 11 05:02:43 PDT 2014


Dear DAL,

Sorry for being stubborn here, but I maintain this whole issue is
crucial for robust services with client-discoverable interfaces.

The TL;DR is again: 

(1) Our service parameters should follow VOTable's type model
(2) Clients must be able to judge what literals are valid inputs to a
parameter
(3) Clients must be able to help the user against the horror vacui of
empty form fields (i.e., state ranges or enumeration values for each
server)

If you don't agree with any of this, shout now, since my arguments of
course break down without these premises.

On Mon, Mar 10, 2014 at 03:08:20PM -0700, Patrick Dowler wrote:
> 
> 
> On 10/03/14 12:20 AM, Markus Demleitner wrote:
> >POS=CIRCLE $foo.RA $foo.DEC $foo.rad
> >
> >Ouch.  It's this kind of thing that makes me lobby against complex
> >service parameters.  Please, let's keep them atomic.  For this kind
> >of thing, it's perfectly ok, simpler, and equivalent in expressive
> >power to have
> >
> >RA_REF=ra DEC_REF=dec SR=0.01
> 
> I did clearly say there are two approaches to using "symbols" and I
> opted for the easier to handle shell-script-like $symbol. What you
> give here is an arbitrary string that the service has to figure out
> to be the name of a column (somewhere) and not a value... that is a
> much harder lexical analysis problem. So: $ sign or lexical analysis?

No -- it's about a separate parameter of type "reference to column".
I probably should have written RA_REF=ra&DEC_REF=dec&SR=0.01 instead.
In C, you'd have

double RA;
double *RA_REF;

-- and it's a good thing that, given C's processing model, a compiler
will complain if you try

RA = RA_REF

and vice versa.

In my proposal, a service can declare:

<PARAM name="RA" datatype="double">
  <VALUES><MIN value="300"/></VALUES>
</PARAM>

-- which IMHO solves points (1)-(3) above.

When we suddenly allow references in there, the entire declaration
would be wrong -- it'd have to be datatype="char" arraysize="*", and
all the nice and useful metadata would have to go.
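
To illustrate what a client gains from such a declaration, here is a
minimal Python sketch (not part of any standard or existing library)
that checks a user-supplied literal against it.  The function name
is_valid_literal is made up for this example, it only handles
datatype="double" (the VOTable name for double-precision floats), and
it assumes the PARAM element comes without an XML namespace.

```python
import xml.etree.ElementTree as ET

# Declaration as a service might publish it (namespace omitted: assumption).
RA_DECLARATION = """
<PARAM name="RA" datatype="double">
  <VALUES><MIN value="300"/></VALUES>
</PARAM>
"""


def is_valid_literal(param_xml, literal):
    """Return True if `literal` is acceptable for the declared parameter.

    Hypothetical helper: only datatype="double" with optional MIN/MAX
    children of VALUES is handled.
    """
    param = ET.fromstring(param_xml)
    if param.get("datatype") != "double":
        raise ValueError("this sketch only handles datatype='double'")

    # Step 1: the literal must parse according to the declared type.
    try:
        value = float(literal)
    except ValueError:
        return False

    # Step 2: the value must lie within the declared range, if any.
    values = param.find("VALUES")
    if values is not None:
        minimum = values.find("MIN")
        maximum = values.find("MAX")
        if minimum is not None and value < float(minimum.get("value")):
            return False
        if maximum is not None and value > float(maximum.get("value")):
            return False
    return True
```

With this declaration, "312.5" passes, "12" is rejected as out of
range, and "ra" or "$foo.RA" are rejected because they are not
numbers.  Once references share the parameter with values, a client
can no longer make that call.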

> Someone might be tempted to say "well, ra is a string and that param
> expects numbers, so clearly it is a symbol"... but not all parameters
> take numbers (POL) and values for that parameter could also be in the
> uploaded table.

Well, for POL good metadata is even more important, as it is an
enumerated value:

<PARAM name="POL" datatype="char" arraysize="*">
  <VALUES>
    <OPTION value="I"/>
    <OPTION value="Q"/>
  </VALUES>
</PARAM>

Even if in this case you could claim that the legal strings here are
fixed by the standard, it's still highly useful to state which of the
14 obscore-defined strings will yield results in a given service (and
my crystal ball clearly shows service operators ending up stuffing
more values in there anyway).

> Unambiguous: POL=$blah

...but inconsistent with the metadata above, which I'd really like to
have.

Unambiguous:

<PARAM name="POL_REF" datatype="char" arraysize="*"/>

(no VALUES, as these come from the uploaded table unknown to the
service).
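
As a rough sketch of how a client could use these two declarations
to fill its form, the following Python reads the OPTION children and
offers a choice list where there are any, falling back to free text
otherwise.  The helper name enumerated_options is hypothetical, and
the PARAM elements are again assumed to come without an XML
namespace.

```python
import xml.etree.ElementTree as ET

POL_DECLARATION = """
<PARAM name="POL" datatype="char" arraysize="*">
  <VALUES>
    <OPTION value="I"/>
    <OPTION value="Q"/>
  </VALUES>
</PARAM>
"""

POL_REF_DECLARATION = '<PARAM name="POL_REF" datatype="char" arraysize="*"/>'


def enumerated_options(param_xml):
    """Return the declared OPTION values, or None if nothing is enumerated.

    Hypothetical helper: a client could render a list as a drop-down
    and None as a free-text field.
    """
    param = ET.fromstring(param_xml)
    options = [option.get("value") for option in param.iter("OPTION")]
    return options or None
```

For POL this yields ["I", "Q"], so the user sees exactly what this
service can deliver; for POL_REF it yields None, and the client would
accept any column name from the uploaded table.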

> As for specifying the table in $foo.blah, the reason is that while we
> might only think about or want a single uploaded resource now, in
> future we might want more so we *must* say which uploaded resource we
> mean.

If this turns out to be necessary, it's a new feature and should have
a new parameter, rather than have some syntax that implementations
will fairly certainly get wrong as long as it's not used, which will
then blow up once it is.

<soapbox>
There's nothing wrong with having many parameters in a complex
interface if it helps (1)..(3).  For the service implementor, the
features need implementing anyway, whether input values are parsed
out of alphabet soup or come pre-parsed from the HTTP library.  For
the client implementor, it's actually much easier not to have to come
up with the alphabet soup in the first place and just pass values
from the UI through to the HTTP library.
</soapbox>
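
A minimal Python sketch of that pass-through, using only the standard
library's urllib.parse: the client hands its form values to
urlencode, and the server side gets them back pre-split from
parse_qs.  The function name build_query and the form contents are
assumptions made for this example.

```python
from urllib.parse import parse_qs, urlencode


def build_query(form_values):
    """Encode non-empty form fields as a query string.

    Hypothetical client helper: no service-specific syntax is needed,
    each UI field maps to exactly one atomic parameter.
    """
    filled = {name: value for name, value in form_values.items() if value != ""}
    return urlencode(filled)


# Client side: values go straight from the UI to the HTTP library.
query = build_query({"RA_REF": "ra", "DEC_REF": "dec", "SR": "0.01", "POL": ""})

# Server side: the HTTP library delivers the parameters already separated.
received = parse_qs(query)
```

Neither side has to write or parse anything beyond what the HTTP
library already does.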


> >While I'm all for dynamic typing, it only works if you can reliably
> >serialize typed values (e.g., "s/34.1" is a string, "d/34.1" is a
> >float).  In DAL parameters, we can't, so static typing is what we
> >have to do if we want typing at all, and I, for one, want typing.
> 
> Why not use the same parameter for a constant value or a reference?
> We all do that every day in code:
> 
> value: POL=Q
> 
> reference: POL=$p

In dynamically typed languages, you can do something like this; as I
said, their *values* show their types.  VOTable and everything in the
VO doesn't have that.

In such statically typed circumstances, I'm fairly sure you're not
doing anything like

double a, b;
a = 5;
b = &a;

Neither should we in our service interfaces.

> atomic" but it really just means that we skipped having to do a
> string split on whitespace at the expense of outlawing any other
> shape. Really? We're scared of splitting a string on whitespace? By
> skipping it, we also introduce more param names, several extra error

No, I'm scared of having to describe to a client what kinds of things
are allowed between the spaces, how many of them are there, and so
on.

> Then it gets worse: define a new set of params for a range. Then
> another set of params for a polygon. Oh, and we can't use the
> s_region column in ObsCore* in any queries either, so we can never
> find data that overlaps other data... sigh.

All these are separate features.  Requiring extra parameters to
support them gives clients a fighting chance to figure out if a given
server supports that feature or not.  That is an extremely good
thing, as it alleviates many problems we currently try to handle
using versioning, with only limited success.
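
As a sketch of what that fighting chance could look like in a client,
the following Python maps features to the parameters they need and
checks them against the names a service declares.  The feature labels
and the FEATURES table are invented for illustration; nothing like
this is defined by any standard.

```python
# Hypothetical mapping from a feature to the parameters it requires.
FEATURES = {
    "cone search with literal position": {"RA", "DEC", "SR"},
    "cone search against uploaded columns": {"RA_REF", "DEC_REF", "SR"},
}


def supported_features(declared_names):
    """Return the features whose required parameters are all declared."""
    declared = set(declared_names)
    return sorted(
        feature
        for feature, required in FEATURES.items()
        if required <= declared  # subset test: every required name is present
    )
```

A service declaring RA, DEC, SR and POL would then be recognised as
supporting literal positions only, without the client having to know
which protocol version introduced the upload feature.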


> * In the very near future, I would like to see ADQL conform to simple
> geometry values as well, which would fix a plethora of issues with

Yes!  Yes!  Yes!

> PS-I am not crazy about constructing a geometry value on-the-fly, but
> it is what falls out of solving all the other problems in the
> simplest way. The solution has to be "as simple as possible, but no
> simpler" :-)

Right.  But if you agree with (1)..(3) above and then start figuring
out what metadata for your compound, polymorphic parameters would look
like... shiver.

Again, apologies for being obnoxious, but I do think experience from
past protocols with alphabet soup parameters has shown we
shouldn't have started with them, and we certainly should stop now.


Thanks for your patience, all around,

          Markus


