[SIAv2] upload

Patrick Dowler patrick.dowler at nrc-cnrc.gc.ca
Mon Mar 10 15:08:20 PDT 2014



On 10/03/14 12:20 AM, Markus Demleitner wrote:

> So: No, let's not allow STC-S columns to specify constraints from
> uploaded tables, whatever else we do.

There is no STC-S in the query parameters, as agreed at the Hawaii 
interop. There are plain geometry values (circle, range, polygon) with 
no coordinate system or reference frames of any sort embedded in them. 
This is all in the current WD-SIA-2.0 from November 2013.

All that is proposed here is a way to refer to the columns of the 
uploaded table in the other query parameters. The mechanism we adopt has 
to work for all query parameters, not just spatial position.

>> So, with my CADC hat on, I would say that query params need to refer
>> to the columns in the uploaded table. We should limit SIA-2.0 to a
>> single UPLOAD of a VOTable for simplicity. I would favour a
>> shell-script-like syntax to identify columns, eg:
>>
>> UPLOAD=foo,http://example.com/mytable.xml
>>
>> POS=CIRCLE $foo.RA $foo.DEC $foo.rad
>
> Ouch.  It's this kind of thing that makes me lobby against complex
> service parameters.  Please, let's keep them atomic.  For this kind
> of thing, it's perfectly ok, simpler, and equivalent in expressive
> power to have
>
> RA_REF=ra DEC_REF=dec SR=0.01

I did clearly say there are two approaches to using "symbols" and I 
opted for the easier to handle shell-script-like $symbol. What you give 
here is an arbitrary string that the service has to figure out to be the 
name of a column (somewhere) and not a value... that is a much harder 
lexical analysis problem. So: $ sign or lexical analysis?

Someone might be tempted to say "well, ra is a string and that param 
expects numbers, so clearly it is a symbol"... but not all parameters 
take numbers (POL) and values for that parameter could also be in the 
uploaded table.

Unambiguous: POL=$blah

Ambiguous: POL=Q

Is that the value Q or the column named Q? What if it was

POL=X

Is that the value X (a mistake) or the column X? Which error message 
will you give the caller?
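
To make the point concrete, here is a minimal Python sketch of the $ 
approach. The function name classify_param and the exact reference 
pattern are my assumptions for illustration only (nothing in WD-SIA-2.0 
defines them), and the table qualifier is treated as optional here only 
because the examples in this thread use both $blah and $foo.blah.

```python
import re

# Assumed reference syntax: "$column" or "$table.column" (hypothetical).
_REF = re.compile(r"\$(?:([A-Za-z_]\w*)\.)?([A-Za-z_]\w*)")

def classify_param(value):
    """Classify a raw parameter string as a column reference or a literal.

    Returns ("ref", table_or_None, column) or ("value", value).
    """
    m = _REF.fullmatch(value)
    if m:
        return ("ref", m.group(1), m.group(2))
    if value.startswith("$"):
        # The sigil was used, so this can only be a broken reference.
        raise ValueError("malformed column reference: %r" % value)
    # No sigil: always a literal, to be validated by the parameter itself.
    return ("value", value)
```

With this, POL=X is always the literal X, so the service can report an 
invalid POL value without having to guess whether a column was meant.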

As for specifying the table in $foo.blah, the reason is that while we 
might only think about or want a single uploaded resource now, in future 
we might want more so we *must* say which uploaded resource we mean.

Again, ObsTAP use cases require this so users are already wanting to do 
this... not in the core SIAv2 use cases per se, but previous CSP use 
cases can't be thrown out.



> While I'm all for dynamic typing, it only works if you can reliably
> serialize typed values (e.g., "s/34.1" is a string, "d/34.1" is a
> float).  In DAL parameters, we can't, so static typing is what we
> have to do if we want typing at all, and I, for one, want typing.

Why not use the same parameter for a constant value or a reference? We 
all do that every day in code:

value: POL=Q

reference: POL=$p


> This means: parameters have one type.  RA is a float, and POL is an
> enumeration.  Neither type can be constructed from a string like
> "$foo.pol_states", and if we still require services to accept them,
> we must make them lie in the PARAM in the metadata response (or
> suddenly say RA is a string, and people have every right to delight
> us with 15h22m34.55s).  We all know from Asimov's Robot stories what
> happens when you make computers lie...
>
> Let's have POL_REF for that and keep POL clean.

User calls a web service with invalid input... news at 11!

Well, we can argue the merits of this mini-parser vs that mini-parser, 
but it exists no matter how you slice it. The concept of a 
region-of-interest is not a simple scalar atomic value. We can pretend 
that 3 separate params to define a circle is "simple and atomic" but it 
really just means that we skipped having to do a string split on 
whitespace at the expense of outlawing any other shape. Really? We're 
scared of splitting a string on whitespace? By skipping it, we also 
introduce more param names, several extra error messages or inconsistent 
behaviour when things are partially specified, etc.

Then it gets worse: define a new set of params for a range. Then another 
set of params for a polygon. Oh, and we can't use the s_region column in 
ObsCore* in any queries either, so we can never find data that overlaps 
other data... sigh.
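
For scale, the mini-parser in question is roughly the following Python 
sketch. The function name parse_pos and the argument counts per shape 
(3 for a circle, 4 for a range, an even number of at least 6 for a 
polygon) are assumptions on my part for illustration, not text from the 
WD. Tokens starting with $ are passed through as column references.

```python
# Assumed argument-count rules per shape (hypothetical, for illustration).
SHAPES = {
    "CIRCLE": lambda n: n == 3,
    "RANGE": lambda n: n == 4,
    "POLYGON": lambda n: n >= 6 and n % 2 == 0,
}

def parse_pos(pos):
    """Split a POS value on whitespace into (shape, list of arguments)."""
    tokens = pos.split()
    if not tokens:
        raise ValueError("empty POS value")
    shape, args = tokens[0].upper(), tokens[1:]
    if shape not in SHAPES:
        raise ValueError("unknown shape: " + shape)
    if not SHAPES[shape](len(args)):
        raise ValueError("wrong number of arguments for " + shape)
    # "$..." tokens stay as strings; everything else must parse as a number
    # (float() raises ValueError otherwise).
    return shape, [a if a.startswith("$") else float(a) for a in args]
```

One parameter, one error path, and adding a new shape is one more entry 
in the table rather than a new family of parameter names.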


* In the very near future, I would like to see ADQL conform to simple 
geometry values as well, which would fix a plethora of issues with 
ADQL+TAP and imply that ObsCore returns these same kinds of values. 
Until then, since the content of that column is not actually subject to 
any recommended specification, implementations probably have to be 
tolerant. That isn't hard.


PS: I am not crazy about constructing a geometry value on-the-fly, but it 
is what falls out of solving all the other problems in the simplest way. 
The solution has to be "as simple as possible, but no simpler" :-)

-- 

Patrick Dowler
Canadian Astronomy Data Centre
National Research Council Canada
5071 West Saanich Road
Victoria, BC V9E 2E7

250-363-0044 (office) 250-363-0045 (fax)

