[SIAv2] upload

Patrick Dowler patrick.dowler at nrc-cnrc.gc.ca
Wed Jan 22 11:42:42 PST 2014


current draft: http://www.ivoa.net/documents/SIA/20131115/

This email is about the content of Section 2.17 of the above WD.


One of the early CSP initiatives was for all services to support the 
upload of lists of values to be used in queries; these are typically 
lists of coordinates, but are not restricted to that. I feel this amounts 
to a mandate/requirement to include upload in the SIA-2.0 query capability.


** Background:

Note: the < > characters below are not literally included in any 
strings; they just denote variable content, as I don't want to use 
specific examples.

DALI says how to upload a resource for use in a job:

        UPLOAD=<name>,<uri>

where other parameters would use the <name> to refer to the table and 
<uri> gives the location of the table data (either inline or some URL).

In TAP, the other parameter is QUERY and it refers to the table as

       TAP_UPLOAD.<name>

and the columns in the normal way that the query language refers to 
columns. TAP specifically says the table is VOTable and the name 
attribute of the FIELD is the column name.
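
To make the TAP case concrete, here is a minimal sketch of how a client 
might put the two parameters together. The table name "mytable", the URL 
and the column names are made up for illustration; only the 
UPLOAD=<name>,<uri> form and the TAP_UPLOAD.<name> reference come from 
the description above.

```python
from urllib.parse import urlencode, parse_qs


def tap_upload_params(name, uri, columns):
    """Build TAP parameters that upload a table and select columns from it."""
    # The query refers to the uploaded table as TAP_UPLOAD.<name>.
    query = "SELECT {} FROM TAP_UPLOAD.{}".format(", ".join(columns), name)
    return {
        "REQUEST": "doQuery",
        "LANG": "ADQL",
        "QUERY": query,
        # DALI form: UPLOAD=<name>,<uri>
        "UPLOAD": "{},{}".format(name, uri),
    }


# Hypothetical table name, URL and column names.
params = tap_upload_params(
    "mytable", "http://example.org/mytable.xml", ["ra_deg", "dec_deg"]
)
encoded = urlencode(params)

# Round-trip the encoded string to show what the service would see.
decoded = parse_qs(encoded)
print(decoded["UPLOAD"][0])
print(decoded["QUERY"][0])
```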

SIAv2 query parameters are multi-valued, so you can already pass in 
multiple positions or energy ranges and have them all used in the query. 
The upload feature is for cases when a table of values already exists, 
quite likely as the result of a previous query and (as you can do with 
TAP async) possibly one that the client has not even downloaded. So the 
use cases where upload is special are usually larger scale and/or 
orchestrating multiple services.
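
For comparison, passing multiple values without upload is just a matter 
of repeating the parameter. A small sketch, assuming the 
"CIRCLE <ra> <dec> <radius>" value form used later in this message (the 
numbers are made up):

```python
from urllib.parse import urlencode, parse_qsl


def multi_valued_query(name, values):
    """Encode one parameter repeated once per value, preserving order."""
    return urlencode([(name, value) for value in values])


# Hypothetical positions; the value syntax is assumed, not taken from the WD.
positions = ["CIRCLE 12.0 34.0 0.5", "CIRCLE 56.0 -7.0 0.5"]
query_string = multi_valued_query("POS", positions)
print(query_string)
```

This is fine for a handful of values, but it means the client has to 
hold every value itself, which is exactly what upload avoids.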


In the SIAv2 {query} resource, one can upload a table as above, but now 
we need to refer to the table (and maybe the columns) in the query 
parameters, e.g. POS, BAND, TIME, POL.

** Some possibilities pulled out of thin air:

Magically get positional constraints out of the table:
      POS=<name>
- would work in simple contrived cases
- table might not contain any identifiable position data
- table might contain multiple/ambiguous position data

Get specific constraints out of a column:
      POS=<name>.<field.name>
- would resolve the ambiguity, but assumes a single field contains the 
complete position data (no multiple columns with RA and DEC, for example)

Get specific constraints out of multiple columns:
      POS=CIRCLE <name>.<ra column> <name>.<dec column> <name>.<radius column>
- <name>.<radius column> could be replaced by a numeric constant
- typical tables could not feasibly be used with any other geometric 
value types, but circle covers > 95% of the use cases anyway
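
To show what the multi-column form would ask of a service, here is a 
rough sketch that expands such a POS value into one circle per table 
row. It assumes the uploaded table has already been read into a list of 
rows keyed by column name, and it uses the "anything that is not a 
number is a reference" rule. The function names and the table content 
are hypothetical.

```python
def is_number(token):
    """True if the token parses as a numeric constant."""
    try:
        float(token)
        return True
    except ValueError:
        return False


def split_ref(token):
    """Split a <name>.<column> reference into (table, column)."""
    table, sep, column = token.partition(".")
    if not sep or not table or not column:
        raise ValueError("not a <name>.<column> reference: " + token)
    return table, column


def expand_circle(pos, tables):
    """Expand 'CIRCLE a b c' into a list of (ra, dec, radius) tuples."""
    shape, *tokens = pos.split()
    if shape != "CIRCLE" or len(tokens) != 3:
        raise ValueError("expected CIRCLE with three arguments: " + pos)
    refs = [split_ref(t) for t in tokens if not is_number(t)]
    if not refs:
        # Plain numeric circle: nothing to look up.
        return [tuple(float(t) for t in tokens)]
    names = {table for table, _ in refs}
    if len(names) != 1:
        raise ValueError("all references must use the same table")
    rows = tables[names.pop()]
    circles = []
    for row in rows:
        # Constants are reused for every row; references are read per row.
        circles.append(tuple(
            float(t) if is_number(t) else float(row[split_ref(t)[1]])
            for t in tokens
        ))
    return circles


# Hypothetical uploaded table with two rows.
tables = {"mytable": [{"ra": 10.0, "dec": 20.0}, {"ra": 30.0, "dec": -40.0}]}
circles = expand_circle("CIRCLE mytable.ra mytable.dec 0.5", tables)
print(circles)
```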

Would <name> and <field.name> have to be marked up so we know they are 
symbols or is just the fact that they are not numbers good enough? We 
are essentially using variable names here so is it shell-script-like or 
programming-language-like? Obviously, if a parameter were to accept 
input strings we would have to differentiate between a value and a 
reference to a table/column, so that argues for symbol markup.
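
A small sketch of the shell-script-like option, purely to illustrate the 
ambiguity. The "$" marker is hypothetical and not proposed syntax; the 
point is only that a string value such as "m31.fits" looks just like a 
<name>.<field.name> reference unless references are marked.

```python
def classify(value, marker="$"):
    """Classify a parameter value as a literal or a table/column reference."""
    if not value.startswith(marker):
        # Without the marker, the value is always taken literally.
        return ("literal", value)
    table, sep, column = value[len(marker):].partition(".")
    if not table or not sep or not column:
        raise ValueError("malformed reference: " + value)
    return ("reference", table, column)


# Hypothetical values: a plain string and a marked-up column reference.
print(classify("m31.fits"))
print(classify("$mytable.target"))
```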

Although TAP ends up defining the use of the symbolic table name in a 
way that requires constructing and parsing strings, in that case it is a 
normal part of using the available query language in the normal way. I 
chose the same sort of thing above (sans the TAP_UPLOAD schema) but I'm 
not so convinced myself that it isn't an unnatural mess.


Thoughts? Better ideas? Worse ideas? :-)


  --

Patrick Dowler
Canadian Astronomy Data Centre
National Research Council Canada
5071 West Saanich Road
Victoria, BC V9A 2L9

250-363-0044 (office) 250-363-0045 (fax)

