Asynchronous querying and tabular data

Wed May 2 12:51:18 PDT 2007

Good - sounds to me like we are most of the way there.

The async query operation both executes a query and stages the data,
so I think either name would work, and stageData is more consistent
with the other interfaces.  But the operation could have a different
name for TAP if we feel this is important.  The more important thing
is that it does essentially the same thing as the other versions;
probably almost everything could be common except for the table-specific
content of the "stageData" request.

Based on these discussions, what we have at this point for a
simple-as-possible interface is something like:

     o	queryData.  Synchronous data queries against a single
 	table or tableSet.  Could also be used to query metadata
 	if we wish, by querying SCHEMA.tables and SCHEMA.columns
 	(following the information schema concept, but omitting most
 	of it and putting our own custom metadata in these tables).
 	Alternatively, separate methods could be used for metadata
 	queries. In either case, a TAP metadata query could be used
 	to generate table metadata to support registry queries.

     o	getData.  Just an access reference (URL currently) as
 	elsewhere; so far in TAP this is only needed to retrieve data
 	from an async query.

     o	stageData (or queryDataAsync etc.).  Executes a query
 	asynchronously, staging the output table in a local or remote
 	VOSpace.  Standard UWS-like mechanisms (polling, messaging) can
 	be used to monitor the progress of a job once execution begins.
 	When the job completes, an acref can be returned to the client,
 	or the table can be delivered directly to the client's VOSpace.

     o	getCapabilities - standard
     o	getAvailability - standard
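
To make the list above concrete, here is a rough Python sketch of how the
three data operations could fit together.  It is only a toy, in-memory
stand-in, not a proposal for the actual interface: the class name, the
method names, the "staged://" acref scheme, and the EXECUTING / COMPLETED /
ERROR phase names are all assumptions made up for illustration, and a
predicate function stands in for a real query string.

```python
import threading
import uuid


class TapServiceSketch:
    """Toy in-memory stand-in for the operations listed above.

    Every name here is hypothetical; nothing is taken from a TAP draft.
    """

    def __init__(self, tables):
        self._tables = tables   # table name -> list of row dicts
        self._jobs = {}         # job id -> {"phase", "acref", "done"}
        self._staged = {}       # acref -> staged result rows
        self._lock = threading.Lock()

    def query_data(self, table, predicate=lambda row: True):
        """queryData: run the query synchronously and return the rows."""
        return [row for row in self._tables[table] if predicate(row)]

    def stage_data(self, table, predicate=lambda row: True):
        """stageData: start the query in the background, return a job id."""
        job_id = uuid.uuid4().hex
        job = {"phase": "EXECUTING", "acref": None, "done": threading.Event()}
        with self._lock:
            self._jobs[job_id] = job
        threading.Thread(target=self._run, args=(job_id, table, predicate)).start()
        return job_id

    def _run(self, job_id, table, predicate):
        """Worker: execute the query and stage its output under an acref."""
        job = self._jobs[job_id]
        try:
            rows = self.query_data(table, predicate)
            acref = "staged://" + job_id   # made-up access reference scheme
            with self._lock:
                self._staged[acref] = rows
                job["acref"] = acref
                job["phase"] = "COMPLETED"
        except Exception:
            with self._lock:
                job["phase"] = "ERROR"
        finally:
            job["done"].set()

    def job_phase(self, job_id):
        """Polling hook: report the current phase of a staged job."""
        with self._lock:
            return self._jobs[job_id]["phase"]

    def wait(self, job_id, timeout=None):
        """Block until the job ends; return its acref (None on error/timeout)."""
        self._jobs[job_id]["done"].wait(timeout)
        with self._lock:
            return self._jobs[job_id]["acref"]

    def get_data(self, acref):
        """getData: resolve an access reference to the staged table."""
        with self._lock:
            return self._staged[acref]


# Usage: the same query run synchronously, then as a staged job.
service = TapServiceSketch({
    "SCHEMA.tables": [{"table_name": "stars"}],
    "stars": [{"id": 1, "mag": 4.2}, {"id": 2, "mag": 9.7}],
})
bright = service.query_data("stars", lambda row: row["mag"] < 5.0)
job_id = service.stage_data("stars", lambda row: row["mag"] < 5.0)
acref = service.wait(job_id, timeout=5.0)
staged = service.get_data(acref)
```

Note that the metadata case needs no extra method in this sketch: calling
query_data on "SCHEMA.tables" is just another table query, which is the
first of the two metadata options mentioned under queryData.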

I think there is still some work to be done (beyond what VOResource
defines) to define tableset and table/column metadata to support
complex queries against large tables; however, I recognize that others
don't necessarily agree with this.

Does this sound like it would work?

 	- Doug

On Wed, 2 May 2007, Patrick Dowler wrote:

> On Wednesday 02 May 2007 11:07, Doug Tody wrote:
>> A single service could support both: queryData for synchronous DM and
> >> ADQL-based queries, and optionally stageData for async/staged execution.
>> The client would then either have to guess which to use, or try a
>> few smaller synchronous queries first to determine what to do, and
>> then resubmit a larger query as a batch job.
>
> I said earlier that I think this is what we need (single step sync and async
> querying methods). For what it's worth, I think they both need the word query
> in the name, for the simple reason that this is what people new to the API
> will look for. For example, when you first learn the JDBC API, you poke
> around aimlessly until you find the Statement interface and the method:
>
> ResultSet executeQuery(String sql)
>
> That is where you start. Then you learn about Connection and DriverManager
> (maybe DataSource) and ResultSet... but you grok it when you see that one
> method signature.
>
> ** I think it is really important for TAP to have this kind of very clear
> focal point.  **
>
>
>
> PS-Sure, JDBC is a nightmare of bad design otherwise, but once someone finds
> that method signature they can proceed from there and get something working
> quite quickly.
>
>