making TAP /async optional.

Markus Demleitner msdemlei at ari.uni-heidelberg.de
Mon May 19 23:42:18 PDT 2014


Hi,

On Mon, May 19, 2014 at 10:14:03PM +0000, Paul Harrison wrote:
> 
> On 2014-05-19, at 17:35, Laurent MICHEL <laurent.michel at astro.unistra.fr> wrote:
> > The /async mode is justified by the necessity of enabling TAP
> > servers to process queries consuming a lot of time or resources.
> > The problem is that the decision of using one mode or another is
> > taken on the client/user side. That presupposes that the client
> > or the user has an idea of the resources taken by the query, or
> > is at least aware of the issue.
> > Considering that even query engines hardly manage to predict the
> > order of magnitude of the query processing duration, I believe
> > that giving the choice to users is not really helpful.
>
> I think that you are correct that it is not easy for the client to
> make the choice between using the synchronous and asynchronous
> modes. The UWS specification has a pattern for implementing a
> synchronous service on top of an asynchronous basis, which is
> similar to your /autosync suggestion:
> http://www.ivoa.net/documents/UWS/20101010/REC-UWS-1.0-20101010.html#SynchronousService

The trouble with autosync -- or indeed sync redirecting to an async
job resource -- is that in practice, the choice isn't so much about
expected runtime as about resource limits (row limits, run time
limits, etc.).

Most importantly, in several implementations, async goes through a
queue, and sync does not.  

Having sync start an async job at least makes things complicated; the
job would probably have to be created in SUSPENDED (or so) to keep it
from jumping the queue (ahem, now that I think of it: how do you
suspend a database query in, say, postgres?).  But then it can't be
manipulated (raising executionDuration and similar)... It's
complications like this that make me doubt whether this can be made
to work without hacks.

So, let me play defender of the status quo here -- I believe the
current situation isn't so bad (and maybe should be communicated more
clearly to our users and implementors):

* Use sync if you want immediate results where possible (but you'll
  get an error instead of a result if your results aren't basically
  immediate).
* Use async if you want almost guaranteed results (but you may have
  to wait a fairly long time for them).
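
From the client side, the two rules above might translate into
something like the following Python sketch.  It is only an
illustration under assumptions: run_sync and run_async are
hypothetical callables standing in for whatever code actually talks to
the /sync and /async endpoints, and SyncFailed is a made-up exception
meaning "the service returned an error instead of a result" (for
instance because a sync limit was hit).  Falling back to async after a
sync error is also an assumption of this sketch, not something TAP
prescribes.

```python
from typing import Any, Callable, Tuple


class SyncFailed(Exception):
    """Hypothetical: the /sync endpoint answered with an error, not a result."""


def run_query(
    adql: str,
    run_sync: Callable[[str], Any],
    run_async: Callable[[str], Any],
    want_immediate: bool = True,
) -> Tuple[str, Any]:
    """Return (mode_used, result) following the sync/async rule of thumb."""
    if want_immediate:
        try:
            # Immediate result if possible ...
            return "sync", run_sync(adql)
        except SyncFailed:
            # ... otherwise fall through to the queued, slower path
            # (assumed client policy, see the note above).
            pass
    # Almost guaranteed result, but the caller may wait a long time.
    return "async", run_async(adql)
```

A caller who knows up front that the query is heavy would pass
want_immediate=False and go to the queue directly; everybody else
tries sync first and only pays the waiting time when sync refuses.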


Cheers,

        Markus


