new UWS 1.1 WD

Tue Oct 7 16:37:59 CEST 2014

On 2014-10-07, at 14:45, Mark Taylor <M.B.Taylor at bristol.ac.uk> wrote:

> Paul & grid,
> 
> On Tue, 30 Sep 2014, Paul Harrison wrote:
> 
>> I have uploaded a new version of the UWS 1.1 WD to http://www.ivoa.net/documents/UWS/20140930/ which contains changes which are largely as a result of the discussion that happened in this thread http://mail.ivoa.net/pipermail/grid/2014-June/002609.html
>> 
>> In summary the main change from the previous version is that the blocking behaviour introduced in that version has been moved from a specific custom endpoint to being signalled by a ?WAIT query parameter on the /{jobs}/{job-id} endpoint.
> 
> I still think there is potential for a race condition here.
> 
> Consider this sequence of events:
> 
>   1. Client requests status from server
>   2. Server returns status: it's QUEUED
>   3. Server changes status to EXECUTING
>   4. Client makes blocking call to server to find out when status changes
>   5.   ... wait ...
>   6. Server changes status to COMPLETED
>   7. Blocking call returns, client finds out that status is COMPLETED
> 
> As far as the client is concerned, the job transitions from QUEUED
> straight to COMPLETED, it never sees the (potentially long-lived)
> EXECUTING phase.  With the existing REST API there's nothing the
> client can do to reliably avoid this (well, it could asynchronously
> issue another non-blocking status request after the blocking one
> has started to check it hasn't changed, but that's (a) messy and
> (b) not bulletproof since you don't know when the blocking call
> actually starts blocking, i.e. how long to wait after the start
> of the blocking call before you do it).
> 
> Now this won't happen very often, it's only in the unlucky case that the
> server changes phase just after the client makes the status request.
> Also, the consequences are not disastrous, since the terminal phases
> (here COMPLETED) don't block.  Maybe for those reasons we don't
> care enough to do anything about it.  But, it's not Right.

OK - I agree that this is not Right if you want to monitor PHASEs before COMPLETED, and I had not really considered that use case. The WAIT idea does mitigate it to some extent: a client that wants to keep the user informed about intermediate stages would presumably issue WAIT=10 (or some other small number), and that call is guaranteed to return once the 10 seconds are up, whether or not there has been a PHASE change. However, this still does not guarantee the earliest possible notice of a PHASE change in the situation you describe.
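
For illustration, the short-WAIT polling described above might look roughly like this on the client side. This is only a sketch in Python: fetch_phase is a hypothetical callable standing in for a GET on /{jobs}/{job-id}?WAIT=n that returns the reported PHASE, and the set of terminal phases used here is an assumption.

```python
from typing import Callable, Iterator

# Assumed set of phases after which the job will not change again.
TERMINAL_PHASES = {"COMPLETED", "ERROR", "ABORTED"}


def watch_phases(fetch_phase: Callable[[int], str],
                 wait_seconds: int = 10) -> Iterator[str]:
    """Yield each newly observed PHASE until a terminal one is seen.

    fetch_phase(wait_seconds) is hypothetical: it represents one blocking
    request that returns when the PHASE changes or when the WAIT expires.
    """
    last_seen = None
    while True:
        phase = fetch_phase(wait_seconds)
        if phase != last_seen:
            # Only report a PHASE the first time it is observed.
            yield phase
            last_seen = phase
        if phase in TERMINAL_PHASES:
            return
```

A small WAIT bounds how stale the client's view can get, but a PHASE that starts between two requests is still only noticed when the next request returns.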

> 
> To avoid the problem you need an atomic operation that both
> determines current phase and requests blocking until phase changes.
> (An analogous issue is why in Java you are only allowed to call
> Object.wait() if you have synchronized on that object).
> 
> One possibility would be for the Job document to include a blocking
> URL that anybody is allowed to call to find out when the status
> changes from the status reported in that document (if it's not
> obvious how that could be implemented, I can provide a sketch).
> Another is what I suggested when I first emailed about this issue
> in relation to the previous WD here:
> http://mail.ivoa.net/pipermail/grid/2014-June/002609.html

I think I prefer your previous suggestion as the solution to this - it seems pretty simple for both the client and the server. I would, though, reverse the logic of the parameter compared with your example, as that seems clearer to me:

/{jobs}/{job-id}?WAIT=20&PHASE=QUEUED

will block only if the PHASE is QUEUED (or whatever is specified); otherwise it will return immediately.
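
To make the intended semantics concrete, here is a minimal server-side sketch in Python. It is not taken from the WD: the Job class, its method names and the use of threading.Condition are all assumptions made for illustration. The point it shows is that the comparison against the client-supplied PHASE and the decision to block happen under the same lock, so no PHASE change can slip in between them.

```python
import threading


class Job:
    """Hypothetical in-memory job used to illustrate WAIT plus PHASE."""

    def __init__(self, phase="PENDING"):
        self._phase = phase
        self._changed = threading.Condition()

    def set_phase(self, phase):
        # Change the PHASE and wake every blocked request.
        with self._changed:
            self._phase = phase
            self._changed.notify_all()

    def get(self, wait=0, phase=None):
        """Emulate GET /{jobs}/{job-id}?WAIT=wait&PHASE=phase.

        Blocks for up to `wait` seconds, but only while the current PHASE
        still equals `phase`; otherwise returns immediately.
        """
        with self._changed:
            if phase is not None and wait > 0:
                # The check and the wait are atomic because both run while
                # holding the condition's lock.
                self._changed.wait_for(lambda: self._phase != phase,
                                       timeout=wait)
            return self._phase
```

In the sequence from the earlier mail, a client that last saw QUEUED would call job.get(wait=20, phase="QUEUED"); if the server had already moved to EXECUTING the call returns at once with EXECUTING rather than blocking until COMPLETED.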

Does this do what you need?

Cheers,
	Paul.