new UWS 1.1 WD

Paul Harrison paul.harrison at manchester.ac.uk
Tue Oct 7 16:37:59 CEST 2014


On 2014-10-07, at 14:45, Mark Taylor <M.B.Taylor at bristol.ac.uk> wrote:

> Paul & grid,
> 
> On Tue, 30 Sep 2014, Paul Harrison wrote:
> 
>> I have uploaded a new version of the UWS 1.1 WD to http://www.ivoa.net/documents/UWS/20140930/ which contains changes which are largely as a result of the discussion that happened in this thread http://mail.ivoa.net/pipermail/grid/2014-June/002609.html
>> 
>> In summary the main change from the previous version is that the blocking behaviour introduced in that version has been moved from a specific custom endpoint to being signalled by a ?WAIT query parameter on the /{jobs}/{job-id} endpoint.
> 
> I still think there is potential for a race condition here.
> 
> Consider this sequence of events:
> 
>   1. Client requests status from server
>   2. Server returns status: it's QUEUED
>   3. Server changes status to EXECUTING
>   4. Client makes blocking call to server to find out when status changes
>   5.   ... wait ...
>   6. Server changes status to COMPLETED
>   7. Blocking call returns, client finds out that status is COMPLETED
> 
> As far as the client is concerned, the job transitions from QUEUED
> straight to COMPLETED, it never sees the (potentially long-lived)
> EXECUTING phase.  With the existing REST API there's nothing the
> client can do to reliably avoid this (well, it could asynchronously
> issue another non-blocking status request after the blocking one
> has started to check it hasn't changed, but that's (a) messy and
> (b) not bulletproof since you don't know when the blocking call
> actually starts blocking, i.e. how long to wait after the start
> of the blocking call before you do it).
> 
> Now this won't happen very often, it's only in the unlucky case that the
> server changes phase just after the client makes the status request.
> Also, the consequences are not disastrous, since the terminal phases
> (here COMPLETED) don't block.  Maybe for those reasons we don't
> care enough to do anything about it.  But, it's not Right.

OK - I agree with you that it is not Right if you want to monitor PHASEs before COMPLETED, and I had not really considered this use case. The WAIT idea does mitigate this to some extent: if the client wants to inform the user about intermediate stages on a regular basis, it would presumably issue a WAIT=10 (or some other small number), which is guaranteed to return after at most 10 seconds, whether there has been a PHASE change or not. However, this still does not guarantee the earliest possible notice of a change of PHASE in the situation you describe.
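
The short-WAIT polling described above could be sketched roughly as follows. This is only an illustration under stated assumptions: `get_phase` is a hypothetical callable standing in for a GET on /{jobs}/{job-id}?WAIT=n that returns the PHASE as a string, and `monitor` is a made-up name; neither comes from the WD.

```python
# Terminal UWS phases; a blocking request on a job in one of these
# returns at once.
TERMINAL = {"COMPLETED", "ERROR", "ABORTED"}

def monitor(get_phase, wait=10):
    """Yield each distinct PHASE the client observes, up to a terminal one.

    get_phase(wait) is a hypothetical helper: it is assumed to perform
    GET /{jobs}/{job-id}?WAIT=<wait> and return the PHASE string.
    """
    last = None
    while True:
        phase = get_phase(wait)   # assumed to return after at most `wait` seconds
        if phase != last:
            yield phase           # report only actual changes
            last = phase
        if phase in TERMINAL:
            return
```

A change that happens in the gap between one request returning and the next one starting to block is only noticed when the following WAIT expires, so this bounds the delay rather than removing it.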

> 
> To avoid the problem you need an atomic operation that both
> determines current phase and requests blocking until phase changes.
> (An analogous issue is why in Java you are only allowed to call
> Object.wait() if you have synchronized on that object).
> 
> One possibility would be for the Job document to include a blocking
> URL that anybody is allowed to call to find out when the status
> changes from the status reported in that document (if it's not
> obvious how that could be implemented, I can provide a sketch).
> Another is what I suggested when I first emailed about this issue
> in relation to the previous WD here:
> http://mail.ivoa.net/pipermail/grid/2014-June/002609.html

I think that I prefer your previous suggestion as a solution to this. It seems pretty simple for both the client and the server, though I think it would be clearer to reverse the logic of the parameter compared with your example.

/{jobs}/{job-id}?WAIT=20&PHASE=QUEUED

will only block if the PHASE is QUEUED (or whatever is specified); otherwise it will return immediately.
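
A minimal sketch of how a server might implement this, assuming a simple in-memory job guarded by a condition variable. The `Job` class and its method names are hypothetical and not taken from the WD, and the sketch covers only the combined WAIT plus PHASE form.

```python
import threading

# Terminal UWS phases; requests naming one of these never block.
TERMINAL = {"COMPLETED", "ERROR", "ABORTED"}

class Job:
    """Hypothetical in-memory job record, for illustration only."""

    def __init__(self, phase="PENDING"):
        self._phase = phase
        self._changed = threading.Condition()

    def set_phase(self, phase):
        with self._changed:
            self._phase = phase
            self._changed.notify_all()   # wake any blocked requests

    def get_phase(self, wait=None, phase=None):
        """Handle GET /{jobs}/{job-id}?WAIT=wait&PHASE=phase."""
        with self._changed:
            # Block only if WAIT and PHASE were both given, the job is
            # still in the phase the client named, and that phase is
            # not terminal. Otherwise fall through immediately.
            if wait is not None and phase is not None and phase not in TERMINAL:
                self._changed.wait_for(lambda: self._phase != phase,
                                       timeout=wait)
            return self._phase
```

With a job already in EXECUTING, `job.get_phase(wait=20, phase="QUEUED")` returns straight away, so a client that last saw QUEUED learns about EXECUTING without delay. The phase test and the wait both happen while the condition's lock is held, which is the same point as the Object.wait() analogy.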

Does this do what you need?

Cheers,
	Paul.



