UWS 1.1 working draft

Paul Harrison paul.harrison at manchester.ac.uk
Tue Jun 3 23:58:43 PDT 2014


Hi Mark/Pat

We certainly could expand the blocking behaviour beyond what is in the WD. I kept it reasonably conservative (i.e. blocking only on the change out of the EXECUTING phase) as I felt that was the general consensus of the meeting. At a minimum, if we want to restrict the blocking to phase, I think we should adopt Mark’s idea so that different phase transitions can block.
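
For illustration only, Mark’s suggestion (quoted below) of a parameter naming the phase the client last saw might behave like the following minimal Python sketch. The Job class, its method names and the timeout argument are hypothetical and not part of the WD; only the not_equal_to parameter name is taken from Mark’s message.

```python
import threading


class Job:
    """Toy in-memory job used only to illustrate the idea (hypothetical)."""

    def __init__(self, phase="PENDING"):
        self._phase = phase
        self._cond = threading.Condition()

    def set_phase(self, phase):
        # Record the new phase and wake every blocked caller.
        with self._cond:
            self._phase = phase
            self._cond.notify_all()

    def blocking_phase(self, not_equal_to, timeout=None):
        """Return the phase as soon as it differs from not_equal_to.

        Returns immediately if the phase already differs; otherwise waits
        for a change, or for the (assumed) timeout to expire, and then
        returns whatever the phase is at that moment.
        """
        with self._cond:
            self._cond.wait_for(lambda: self._phase != not_equal_to,
                                timeout=timeout)
            return self._phase
```

Because the client states which phase it believes the job is in, a change that happened before the call is reported at once rather than missed.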

We could go further and do as Pat/Dave say: allow the endpoint to block until the UWS deems that something “interesting” has happened. If that is adopted, we should change the blocking URL from {jobid}/blockingphase to something with a more general end path segment. I agree that in this case it would be better to redirect to the job URL as the response, so that the client can detect what has changed (to allow for the partial results use case).

I think that we could avoid the race conditions that Mark is worried about with an explicit rule that the server never blocks when the job is in one of the terminal phases (as Pat said below). Additionally, saying that the server must always return when the phase changes (but can return sooner) should, I think, address Mark’s concerns.
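
As a rough sketch of how a client loop could rely on the "never block in a terminal phase" rule, consider the Python below. The function and class names are hypothetical, and the set of terminal phases (COMPLETED, ERROR, ABORTED) is assumed from UWS 1.0. The toy job deliberately completes between the client's phase read and its blocking call, which is the worst case Mark describes.

```python
# Terminal phases, assumed from UWS 1.0.
TERMINAL = {"COMPLETED", "ERROR", "ABORTED"}


def follow(read_phase, wait_for_change, show):
    """Display each phase read until the job reaches a terminal phase."""
    while True:
        phase = read_phase()
        show(phase)
        if phase in TERMINAL:
            return phase
        # Under the proposed rule this returns at once if the job has
        # already reached a terminal phase.
        wait_for_change()


class RacyJob:
    """Toy job that completes right after the first phase read (hypothetical)."""

    def __init__(self):
        self.phase = "EXECUTING"
        self.waits = 0

    def read_phase(self):
        current = self.phase
        self.phase = "COMPLETED"  # the race: phase changes after the read
        return current

    def wait_for_change(self):
        self.waits += 1
        if self.phase in TERMINAL:
            return  # rule: never block in a terminal phase
        raise RuntimeError("would block indefinitely in this toy model")
```

This only shows that the loop cannot hang once the job has finished; it does not show that every intermediate transition is seen by the client.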

Making this full blocking generalisation seems to me a pretty sensible suggestion. My only real concern is that we might be missing something that breaks backward compatibility and so would not allow this to be a “1.1” release. If no-one comes up with any objections I will make the changes to the WD.

Paul.

On 2014-06-04, at 00:17, Mark Taylor <M.B.Taylor at bristol.ac.uk> wrote:

> Pat,
> 
> I'm fine with the sentiment of this: blocking poll returns whenever
> anything changes, and leaves you to go and find out the new state
> of the job.  But it's hard to make it robust against race conditions
> unless the pre-change behaviour is defined implicitly (as in the text
> from the current WD) or explicitly (as in my suggestion).
> 
> If the behaviour is simply, as you say:
> 
>> - block until something in the job changes (any phase change, addition of
>> result(s), etc) and then redirects (to the job url)
> 
> then to, e.g., wait for a change you will have to do something like:
> 
>    do {
>        phase = read(job-url/phase)
>        displayToUser(phase)
>        waitFor(job-url/blocking-phase)
>    } while ( isNotTerminal(phase) )
> 
> but the trouble is that between the invocations:
> 
>   phase=read(job-url/phase) 
> 
> and
> 
>   waitFor(job-url/blocking-phase)
> 
> the phase might have changed and you'd never know, so you could
> miss a transition, or in the worst case (e.g. if it went from
> EXECUTING to COMPLETED between those calls) be sat there for ever,
> or at least until timeout, waiting for a change that would never
> happen.
> 
> Mark
> 
> On Tue, 3 Jun 2014, Patrick Dowler wrote:
> 
>> 
>> When Dave and I were talking about this on the bus, we came to the conclusion
>> that the block behaviour could be quite simple and more general:
>> 
>> - block until something in the job changes (any phase change, addition of
>> result(s), etc) and then redirects (to the job url)
>> 
>> - if the job is in a state where it cannot change (one of the terminal phases)
>> then the resource no longer blocks, immediate redirect
>> 
>> 
>> I think this way you don't really need anything extra at all. The client does
>> need to check the job to see exactly what changed, but they can just get the
>> phase if that is what they care about.
>> 
>> The client is "notified" of every state change, but we don't ever convey the
>> change itself in the notification (typical event-handler patterns try to do
>> that but the payload continually changes as you adapt to new use cases; I
>> think we need to avoid that trap). For example, say we decided to add (or
>> allow as an extension) a progress indicator in the job; an implementer could
>> unblock whenever the progress indicator changed - the payload doesn't have to
>> change and the client could decide to get the phase, the progress indicator,
>> or the whole job.
>> 
>> In principle, the response from the block could still be the (text/plain)
>> phase, but I find that to be marginally misleading if other changes of state
>> are also triggering the unblock... I'm more in favour of redirecting to the
>> job url when unblocking as it is semantically correct although less efficient
>> in many typical cases.
>> 
>> 
>> One of Dave's use cases was that they want the client to be able to see and
>> react to results as they are added to the job (during the EXECUTING phase). I
>> think being able to expose partial results is a nice feature that enables some
>> interesting things without changing anything for those that don't need this.
>> 
>> My personal use case was similar: a web UI issuing TAP queries and trying to
>> avoid (i) hammering the service with polling and (ii) extra apparent latency
>> by polling less frequently.  Dave's idea to expose partial results would also
>> allow one to support starting to read the async results before they were
>> completely written, which would let me have the robustness of async with the
>> faster response of sync.
>> 
>> Pat
>> 
>> On 03/06/14 09:33 AM, Mark Taylor wrote:
>>> Paul+GWS,
>>> 
>>> the Blocking Endpoints addition is a great idea, I hadn't noticed this
>>> was under consideration, sorry I couldn't attend the relevant GWS
>>> session at ESAC since I was elsewhere.  However, if I understand
>>> the current proposal as written in WD-UWS-1.1-20140527, it doesn't
>>> quite fit my requirements at least.
>>> 
>>> In topcat and stilts, I don't just want to wait for the job to complete,
>>> I'd like to display the current phase to the user as it transitions
>>> between phases, in particular the potentially long-lived phases not
>>> under client control, i.e. QUEUED, EXECUTING and maybe SUSPENDED.
>>> Since the current proposal only lets me do a long poll to detect
>>> the end of EXECUTING, I'm still going to have to do normal polling
>>> to find out when EXECUTING starts.
>>> 
>>> I thought about suggesting that the blockingphase endpoint should
>>> return whenever the phase changes, but that doesn't really work
>>> because you don't know for sure what the phase was when you called it.
>>> So instead, could that endpoint have a parameter, something like
>>> 
>>>    /{jobs}/(job-id)/blockingphase?not_equal_to=EXECUTING
>>> 
>>> which returns immediately if the phase is different from the
>>> given parameter value (EXECUTING in the example above), and
>>> otherwise blocks until it changes?
>>> 
>>> Mark


