TAP Implementation issues (cont'd): UWS

Guy Rixon gtr at ast.cam.ac.uk
Mon Nov 2 09:13:14 PST 2009


When you're actually getting a representation of a resource with
HTTP-GET, it's possible to defeat the caching using HTTP headers in the
responses. (I admit to being hazy about the details; I've never had to  
set this up explicitly.) This is part of HTTP and there is quite a lot  
about it in RFC2616:

http://www.w3.org/Protocols/rfc2616/rfc2616-sec13.html#sec13

My experience is that it Just Works if you use GET only for its proper  
purpose.

You might also manage cache transparency if you were misusing HTTP-GET  
to change or create a web resource. However, note section 13.4 of the  
RFC which seems to suggest subtle behaviour by some user agents. Even  
if it seems to work it's still against the rules.
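
As a rough sketch of the kind of response headers meant above (an
illustration only, not taken from UWS or from any particular service):
the handler below serves a job list on GET and marks the response as
not to be cached. The /jobs path, the JOBS list and the names
NO_CACHE_HEADERS, JobListHandler and make_server are all made up for
the example; the header values are the standard ones from RFC 2616
(sections 14.9, 14.21 and 14.32).

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Headers asking HTTP/1.1 and HTTP/1.0 caches not to reuse a stored copy.
NO_CACHE_HEADERS = {
    "Cache-Control": "no-cache, no-store, must-revalidate",  # HTTP/1.1 caches
    "Pragma": "no-cache",   # HTTP/1.0 caches
    "Expires": "0",         # invalid date, treated as already expired
}

# Hypothetical in-memory job list standing in for a real job store.
JOBS = ["job-1", "job-2"]


class JobListHandler(BaseHTTPRequestHandler):
    """Serves the (assumed) /jobs resource with caching switched off."""

    def do_GET(self):
        if self.path != "/jobs":
            self.send_error(404)
            return
        body = "\n".join(JOBS).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        for name, value in NO_CACHE_HEADERS.items():
            self.send_header(name, value)
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the sketch quiet


def make_server(port=0):
    """Build a server bound to localhost; port 0 picks a free port."""
    return HTTPServer(("127.0.0.1", port), JobListHandler)
```

Calling make_server(8080).serve_forever() would serve the list at
http://127.0.0.1:8080/jobs. Whether every intermediate cache honours
these headers is a separate question.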

Guy


On 2 Nov 2009, at 16:56, Tom McGlynn wrote:

> Guy Rixon wrote:
>> Tom,
>> if you want to get the job list then go ahead and do HTTP-GET.
>> That's part of UWS (although implementations may restrict the set
>> of jobs reported to those owned by the caller). What you can't
>> do is use HTTP-GET to submit a query via UWS. If you want to use
>> GET to do a query then you're doing a synchronous query by
>> definition.
>> Cheers,
>> Guy
>
> But I recall from earlier in this discussion some clever fellow  
> said...
>
>> GET responses can be cached, and the caching is out of your control
>> as a service provider - it may be on the user's LAN (HTTP proxy) or
>> in their client (browser cache). If you send the same query twice
>> via GET, then for the second request you could get the response for
>> the first, pulled from the cache, and no new job.
>
> If what you said earlier is correct, then I can't rely on what I get  
> back from a GET call.  I might be getting a cached response and not  
> the current state of the system.  If caching is truly an issue, then  
> there doesn't seem to be any reliable way to get the job list.
>
> Tom
>
>> On 2 Nov 2009, at 14:13, Tom McGlynn wrote:
>>> Paul Harrison wrote:
>>>> Guy has already done a good job of answering most of these points  
>>>> - I
>>>> The UWS design of the two stage process is for two principal  
>>>> reasons
>>>> a) to be able to manipulate job metadata parameters before the  
>>>> job is
>>>> run - e.g. the DestructionTime - and receive the feedback from the
>>>> service whether it is prepared to honour such requests before   
>>>> actually
>>>> committing the job.
>>>> b) to allow complete parameter namespace freedom on the job  
>>>> creation
>>>> step - i.e. if PHASE is used by UWS then it could not be a  
>>>> parameter
>>>> for the implementation protocol.
>>>> So if for a particular implementation using UWS there is no problem
>>>> with meeting that second condition, then there is no particular   
>>>> reason
>>>> why job metadata parameters could not be included with the  
>>>> initial  UWS
>>>> job creation step if desired - this would require revision of the  
>>>> UWS
>>>> specification to include this possibility - I think that this is a
>>>> small enough change to be added into the document as part of the   
>>>> RFC -
>>>> it does have a larger impact on possible service implementers  
>>>> however
>>>> - their code might not be structured to allow this change easily.  
>>>> For
>>>> a generalized UWS client the change would not be so great, all that
>>>> would happen is that after the initial submission a job object  
>>>> would
>>>> be returned with the PHASE=EXECUTING, and general clients should  
>>>> not
>>>> make any assumptions about state in their coding, so should  
>>>> probably
>>>> still be able to react appropriately.
>>>> Just as a side note to show that UWS is not so strange in this
>>>> multiple interaction between client and server - consider what   
>>>> happens
>>>> when a web browser loads a web page - it does the initial get of  
>>>> the
>>>> html, then parses this html and then gets images, javascript etc.
>>>> before the page is shown to the user.
>>> I trust the goal is not to require that UWS services need to have   
>>> the complexity of an interactive visual Web browser.  The  
>>> protocol  should cater to simple applications as well.
>>>
>>> I'd be perfectly happy with a change that made it clear that the   
>>> request that created the job could return it in any state.  In  
>>> fact,  even without the desire to be able to start jobs at  
>>> creation, that  is probably needed to accommodate the situation  
>>> where there is a  problem detected in creating the job but we want  
>>> the user to be able  to parse the error with IVOA protocol level  
>>> error handling.
>>>
>>> The specific text that I find problematic is in 2.1.3:
>>>
>>> PENDING: the job is accepted by the service but not yet committed   
>>> for execution by the client.  In this state the job quote can be   
>>> read and evaluated.  This is the state into which a job enters  
>>> when  it is first created.
>>>
>>> in conjunction with
>>>
>>> 2.2.3.1 Creating a job.
>>>
>>> Posting a request to the job list creates a new job (unless the   
>>> service rejects the request).
>>>
>>>
>>>
>>> I would suggest something like the following changes
>>> -
>>> In 2.1.3 add somewhere
>>>
>>>  Phases are ordered with PENDING before QUEUED, QUEUED before   
>>> EXECUTING and EXECUTING before the trio of COMPLETED, ABORTED and   
>>> ERROR.  The state of a job may change only to a later state but  
>>> need  not pass through any intermediate state.  A job may be  
>>> created in  any state.
>>> -
>>> Delete the last sentence in the definition of PENDING.
>>> -
>>>
>>>
>>> I think it would be desirable to suggest how a job could be
>>> created in the run state even if this is not required by the
>>> standard.  It would be possible to do this without polluting the
>>> parameter name space by specifying a new URI for that.  E.g.,
>>> ${jobs} creates a new request but does not start it.  ${jobs}/run
>>> could create and start the job if that is permitted in the given
>>> service.  However I find  the worry about pollution of the  
>>> parameter name space less than  compelling, since we require  
>>> certain parameters to be used in calls  to start the job running  
>>> or to alter other aspects of the job.  It  would be poor practice  
>>> to have phase= mean one thing in a job  creation request and mean  
>>> something else in a job update request.   In effect, these  
>>> parameters are reserved already.
>>>
>>>
>>> This discussion brought up another thing that's not really clear.   
>>> 2.2.3.1 has the little parenthetical phrase "(unless the service   
>>> rejects the request)" which is neither explained, nor is the  
>>> action  implied in rejection specified.  I think something like:
>>>
>>>  Errors may occur in the creation of the job.  Where possible a
>>> job should be created in the ERROR phase with an error message
>>> that describes the problems.  If this is not possible, an HTTP
>>> 500 error must be returned.
>>>
>>> would be clearer for implementation and should replace that phrase.
>>>
>>>
>>> Finally,  one new issue/concern.  [This probably reflects a lack  
>>> of  understanding on my part but if so then perhaps it could be   
>>> clarified in the standard.]:  It doesn't seem like there is any   
>>> valid way to get the current job list.
>>>
>>> I can't do a GET request for /${jobs} because that's cacheable
>>> and the list is dynamic.  And I can't do a POST request for it
>>> since a post to /${jobs} means that I'm creating a new job and
>>> I'm supposed to be redirected to the job information for the
>>> newly created job.
>>>
>>> So how do I get to it?
>>>
>>> In my TAP implementation I assume that any request to create a  
>>> new  job needs to have a REQUEST= parameter and if I don't see  
>>> this I  return the job list.  If I do, I create the new job.   
>>> However this  doesn't seem to be literally correct.  I suppose you  
>>> could say that  I'm 'rejecting' the request and since that  
>>> behavior is undefined I  can do anything I want. Relying on  
>>> undefined behavior doesn't seem  satisfactory.
>>>
>>> 	Tom
>>>
>>>
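
The phase ordering Tom proposes for 2.1.3 in the quoted text above can
be sketched as a small transition check. This illustrates the proposal
only, not the UWS specification; the names PHASE_RANK and
can_transition are invented for the example.

```python
# Rank of each phase under the proposed ordering. COMPLETED, ABORTED and
# ERROR share the last rank, so a job cannot move between them.
PHASE_RANK = {
    "PENDING": 0,
    "QUEUED": 1,
    "EXECUTING": 2,
    "COMPLETED": 3,
    "ABORTED": 3,
    "ERROR": 3,
}


def can_transition(current, new):
    """Return True if a job may move from `current` to `new`.

    Only moves to a strictly later phase are allowed; intermediate
    phases may be skipped.
    """
    return PHASE_RANK[new] > PHASE_RANK[current]
```

Under this reading, a job created in PENDING may jump straight to
EXECUTING or ERROR, but can never go back to an earlier phase. A job
"created in any state" simply starts at whatever rank the service
chooses.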
