<html><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; "><br><div><div>On 30 Oct 2009, at 15:58, Tom McGlynn wrote:</div><br class="Apple-interchange-newline"><blockquote type="cite"><div>Guy Rixon wrote:<br><blockquote type="cite">On 30 Oct 2009, at 13:45, Tom McGlynn wrote:<br></blockquote><blockquote type="cite"><blockquote type="cite">I've had a few questions about the implementation of asynchronous access for TAP.  Most of these are relevant to the UWS document generally rather than just TAP, so I've copied the GWS group on this mail.<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">Tom McGlynn<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">-- UWS general questions --<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">- As defined, a user always needs to do two web actions to start the service.  Is there some reason that the user cannot simply request the service to start running immediately?  I suspect that that is what the user wants to do in 99% of the cases.  It would be much easier for clients too.  The example given in the UWS document of starting a job omits the error checking that a user presumably should do after starting the job.  Why not allow<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">    {root}/tap/async?request=doQuery&amp;query=...&amp;phase=RUN<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">to both create and start a query?  
[Describe it as a POST if you prefer.]<br></blockquote></blockquote><blockquote type="cite">That would work, but would have to be POSTed.<br></blockquote>While this is what I want, I think it's not what the standard currently specifies.  E.g.,<br>  UWS 2.1.3  PENDING ... This is the state into which the job enters when it is first created.<br><br>  UWS 2.2.3.5 A job may be started by POSTing to the /{$jobs}/{job_id}/phase URI. ...<br><br>No other way of starting the job is specified.  Note also that 2.2.3.5 says nothing about the current state of the job (vis a vis the discussion a couple of points below).<br></div></blockquote><div><br></div><div>Agreed. It's a possible revision to UWS.</div><br><blockquote type="cite"><div><br><blockquote type="cite"><blockquote type="cite"><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">- I continue to be confused by the benefits conferred by various practices.  Why do we require POSTs specifically in a number of instances?  E.g., what would be wrong with using<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">   ../jobid/phase?phase=RUN<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">as an HTTP GET rather than an HTTP POST?  Since my code cannot tell the difference between these I certainly will be supporting both, but, other than bowing to the REST mantra, I'm not sure why it's supposed to matter.<br></blockquote></blockquote><blockquote type="cite">Whether or not a web service follows REST principles, it /has/ to distinguish between requests that change the service state and requests that are idempotent. This is a basic part of HTTP. Starting a job changes the service state by creating web resources for that job. Sending the same query twice gets you two jobs doing the same query; they have separate web-resources. 
Therefore, not idempotent; therefore a POSTed request.<br></blockquote><blockquote type="cite">GET responses can be cached, and the caching is out of your control as a service provider - it may be on the user's LAN (HTTP proxy) or in their client (browser cache). If you then send the same query twice via GET, for the second request you could get the response for the first, pulled from the cache, and no new job. This doesn't happen too often but when it does it's brain-bendingly hard to debug.<br></blockquote><blockquote type="cite">I suggest that your code must not accept UWS create-job requests via HTTP GET. Your users won't like it if they get given the wrong answer from a cache. And Google tend to spider all the GETable URLs so you don't want them creating jobs.<br></blockquote><blockquote type="cite"><br></blockquote>There may be costs associated with having to deal with the caching of GET requests and I should have been more temperate here.  I've occasionally run into this myself when building AJAX services.  But there are also substantial benefits to being able to use GET requests and in practice I find that these greatly outweigh the costs in all the cases that I've had to deal with.<br></div></blockquote><div><br></div><div>It's not a trade-off. You cannot provide state-changing operations via HTTP-GET and comply with the HTTP RFC. It's simply illegal.</div><br><blockquote type="cite"><div><br><blockquote type="cite"><blockquote type="cite"><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">- Similarly I don't think that there should be a strict limitation on the encoding used in sending requests.  There may be a requirement that a given encoding be supported but there should not be a requirement that it be used.  
As with POST/GET this level of HTTP detail is handled below the UWS logic in our implementation, so for services that the HEASARC supports we'll be allowing multipart/form encoding as well unless someone can tell us why we should reject such requests.<br></blockquote></blockquote><blockquote type="cite">You're free to support this as well as the stated encoding because it doesn't break anything. However, if you write a client that assumes this encoding then it won't work on all implementations. So it seems pointless to add the feature even in the service.  Personally, I think that supporting broken clients in this way is not helpful.<br></blockquote><br>My concern here is that you are coupling the UWS standard to a lower level of detail in the HTTP protocol than is necessary.  E.g., suppose we have a UWS service that includes file upload parameters.  Such a service is going to use multipart/form encoding for some of its interactions.  As the standard is currently written it must switch back and forth between encodings depending upon what's being done.<br></div></blockquote><div><br></div><div>The encoding is only mandated for POSTs to start a job, to abort a job and to change the destruction time. It's not specified for the POST to create a job because that POST is specified in the <i>application</i> of UWS, e.g. in TAP.</div><div><br></div><div>We have to specify some encoding; we can't leave it out or we get no interoperability. That's because this is about web services. If it were a web-browser interface with its own HTML forms then we'd leave the encoding out of the spec and fix it in the HTML because both the service and the form would be in one implementation.</div><div><br></div><br><blockquote type="cite"><div><blockquote type="cite"><blockquote type="cite"><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">- From what states is a user allowed to start a job?  
E.g., can a user attempt to restart a job that has previously had an error or been aborted? Could the user change the parameters and then rerun the same job?  I'm guessing this isn't supposed to happen, but I didn't see where it was forbidden.<br></blockquote></blockquote><blockquote type="cite">You can only change things while it's pending. If you need to re-run a failed job then you have to resubmit it.<br></blockquote>I don't see this stated in the protocol anywhere.  There is a statement for phase ERROR that "... No further work will be done..." which might be taken to imply you cannot do anything with a job that failed with an error, but there is nothing anywhere else.<br><br>2.2.4 describes a message pattern, but the diagram is labeled 'Typical Message Pattern' and there are clearly a number of exceptions (e.g.,<br>when there is an error or abort before execution).<br><br>A statement somewhere that phases are ordered like<br>    PENDING<br>    QUEUED<br>    EXECUTING<br>    COMPLETED-ERROR-ABORTED<br>and that you can change only to a later state would clarify this.<br></div></blockquote><div><br></div><div>OK, this is "just" a change in the description and we can add this clarification.</div><br><blockquote type="cite"><div><blockquote type="cite"><blockquote type="cite"><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">- What is supposed to happen if there is a problem in creating the job?<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">Should a job be created with an immediate status of ERROR?  Is there any way of flagging an error if the system cannot create even an error job?  E.g., we're going to use the database to store all job information. What are we supposed to do if the database is down?  
It would be nice to be able to inform the user of an error in a standard way.<br></blockquote></blockquote><blockquote type="cite">If you can create the job at all then you should immediately set the phase to ERROR and make the error document available. If you can't do this, then I guess giving up with a 500 "I'm completely stuffed" error is reasonable. By extension, UWS clients need to deal minimally with 500 errors as well as with proper error-documents.<br></blockquote>...<br><blockquote type="cite"><blockquote type="cite"><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">-- TAP specific questions. --<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">- The description of where to get the TAP result in an async request is not given (as far as I can see) in what is described as the normative parts of the document.  There it says that result will be in<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">  root/async/jobid/results/<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">but this is the list of results and can, in principle, contain a number of results. Only in section 5.2 which is described as informative does it say that the result document is .../results/result.  Is this actually  a requirement or can the result be named anything?<br></blockquote></blockquote><blockquote type="cite">In my service implementation, I take it to be a requirement. In my client implementation it currently assumes the one result with the standard name but I plan to make it parse the list. 
(In case we add to the results list in future TAP versions.)<br></blockquote>I think we have to take it as a requirement now, but it really should be specified in a normative section of the document (or changed in response to the issue I raised below).<br><br><blockquote type="cite"><blockquote type="cite"><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">- The UWS standard discusses the naming of results.  Does TAP require a specific name for the result?  In fact it looks like, the way UWS is supposed to be used, jobid/results returns a document that looks like<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">   &lt;results&gt;<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">      &lt;result id='someid' xlink:href='someurl' /&gt;<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">      &lt;result id='anotherid' xlink:href='anotherurl' /&gt;<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">   &lt;/results&gt;<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">and the user is supposed to find the id of the desired result and use whatever URL is given there, not use a specifically defined URL.  I'm guessing that the id attribute of the &lt;result&gt; elements is the UWS name. The UWS standard says<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"> "When a protocol specifies standard results it must do so by<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">  naming those results; the names appear in the Results list in<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">  addition to the URI's.  
Not all results need to be named, sometimes<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">  the meaning of the result is obvious from the context and the<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">  name is omitted."<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">Since the second sentence here seems to contradict the first it's a bit hard to follow, but my reading of this is that it would be better for TAP to specify a name for the output result rather than a specific URL.<br></blockquote></blockquote><blockquote type="cite">For a given service-protocol incorporating UWS, a result can be in one of three cases:<br></blockquote><blockquote type="cite"> - formally named and mandated by the protocol: name is fixed; result must be present when status=COMPLETED; clients can assume these things and bypass the results list;<br></blockquote><blockquote type="cite"> - formally named and made optional by the protocol: name is fixed, result might not be present on job completion; clients can either use the results list to find whether it's there or just get it and handle the 404 if it's missing;<br></blockquote><blockquote type="cite"> - not formally named: neither URI nor presence is predictable: clients must use the links in the results lists to find these results.<br></blockquote><blockquote type="cite">TAP has one result that is both named and mandated and nothing in the other two categories.<br></blockquote><br>According to the UWS protocol -- where I grant it is a bit unclear so I'm working partially from the UWS &lt;job&gt; example though the text quoted above certainly supports it -- the name of the result is independent of the URI used to access it.  Thus as far as I can tell TAP mandates a result, but does not -- in this UWS sense -- name it.  TAP specifies only the URI.  That seems to be a violation of the UWS standard.<br></div></blockquote></div><br></body></html>