UWS 1.1 comments

Thu Oct 20 00:06:12 PDT 2011

Hi,

Given the difficulties with my attempt to join the Registry session the other day via Skype, I thought that I would send an email with my thoughts just in case Skype fails us again....

Here is my response to the presentation by Normand, Malapert & Le Sidaner http://www.ivoa.net/internal/IVOA/InterOpOct2011GWS/normand_malapert_lesidaner_uws11_IVOA2011.pdf - it is slightly unsatisfactory, again, that this has ended up as an email rather than a "real-time" conversation.

In general
--------------

1. The "REST purity" of the UWS design was constrained by some pragmatic criteria (and anyway no-one should be a REST zealot ;-))

  * it should be possible to drive a UWS service from most browsers (without javascript)
  * it should fit in with existing patterns of use of the S*AP protocols

2. Any changes to UWS need to be backwards compatible so as not to invalidate any existing services - this is particularly important as UWS is used by TAP. This means in general that existing behaviours cannot be changed, and any new behaviours cannot be mandatory - unless I guess that there is a majority amongst the people who have already implemented services and clients to make the "disruptive" change.

3. UWS is meant to be able to act as a uniform façade onto a grid/cloud backend whilst simplifying and unifying some common concepts of job control systems. Several of the suggestions are to remove features which might be relevant in a more generalized model of a job control system, but which for now do not seem to have an immediate need. It should be noted that in order to reach a consensus on UWS 1.0 in a timely manner, other metadata (e.g. quotas, priorities) that was not absolutely necessary for basic operation was not included. So I would expect future versions of UWS to add to the metadata by including these more difficult-to-generalize areas.

Specifically
--------------

Looking at the presentation page by page.

1. Page 4
   The reason for the two-step execution of a job is so that the job and its parameters can be evaluated by the service, and changes can then be made to the job metadata that might affect the execution of the job. It was already recognised that a client might want to just accept all the job execution defaults and put the job into an executing state in one operation, by including PHASE=RUN amongst the initial job parameters - see http://www.ivoa.net/Documents/UWS/20101010/REC-UWS-1.0-20101010.html#d1e1375. Perhaps this needs to be made clearer. It could also be explicitly separated from the JDL by saying that, during the job creation step, job metadata parameters are passed only in the "query" part of the creation URI and not in the POST body.
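
   To make the one-step case concrete, here is a minimal Python sketch of the creation request as UWS 1.0 describes it, with PHASE=RUN sent alongside the job parameters. The service URL and the "target" parameter are made up for illustration; the request is only built, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

JOBS_URL = "http://example.org/uws/jobs"  # hypothetical service endpoint


def build_create_request(jobs_url, job_params, run_immediately=False):
    """Build (but do not send) the POST that creates a UWS job."""
    fields = dict(job_params)
    if run_immediately:
        # UWS 1.0: PHASE=RUN amongst the initial parameters starts the job at once.
        fields["PHASE"] = "RUN"
    body = urlencode(fields).encode("ascii")
    return Request(jobs_url, data=body, method="POST")


# "target" is an invented job parameter, not something defined by UWS.
one_step = build_create_request(JOBS_URL, {"target": "M31"}, run_immediately=True)
two_step = build_create_request(JOBS_URL, {"target": "M31"})
print(one_step.data.decode())
print(two_step.data.decode())
```

   In the two-step case the client would inspect the job that the service returns, adjust its metadata if it wants to, and only then POST PHASE=RUN to the job's phase resource.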

2. Page 5
   What is the tunnelling API that is mentioned wrt having only DELETE to delete jobs? Whilst responding to a POST with ACTION=DELETE is a small amount of extra work for the author of the UWS server side, they will be thanked by the author of a browser-based UWS client, who can do this with a one-button FORM that is guaranteed to work on just about every browser implementation without having to write any tricky JavaScript to send an HTTP DELETE.
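
   For comparison, a small Python sketch of the two equivalent ways of deleting a job. The job URL is invented; the second request carries the same body that a one-button HTML FORM would send from a browser.

```python
from urllib.parse import urlencode
from urllib.request import Request

JOB_URL = "http://example.org/uws/jobs/job1"  # hypothetical job resource


def delete_via_http_delete(job_url):
    """The 'pure' form: an HTTP DELETE on the job resource."""
    return Request(job_url, method="DELETE")


def delete_via_post(job_url):
    """The browser-friendly form: POST ACTION=DELETE to the job resource."""
    body = urlencode({"ACTION": "DELETE"}).encode("ascii")
    return Request(job_url, data=body, method="POST")


pure = delete_via_http_delete(JOB_URL)
tunnelled = delete_via_post(JOB_URL)
print(pure.get_method(), pure.full_url)
print(tunnelled.get_method(), tunnelled.full_url, tunnelled.data.decode())
```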

3. Page 6.
  I think that everyone is agreed that the Quote is difficult for the service to provide (it even says so in the standard http://www.ivoa.net/Documents/UWS/20101010/REC-UWS-1.0-20101010.html#Quote) and perhaps we should not have included it in the initial standard. However the service is allowed to say "don't know". ExecutionDuration is supposed to be roughly equivalent to CPU time. The CPU time is an exact measure and job control systems often limit the CPU time that can be given to a particular job. I agree that the language used in the description of ExecutionDuration is rather misleading, as the phrase "wall clock time" makes it seem like ExecutionDuration is just like Quote (though they can never be exactly equivalent, because a job may be suspended and it is not known exactly when a job will start once queued). The intention was to say that if CPU time was not available then wall clock time could be used as a measure, but the wording says that it should always be used. I think that the solution here is to make sure that the ExecutionDuration definition is more carefully worded.

4. Page 7.

There is a clear use case for users wanting to be able to set an ExecutionDuration if the service that they are using has quotas - they can stop a single job (possibly unexpectedly because of the input parameters) from using all of their quota in one execution and thus preventing other jobs from being run. Similarly, setting the DestructionTime when the job is initialized can be useful to make sure that the job (and associated storage) is deleted in a timely fashion - preventing the user from exceeding a storage quota if they are submitting many jobs in succession. Another use of this facility is that the default destruction time on a service may be less than the maximum that the service will allow, so the client can request more time than they would be given by default.
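
A sketch of what such a client might send, in Python, using the EXECUTIONDURATION and DESTRUCTION parameters on the job's child resources as in UWS 1.0. The job URL, the 600 second limit and the two day lifetime are arbitrary values chosen for illustration, and the requests are only built, not sent.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlencode
from urllib.request import Request

JOB_URL = "http://example.org/uws/jobs/job1"  # hypothetical job resource


def build_limit_requests(job_url, max_seconds, keep_for, now):
    """Build (but do not send) the POSTs that cap run time and job lifetime.

    `now` is assumed to be a timezone-aware datetime.
    """
    destruction = (now.astimezone(timezone.utc) + keep_for).strftime(
        "%Y-%m-%dT%H:%M:%SZ"
    )
    duration_req = Request(
        job_url + "/executionduration",
        data=urlencode({"EXECUTIONDURATION": max_seconds}).encode("ascii"),
        method="POST",
    )
    destruction_req = Request(
        job_url + "/destruction",
        data=urlencode({"DESTRUCTION": destruction}).encode("ascii"),
        method="POST",
    )
    return duration_req, destruction_req


start = datetime(2011, 10, 20, 7, 0, 0, tzinfo=timezone.utc)
duration_req, destruction_req = build_limit_requests(
    JOB_URL, max_seconds=600, keep_for=timedelta(days=2), now=start
)
print(duration_req.full_url, duration_req.data.decode())
print(destruction_req.full_url, destruction_req.data.decode())
```

The service is free to clamp either value, so a real client should read the job back afterwards to see what it was actually given.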

5. Page 8.

As already said you can start the job with all its parameters in one step already in V1.0 or you can do it in several steps using feedback from the server to fine tune the job metadata. Also note that you are not allowed to create new parameters after the initial POST, only potentially change their values - and again this is an optional feature for the server to support - the only place that the server must support setting parameter values is in the initial POST.

6. Page 9
The most basic use case for a user being able to abort a job is if they submit a job, start it executing, and then realise that they have made a mistake (e.g. with a parameter value) that will cause the job to run for days when it should only take minutes - they can be a good citizen and abort the job.
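
In UWS 1.0 terms that is a single POST of PHASE=ABORT to the job's phase resource. A minimal Python sketch, with an invented job URL and the request only built, not sent:

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_abort_request(job_url):
    """Build (but do not send) the POST that asks the service to abort a job."""
    body = urlencode({"PHASE": "ABORT"}).encode("ascii")
    return Request(job_url + "/phase", data=body, method="POST")


# Hypothetical job resource on a hypothetical service.
abort_req = build_abort_request("http://example.org/uws/jobs/job1")
print(abort_req.get_method(), abort_req.full_url, abort_req.data.decode())
```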

7. Page 10.

I am not a great fan of pagination myself (it is more complex for both the server author and the client author). However, if there is a perceived need for this facility at the meeting, then an HTTP GET on /{jobs} must return the whole list and *NOT* a paginated version of the list - mainly for backwards compatibility, but also because it makes no sense to return the paginated version (how big is the page?). The desire for pagination should always be indicated by the relevant parameters in the query part of the request URI. Pagination would also require a change to the UWS schema (to indicate that only part of the response is in the job list), which would be disruptive.

I actually feel that some standard filtering - e.g. only list jobs according to phase, jobs newer than date etc. would be better than pagination...
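
To illustrate the kind of filtering I mean, here is a Python sketch. The PHASE query parameter is purely hypothetical (nothing like it is in UWS 1.0), as are the URLs and the sample job list; the point is only that the bare /{jobs} URL stays unchanged and any filter lives in the query part. The second function shows the same filter done on the client side against a full job list, which works with any existing 1.0 service.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

UWS_NS = "http://www.ivoa.net/xml/UWS/v1.0"

# Invented example of a job list document.
SAMPLE_JOB_LIST = """<uws:jobs xmlns:uws="http://www.ivoa.net/xml/UWS/v1.0"
          xmlns:xlink="http://www.w3.org/1999/xlink">
  <uws:jobref id="job1" xlink:href="http://example.org/uws/jobs/job1">
    <uws:phase>COMPLETED</uws:phase>
  </uws:jobref>
  <uws:jobref id="job2" xlink:href="http://example.org/uws/jobs/job2">
    <uws:phase>EXECUTING</uws:phase>
  </uws:jobref>
</uws:jobs>"""


def job_list_url(jobs_url, **filters):
    """Bare URL means the whole list; filters appear only in the query part."""
    if not filters:
        return jobs_url
    return jobs_url + "?" + urlencode(filters)


def jobs_in_phase(job_list_xml, phase):
    """Client-side filter: ids of the jobs whose phase matches."""
    root = ET.fromstring(job_list_xml)
    return [
        ref.get("id")
        for ref in root.iter(f"{{{UWS_NS}}}jobref")
        if ref.findtext(f"{{{UWS_NS}}}phase") == phase
    ]


print(job_list_url("http://example.org/uws/jobs"))
print(job_list_url("http://example.org/uws/jobs", PHASE="EXECUTING"))
print(jobs_in_phase(SAMPLE_JOB_LIST, "EXECUTING"))
```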

8. Page 11.

Authentication is orthogonal to the UWS specification, and what it says about it (http://www.ivoa.net/Documents/UWS/20101010/REC-UWS-1.0-20101010.html#security) is probably sufficient, as authentication is dealt with by the http://www.ivoa.net/Documents/latest/SSOAuthMech.html standard. I have long been of the opinion that the SSOAuth standard is not really sufficient on its own for creating a practical SSO system, as it relies on X509 user certificates to achieve the "Single" part, and X509 certificates have not reached a wide enough user base. BTW your suggestion to use basic authentication is currently disallowed by the SSOAuth standard, and that is why UWS returns 403 for the areas that the user is not allowed to see: a 401 would imply that the client could try again using Basic Auth.

Anyway there is still much to be debated on the practical use of authentication mechanisms in the IVOA, but it is not directly a UWS 1.1 issue.

9. Page 12.

What you describe is just about the only option that you have in UWS1.0 if your JDL is not expressible as simple parameter/value pairs and cannot easily be the legal content of a <Parameter> element. The standard does say this, though perhaps only if you look in two places: section 2.2.2.4 http://www.ivoa.net/Documents/UWS/20101010/REC-UWS-1.0-20101010.html#d1e1353 and section 2.1.11 http://www.ivoa.net/Documents/UWS/20101010/REC-UWS-1.0-20101010.html#ResultsList2. However a client should be able to pick up a parameter value whether it is given "in line" or "by reference", so if the service can express its JDL as the legal content of a parameter value in line, then it may do so, although I agree that it is a better choice to do it "by reference".
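
A sketch, in Python, of a client treating the two forms uniformly by looking at the byReference attribute of the parameter element. The sample document, the parameter ids and the URL are invented, and fetching is passed in as a function (here a stub) so that the example does not touch the network.

```python
import xml.etree.ElementTree as ET

UWS_NS = "http://www.ivoa.net/xml/UWS/v1.0"

# Invented example: one in-line value and one by-reference value.
SAMPLE_PARAMETERS = """<uws:parameters xmlns:uws="http://www.ivoa.net/xml/UWS/v1.0">
  <uws:parameter id="maxrec">100</uws:parameter>
  <uws:parameter id="jdl" byReference="true">http://example.org/uws/jobs/job1/parameters/jdl</uws:parameter>
</uws:parameters>"""


def parameter_value(param_element, fetch):
    """Return the parameter value, dereferencing it if it is given by reference."""
    text = (param_element.text or "").strip()
    by_reference = param_element.get("byReference", "false").lower() in ("true", "1")
    return fetch(text) if by_reference else text


def read_parameters(parameters_xml, fetch):
    """Map each parameter id to its resolved value."""
    root = ET.fromstring(parameters_xml)
    return {
        p.get("id"): parameter_value(p, fetch)
        for p in root.iter(f"{{{UWS_NS}}}parameter")
    }


# Stub standing in for an HTTP GET of the referenced document.
FAKE_STORE = {"http://example.org/uws/jobs/job1/parameters/jdl": "<jdl>...</jdl>"}
values = read_parameters(SAMPLE_PARAMETERS, FAKE_STORE.__getitem__)
print(values)
```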

Other
-------

On some points from Dean Hinshaw's DataScope and UWS talk

* I think it is OK for results to appear before a job has finished - it is not against the spirit of the standard - indeed is part of the reason why Aborting a job can leave partial results.
* It is definitely not OK to return a result value in-line - it is invalid against the current schema - I cannot really remember why we did not allow both in-line and by-reference values for results as we do for parameters because it seems sensible to me now, but would be a disruptive change at this stage to allow content in the result element as it would require a 

Conclusion
---------------

I think that UWS 1.1 should be about clarification of the UWS 1.0 standard rather than attempting to make changes to the basic model - frankly it is too late for that now. I think that the presentation highlights areas where more explanation is needed. A future version beyond 1.1 could introduce new extended features to the UWS pattern.

Paul.

p.s. I think that I had almost forgotten about this page myself - http://www.ivoa.net/cgi-bin/twiki/bin/view/IVOA/UWSEnhancement - a place for suggestions for UWS enhancements.

Dr. Paul Harrison
JBCA, Manchester University
http://www.manchester.ac.uk/jodrellbank