UWS 1.0 suggestions

Sun Nov 22 23:59:50 PST 2009

On 2009-11-21, at 19:32, Petr Skoda wrote:

>
> Hi all,
>
> I hope that the UWS RFC will still be extended so that there is time to
> clarify some of the not-so-obvious issues. During the implementation
> phase we may still get some ideas and improvements worth
> standardizing.
>
> Having looked at http://www.ivoa.net/Documents/UWS/20090909/PR-UWS-1.0-20090909.html
>
> I have found errors in section 2.2.2.2 and appendix B:
> in both cases there is the typo "xlmns" instead of "xmlns".
> The xsd at http://www.ivoa.net/xml/UWS/v1.0 is syntactically OK.

you are right, thanks.

>
> Now about some suggestions:
>
> We are trying to use the UWS pattern to let people run specific
> computations on their datasets, driven by parameter files.
> So most of the extensions given below are specific cases from our
> experience - we wish to have the WWW browser as the only user
> interface to work with such a system (uploading jobs, running and
> stopping them, reading results, rerunning with different parameters,
> deleting, zipping, downloading the results, sending e-mails with
> results after disconnecting from the service totally). We assume
> deployment in a distributed environment, using remote
> queue submission (e.g. grid engines like qsub...).
>
>
> To facilitate comfortable interaction between the UWS service and
> a web browser (acting as a client front-end) it would be nice to describe
> the job with more detailed tags that the client could present
> both in the single job description (root-URI/jobs/jobid) and in
> the jobs list (root-URI/jobs).

you are free in the description of the job to put any metadata that  
you want within the jobinfo element - this is where you can add extra  
information that is not defined within UWS.
>
> An example of this is the name of the user (although there is an "owner"
> in the UML schema and "ownerId" in the JobSummary tag - it may be e.g.
> the full name of the user) or an additional description of a particular
> job (e.g. in computation of stellar models, the name of a star and
> an additional description like the method of computation and some remarks ...

no, the owner of the job MUST be the user identifier of the user that
created the job, as obtained via IVOA standard authentication
mechanisms - you should not use any other type of
>
> This is simply what the user enters into the web-based job creation form,
> and it might be nice to keep it for housekeeping purposes, e.g. on
> the list of completed jobs as well as in the detailed jobid info.
>
> So the question is - how to extend the tags to allow some additional
> information to be embedded with the job (e.g. make the jobinfo a complexType?)

you can put any xml that you want within the jobinfo element - it is
already defined to allow xs:any content - it could not be more extensible.
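
As a rough illustration of the point above, the sketch below builds a
job-info element carrying service-specific metadata. The element name
"jobInfo", the extension namespace http://example.org/stellar-models and
the child tags (userFullName, star, remarks) are assumptions made up for
this example; they are not defined by UWS.

```python
import xml.etree.ElementTree as ET

UWS_NS = "http://www.ivoa.net/xml/UWS/v1.0"
# Hypothetical namespace for service-specific metadata (not part of UWS).
EXT_NS = "http://example.org/stellar-models"


def build_job_info(full_name, star, remarks):
    """Return a job-info element holding arbitrary extension metadata."""
    # "jobInfo" is the assumed element name for the extension point.
    job_info = ET.Element(f"{{{UWS_NS}}}jobInfo")
    fields = (("userFullName", full_name), ("star", star), ("remarks", remarks))
    for tag, text in fields:
        child = ET.SubElement(job_info, f"{{{EXT_NS}}}{tag}")
        child.text = text
    return job_info


info = build_job_info("Jane Example", "Example Star", "first test run")
print(ET.tostring(info, encoding="unicode"))
```

The owner of the job is still the authenticated user identifier; the
full name here is only extra display metadata.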
>
> Then when the service returns the jobs list - the element "jobs" - it is
> now said to return "uws:ShortJobsDescription" but it is in fact only
> the "phase" plus JobIdentifier. Why not return the whole
> "JobSummary" and let the client extract the information
> necessary for displaying a nice job roster (e.g. using an XSLT style
> sheet)?

Mainly to limit the size of the data returned for the list of jobs
>
> And there still seems to be the issue of paging a long list of jobs...
>

it was decided to avoid the need for paging mechanisms to retain
simplicity - by having a relatively small amount of data for each
job in the job list (see last answer), there are of order
100 bytes per job in the list - assuming that ~1MByte is a reasonable
maximum download size for the job list, there can be
10,000 jobs in the list before the listing time starts to become
intolerable.
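
The estimate above is simple arithmetic; a minimal sketch, taking
"~1MByte" as 1,000,000 bytes (an assumption for round numbers):

```python
BYTES_PER_JOB = 100            # order-of-magnitude size of one job list entry
MAX_LISTING_BYTES = 1_000_000  # "~1MByte", taken here as 10**6 bytes

# Number of job entries that fit in the assumed maximum download size.
max_jobs = MAX_LISTING_BYTES // BYTES_PER_JOB
print(max_jobs)  # 10000
```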

> Here we see the solution as letting the client ask either for the list of
> active jobs (e.g. root-URI/jobs?phase=executing), for all jobs (the default,
> root-URI/jobs), or e.g. for those already completed etc ...

I think that this might be a good idea for a future revision of UWS.
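
Until something like that exists on the server side, a client can do the
selection itself on the full job list. The sketch below assumes a job
list in which each entry is a "jobref" element with an "id" attribute
and a "phase" child; that layout and the job ids are assumptions for
illustration only.

```python
import xml.etree.ElementTree as ET

UWS_NS = "http://www.ivoa.net/xml/UWS/v1.0"

# Made-up job list used only to exercise the function below.
SAMPLE_JOB_LIST = """<uws:jobs xmlns:uws="http://www.ivoa.net/xml/UWS/v1.0">
  <uws:jobref id="job1"><uws:phase>EXECUTING</uws:phase></uws:jobref>
  <uws:jobref id="job2"><uws:phase>COMPLETED</uws:phase></uws:jobref>
  <uws:jobref id="job3"><uws:phase>EXECUTING</uws:phase></uws:jobref>
</uws:jobs>"""


def jobs_in_phase(job_list_xml, phase):
    """Return the ids of all jobs in the list whose phase matches."""
    root = ET.fromstring(job_list_xml)
    wanted = phase.upper()
    matches = []
    for ref in root.findall(f"{{{UWS_NS}}}jobref"):
        # findtext returns None when the phase child is missing.
        found = (ref.findtext(f"{{{UWS_NS}}}phase") or "").strip().upper()
        if found == wanted:
            matches.append(ref.get("id"))
    return matches


executing = jobs_in_phase(SAMPLE_JOB_LIST, "executing")
print(executing)  # ['job1', 'job3']
```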

>
> Simply, the client may request such a selection for
> different purposes, and so the server should perform it and return what
> the client wants.
>
> We can foresee the usefulness of such an operation even without
> the need for a web client, e.g. for some administration application
> telling the UWS servers to list the jobs of a particular user etc ...
> In principle the job list is quite a complex object and so some
> fine-grained information may need to be extracted from it directly.
>
>
> Another issue is the meaning of "quote".
> Instead of using a crystal ball to estimate when the job is expected
> to finish, it would be better to have some idea of the allowed priority
> on different servers - like e.g. the nice value for different queues.
> So when choosing among servers I would use the one with the
> higher-priority queue.
>
> But again it may be the client's decision to modify this behaviour.
> E.g. the higher priority queues might impose a shorter Execution
> Duration.

It is true that it is difficult to establish an absolute comparative
meaning for quote in different circumstances, but if you implement
your different queues as a set of UWS based services that effectively
operate as different priority queues, as you suggest, you can indicate
the behaviour of each one by the relative values of the Quote and the
Execution Duration. If for a particular set of job parameters you
determine that a job will be costly you can set the quote to be
greater than the execution duration on the "queue" (= UWS endpoint)
for the "high priority" queues, which would indicate that the job
would not be run on that endpoint. The client might try to run the
job anyway, but in this case the server would simply fail the job
immediately.
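
A minimal sketch of how a client might use that comparison when choosing
among endpoints. For simplicity the quote is treated here as an
estimated run time in seconds so that it can be compared directly with
the execution duration; the Endpoint class, the queue names and the
numbers are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Endpoint:
    name: str
    execution_duration_s: int  # longest run time this endpoint allows


def would_run(endpoint, quoted_runtime_s):
    """A job is expected to run only if its quote fits within the limit."""
    return quoted_runtime_s <= endpoint.execution_duration_s


def pick_endpoint(endpoints, quoted_runtime_s):
    """Return the first endpoint, in the given order, that would run the job."""
    for endpoint in endpoints:
        if would_run(endpoint, quoted_runtime_s):
            return endpoint
    return None  # no endpoint would accept a job of this cost


# Endpoints listed from highest to lowest priority (made-up limits).
queues = [Endpoint("high-priority", 600), Endpoint("low-priority", 86400)]
chosen = pick_endpoint(queues, 3600)
print(chosen.name)  # low-priority
```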

I think that trying to introduce the concept of queue priorities into  
a later version of UWS should be considered, but not for UWS 1.0 as it  
would take too long to fully establish the semantics.
>
>
> And the concept of Destruction Time is unclear in the document as
> well.
>
> We think that this should be the time when the results will be
> removed from the storage space - but suppose you want to use the UWS
> server as an "external notebook" of your work - e.g. in the
> concept of PDA-supercomputing.
>
> So many of your experiments are stored on the server and you decide
> to rerun a particular one with different parameter sets but
> basically the same data sets (e.g. spectra, images ...). So it's up to you
> which experiment will be removed (if you go over quota you are not
> allowed to create new jobs).
>
> However after some warning time the jobs will be removed anyway  
> (maybe you should receive some warning by the scheduler first to  
> allow you to copy the data).
>
There is nothing stopping you from implementing your UWS based service  
in a way that allows rather long destruction times to allow for a form  
of "long term" storage of the results. You could even operate the  
quota system that you describe with current UWS - all that is missing  
from the current standard is some form of semantics for expressing in  
a uniform way why new jobs are not being allowed to be created so that  
the client understands that they must destroy some jobs to be able to  
run new ones. Again, I would not want to delay v1.0 to try to sort  
these issues out, but I think that it would be possible to perhaps add  
the concept of quotas to a future version of UWS.

>
> The last issue concerns the possibility of restarting jobs with the same
> datasets but different control parameters. So the client (in a web
> browser) might have a button for resubmitting the same computation
> with a different number of iterations (a change of some control
> parameters that may be re-edited in a browser).
>
> As the job is in fact described by its set of parameters it might
> be possible to use the whole "job" element to change particular
> parameters and resubmit

I am not sure that I see exactly how the UWS standard has any bearing  
on being able to perform an operation such as described above.
>
>
> We understand that the UWS is just describing the solution
> implemented on Astrogrid - but let's think about it as a universal
> pattern for future VO services, some of which might have
> complicated workflows that both other automatic services and
> users through web clients would like to control. Here the
> suggestions given above might make this interaction easier and
> more controllable.

Whilst it is true that Astrogrid services were a test bed for UWS
ideas, they have evolved precisely to try to make them more universal,
and simpler. There is no doubt that the design is biased towards being
driven by simple clients rather than web browsers directly - however,
I think that we have shown that there are techniques by which an HTML
interface can be created by the same machinery that creates the UWS
responses without compromising the standard UWS XML responses. Indeed
my interpretation of section 2.2.2 is that if the client explicitly
includes in a request an Accept header including "text/html" then the
UWS service may return pure HTML rather than XML, but the UWS service
must return XML or "text/plain" in the cases where HTML has not been
explicitly requested.
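
That interpretation can be sketched as a small server-side helper. This
is only an illustration: the function name is made up, "text/xml" is an
assumed choice for the XML media type, and q-values in the Accept header
are deliberately ignored.

```python
def choose_media_type(accept_header):
    """Return "text/html" only when the client explicitly lists it."""
    if accept_header:
        for item in accept_header.split(","):
            # Drop any parameters such as ";q=0.9" before comparing.
            media_type = item.split(";")[0].strip().lower()
            if media_type == "text/html":
                return "text/html"
    # HTML was not explicitly requested (wildcards do not count),
    # so fall back to the standard XML response.
    return "text/xml"


print(choose_media_type("text/html,application/xhtml+xml"))  # text/html
print(choose_media_type("*/*"))                              # text/xml
```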

I have said several times that I think that there are probably some  
useful concepts that could be added to a future version of the UWS  
standard (i.e. queues and quotas). I would encourage current
implementers to try creating services that do include these concepts  
(to prototype what is necessary) whilst at the same time conforming to  
UWS 1.0. I believe that it is possible to do this, with the only  
problem from the client's point of view being that sometimes it does  
not "understand" why job creation has been refused or that a short  
destruction time has been allocated.

Paul.