UWS 1.0 suggestions
Paul Harrison
paul.harrison at manchester.ac.uk
Sun Nov 22 23:59:50 PST 2009
On 2009-11-21, at 19:32, Petr Skoda wrote:
>
> Hi all,
>
> I hope that the UWS RFC will be extended so that there is still time
> to clear up some of the not-so-obvious issues. During the
> implementation phase we may still get some ideas and improvements
> worthy of standardization.
>
> Having looked at http://www.ivoa.net/Documents/UWS/20090909/PR-UWS-1.0-20090909.html
>
> I have found errors in section 2.2.2.2 and appendix B;
> in both cases there is a typo, "xlmns" instead of "xmlns".
> The xsd at http://www.ivoa.net/xml/UWS/v1.0 is syntactically OK.
You are right, thanks.
>
> Now about some suggestions:
>
> We are trying to use the UWS pattern to let people run specific
> computations on their datasets, driven by parameter files.
> So most of the extensions given below are specific cases from our
> experience. We wish to have the WWW browser as the only user
> interface to such a system (uploading jobs, running and
> stopping them, reading results, rerunning with different parameters,
> deleting, zipping, downloading the results, sending mail with the
> results after disconnecting from the service entirely). We assume
> deployment in a distributed environment, using remote
> queue submission (e.g. grid engines like qsub...).
>
>
> To make interaction between a UWS service and a web browser
> (acting as a client front-end) comfortable, it would be nice to
> describe the job with more detailed tags that the client could
> present both in the single job description (root-URI/jobs/jobid) and
> in the jobs list (root-URI/jobs).
In the description of the job you are free to put any metadata that
you want within the jobinfo element - this is where you can add extra
information that is not defined within UWS.
>
> An example of this is the name of the user (although there is an
> "owner" in the UML schema and "ownerId" in the JobSummary tag, it may
> be e.g. the full name of the user) or an additional description of a
> particular job (e.g. in the computation of stellar models, the name
> of a star and an additional description such as the method of
> computation and some remarks ...).
No, the owner of the job MUST be the user identifier of the user that
created the job as obtained via IVOA standard authentication
mechanisms - you should not use any other type of
>
> The user simply enters this into the web-based job creation form,
> and it would be nice to keep it for housekeeping purposes, e.g. on
> the list of completed jobs as well as in the detailed jobid info.
>
> So the question is how to extend the tags to allow some additional
> information to be embedded with the job (e.g. make the jobinfo a
> complexType?)
You can put any XML that you want within the jobinfo element - it is
already defined to allow xs:any content - it could not be more
extensible.
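As a minimal sketch of what such an extension could look like, the
following builds a job document whose jobInfo element carries extra
metadata. The extension namespace and the element names userFullName
and description are hypothetical and chosen only for illustration;
only the UWS namespace URI comes from the discussion above, and the
element spelling "jobInfo" is assumed to match the schema.

```python
import xml.etree.ElementTree as ET

UWS_NS = "http://www.ivoa.net/xml/UWS/v1.0"
# Hypothetical namespace for service-specific metadata (not defined by UWS).
EXT_NS = "http://example.org/stellar-models"


def add_job_info(job, full_name, description):
    """Attach service-specific metadata inside the job's jobInfo element."""
    job_info = ET.SubElement(job, f"{{{UWS_NS}}}jobInfo")
    # Element names below are illustrative assumptions, not UWS vocabulary.
    ET.SubElement(job_info, f"{{{EXT_NS}}}userFullName").text = full_name
    ET.SubElement(job_info, f"{{{EXT_NS}}}description").text = description
    return job_info


job = ET.Element(f"{{{UWS_NS}}}job")
add_job_info(job, "Example User", "stellar model, 50 iterations")
xml_text = ET.tostring(job, encoding="unicode")
```

A browser front-end could then read these extra elements from the
job description without any change to the UWS-defined elements.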
>
> Then when the service returns the jobs list (the element "jobs"), it
> is now said to return "uws:ShortJobsDescription", but this is in fact
> only the "phase" plus the JobIdentifier. Why not return the whole
> "JobSummary" and let the client extract the information
> necessary for displaying a nice job roster (e.g. using an XSLT style
> sheet)?
Mainly to limit the size of the data returned for the list of jobs.
>
> And there still seems to be the issue of paging long lists of jobs...
>
It was decided to avoid the need for paging mechanisms to retain
simplicity. Because there is only a relatively small amount of data
for each job in the job list (see last answer), there is of order
100 bytes per job in the list. Assuming that ~1MByte is a reasonable
maximum download size for the job list, there can be about
10,000 jobs in the list before the listing time starts to become
intolerable.
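The estimate can be written out as a short calculation. The two
input figures are the rough ones quoted above (of order 100 bytes per
job, ~1MByte maximum listing); the function name is just for
illustration.

```python
def max_jobs_in_listing(max_listing_bytes=1_000_000, bytes_per_job=100):
    """Rough upper bound on jobs in an unpaged list of a given size."""
    return max_listing_bytes // bytes_per_job


max_jobs = max_jobs_in_listing()
```

With the default figures this gives 10,000 jobs; a service returning
more data per job would reach the same listing size with
proportionally fewer jobs.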
> Here we see a solution: let the client ask either for the list of
> active jobs (e.g. root-URI/jobs?phase=executing), or for all jobs
> (the default, root-URI/jobs), or e.g. for those already completed
> etc ...
I think that this might be a good idea for a future revision of UWS.
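A minimal sketch of how a server might apply such a filter, assuming
a "phase" query parameter as suggested in the quoted text. This
parameter is not part of UWS 1.0, and the function and the in-memory
job list are hypothetical.

```python
def filter_jobs(jobs, phase=None):
    """Return (job_id, phase) pairs, optionally restricted to one phase.

    `jobs` is a list of (job_id, phase) tuples with upper-case phase
    names; `phase` is the raw value of the hypothetical query parameter.
    """
    if phase is None:
        # No filter requested: behave like the plain root-URI/jobs listing.
        return list(jobs)
    wanted = phase.upper()
    return [(job_id, p) for job_id, p in jobs if p == wanted]


jobs = [("job1", "EXECUTING"), ("job2", "COMPLETED")]
active = filter_jobs(jobs, "executing")
```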
>
> The client may simply need to make the selection for different
> purposes, so the server should do it and return what the client
> wants.
>
> We can foresee the usefulness of such an operation even without
> the need for a web client, e.g. for some administration application
> telling the UWS servers to list the jobs of a particular user etc ...
> In principle the job list is quite a complex object, and so some
> fine-grained information may need to be extracted from it directly.
>
>
> Another issue is the meaning of "quote".
> Instead of using a crystal ball to estimate when the job is expected
> to finish, it would be better to have some idea of the allowed
> priority on different servers, e.g. the nice value for different
> queues. So when choosing among servers I would use the one with the
> higher priority queue.
>
> But again it may be the client's decision to modify this behaviour.
> E.g. the higher priority queues might impose a shorter Execution
> Duration.
It is true that it is difficult to establish an absolute comparative
meaning for quote in different circumstances, but if you implement
your different queues as a set of UWS based services that effectively
operate different priority queues as you suggest, you can indicate
the behaviour of each one by the relative values of the Quote and the
Execution Duration. If for a particular set of job parameters you
determine that a job will be costly, you can set the quote to be
greater than the execution duration on the "queue" (= UWS endpoint)
for the "high priority" queues, which would indicate that the job
would not be run on that endpoint. The client might try to run the
job anyway, but in this case the server would simply fail the job
immediately.
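A minimal sketch of how a client might use this convention to pick
an endpoint. It assumes, for simplicity, that both the quoted run
time and the execution duration are expressed in seconds; the
Endpoint class, the endpoint names, and the numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Endpoint:
    name: str
    quote: int               # estimated run time for this job, in seconds
    execution_duration: int  # maximum allowed run time, in seconds


def runnable_endpoints(endpoints):
    """Names of endpoints whose quote fits within the execution duration."""
    return [e.name for e in endpoints if e.quote <= e.execution_duration]


endpoints = [
    Endpoint("high-priority", quote=3600, execution_duration=600),
    Endpoint("low-priority", quote=3600, execution_duration=86400),
]
candidates = runnable_endpoints(endpoints)
```

Here the costly job exceeds the limit of the hypothetical
"high-priority" endpoint, so only "low-priority" remains as a
candidate.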
I think that trying to introduce the concept of queue priorities into
a later version of UWS should be considered, but not for UWS 1.0 as it
would take too long to fully establish the semantics.
>
>
> And the concept of Destruction Time is unclear in the document as
> well.
>
> We think that this should be the time when the results will be
> removed from the storage space - but suppose you want to use the UWS
> server as an "external notebook" of your work - e.g. in the
> concept of PDA-supercomputing.
>
> So many of your experiments are stored on the server and you decide
> to rerun a particular one with different parameter sets but
> basically the same data sets (e.g. spectra, images ...). So it is up
> to you which experiment will be removed (if you go over quota you are
> not allowed to create new jobs).
>
> However after some warning time the jobs will be removed anyway
> (maybe you should receive some warning by the scheduler first to
> allow you to copy the data).
>
There is nothing stopping you from implementing your UWS based service
in a way that allows rather long destruction times to allow for a form
of "long term" storage of the results. You could even operate the
quota system that you describe with current UWS - all that is missing
from the current standard is some form of semantics for expressing in
a uniform way why new jobs are not being allowed to be created so that
the client understands that they must destroy some jobs to be able to
run new ones. Again, I would not want to delay v1.0 to try to sort
these issues out, but I think that it would be possible to perhaps add
the concept of quotas to a future version of UWS.
>
> The last issue concerns the possibility of restarting jobs with the
> same datasets but different control parameters. So the client (in a
> web browser) might have a button for resubmitting the same
> computation with a different number of iterations (a change of some
> control parameters that may be re-edited in a browser).
>
> As the job is in fact described by its set of parameters, it might
> be possible to use the whole "job" element to change particular
> parameters and resubmit.
I am not sure that I see exactly how the UWS standard has any bearing
on being able to perform an operation such as described above.
>
>
> We understand that UWS is just describing the solution
> implemented on Astrogrid - but let's think about it as a universal
> pattern for future VO services, some of which might have
> complicated workflows that both other automatic services and
> users through web clients would like to control. Here the
> suggestions given above might make this interaction easier and
> more controllable.
Whilst it is true that Astrogrid services were a test bed for UWS
ideas, they have evolved precisely to try to make them more universal,
and simpler. There is no doubt that the design is biased towards being
driven by simple clients rather than web browsers directly - however,
I think that we have shown that there are techniques by which an HTML
interface can be created by the same machinery that creates the UWS
responses without compromising the standard UWS XML responses. Indeed
my interpretation of section 2.2.2 is that if the client explicitly
includes in a request an Accept header including "text/html" then the
UWS service may return pure HTML rather than XML, but the UWS service
must return XML or "text/plain" in the cases where HTML has not been
explicitly requested.
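A minimal sketch of that interpretation, assuming the service sees
the raw Accept header as a string. The function name is hypothetical,
"text/xml" is assumed as the XML media type, and q-values are
deliberately ignored to keep the example short.

```python
def choose_content_type(accept_header):
    """Return "text/html" only when the client explicitly lists it."""
    if accept_header:
        for item in accept_header.split(","):
            # Drop any parameters such as ";q=0.9" and normalise case.
            media_type = item.split(";")[0].strip().lower()
            if media_type == "text/html":
                return "text/html"
    # Wildcards such as "*/*" do not count as an explicit request for HTML.
    return "text/xml"


browser_choice = choose_content_type("text/html,application/xhtml+xml;q=0.9")
client_choice = choose_content_type("*/*")
```

Under this rule a browser that lists "text/html" gets HTML, while a
simple client sending "*/*" or no Accept header at all still gets the
standard XML response.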
I have said several times that I think that there are probably some
useful concepts that could be added to a future version of the UWS
standard (i.e. queues and quotas). I would encourage current
implementers to try creating services that do include these concepts
(to prototype what is necessary) whilst at the same time conforming to
UWS 1.0. I believe that it is possible to do this, with the only
problem from the client's point of view being that sometimes it does
not "understand" why job creation has been refused or that a short
destruction time has been allocated.
Paul.