UWS 1.1 job list, sorting NULL values
Paul Harrison
paul.harrison at manchester.ac.uk
Mon Oct 19 11:17:34 CEST 2015
Dear GWS,
This is my interpretation of this matter:
* There is no creationTime property defined in UWS (so AFTER and LAST work only on startTime)
* startTime is the time at which a job is put into the EXECUTING phase, so it will be NULL before this (i.e. in the PENDING phase)
So my resolution would be that jobs with a NULL startTime do not appear in either the AFTER or LAST filtered lists, i.e. these lists contain only jobs that are not in the PENDING phase. If a client wants to list these jobs, it uses the PHASE=PENDING filter. (This is like Grégory’s (c) resolution below, but without introducing a creationTime.)
I think that this could be accommodated into the current UWS 1.1 RFC as a small clarification.
There might be a case for introducing a creationTime property, but I do not think that it would be that helpful (the most interesting jobs are those that have run at some point) and would add extra complexity to the filtering.
The real distinction between the AFTER and LAST filters was that LAST allows a fixed number of entries to be displayed, rather than in some sense being “more recent”.
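To make the proposal concrete, here is a minimal Python sketch of how a service might apply it. The Job record and the function names are hypothetical, None stands in for a NULL startTime, and the ascending output order is an assumption rather than something settled in this thread.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class Job:
    # Hypothetical minimal job record; a real UWS job has more properties.
    job_id: str
    phase: str
    start_time: Optional[datetime]  # None stands in for a NULL startTime


def started(jobs: List[Job]) -> List[Job]:
    """Drop every job whose startTime is NULL."""
    return [j for j in jobs if j.start_time is not None]


def filter_after(jobs: List[Job], after: datetime) -> List[Job]:
    """AFTER: jobs started later than `after`, sorted by ascending startTime."""
    hits = [j for j in started(jobs) if j.start_time > after]
    return sorted(hits, key=lambda j: j.start_time)


def filter_last(jobs: List[Job], last: int) -> List[Job]:
    """LAST: at most `last` jobs with the most recent startTime, ascending."""
    if last <= 0:
        return []
    ordered = sorted(started(jobs), key=lambda j: j.start_time)
    return ordered[-last:]


def filter_phase(jobs: List[Job], phase: str) -> List[Job]:
    """PHASE: the separate route for listing jobs that have not started."""
    return [j for j in jobs if j.phase == phase]
```

With this reading, a PENDING job never shows up in filter_after or filter_last and is only reachable through filter_phase(jobs, "PENDING"). LAST may therefore return fewer entries than requested.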
Regards,
Paul.
On 2015-10-16, at 15:19, Grégory Mantelet <gmantele at ari.uni-heidelberg.de> wrote:
Dear Grid,
As Kristin knows, I have not yet implemented the AFTER and LAST filters for job lists in the UWS-Library (indeed, they were not yet in the working draft version I based my work on for the Sesto IVOA meeting). So her question seems really pertinent to me, and I would be particularly interested in the answers of Grid members.
Re-Reading the paragraph highlighted by Kristin, I understand these 2 parameters as follows:
- AFTER filters jobs on startTime and sorts them by startTime
- LAST filters jobs on creationTime ("most recent jobs") and sorts them by startTime
Honestly, I only discovered these parameters recently, and to me, instinctively, AFTER and LAST are about the creation time. But it seems AFTER is designed to retrieve jobs that have already started (they may be finished, but they may also still be running), and LAST to retrieve the most recent jobs (whatever their phase). Why not!
However, I do not think the behavior of LAST is really coherent: we expect to get the last created jobs - filter on creationTime - but we get them sorted by startTime. For me it may be a bit confusing: the constraint and the sort for this parameter should be done on the same attribute: createTime (or startTime if it was the original intent).
Anyway, to go back to the startTime=NULL problem mentioned by Kristin, I would think that AFTER returns only already started jobs. So there is no issue for AFTER: only jobs with a startTime different from NULL would be returned.
However, for LAST there is an issue. If we suppose we keep the behavior written in the Proposed Recommendation, both strategies (putting the pending jobs at the end or at the beginning of the list) make equal sense:
a- at the end because the jobs will possibly be executed in the future (the beginning is the past, the end is the future, in the case of an ascending sorted list),
b- or at the beginning because the jobs are not yet executed and the user should be reminded of them first so that they can be executed (indeed, that was probably the reason for this LAST filter).
And again, it is also possible to do the same as for the AFTER filter:
c- not displaying the jobs with startTime=NULL (only the started jobs are visible).
But in this case, why put a filter on creation time and then filter implicitly on startTime afterwards? (In addition, we may get fewer jobs than the requested number.)
Personally I would change the behavior of LAST so that it filters AND sorts on the same attribute: createTime. This is a more intuitive understanding of a LAST filter on a list. For that, as Kristin mentioned, an attribute "createTime" (or "creationTime" as you prefer) should be visible to users.
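The three options can be compared with a small Python sketch. It only illustrates the ordering question: the function names are hypothetical and None stands in for a NULL startTime.

```python
from datetime import datetime
from typing import List, Optional

StartTime = Optional[datetime]


def sort_start_times(times: List[StartTime], nulls_first: bool) -> List[StartTime]:
    """Ascending sort with NULLs grouped at the beginning (b) or the end (a)."""
    def key(t: StartTime):
        is_null = t is None
        # The rank decides on which side of the real timestamps the NULLs land.
        rank = (0 if is_null else 1) if nulls_first else (1 if is_null else 0)
        # datetime.min is only a placeholder so that the tuples stay comparable.
        return (rank, t if t is not None else datetime.min)
    return sorted(times, key=key)


def drop_nulls(times: List[StartTime]) -> List[datetime]:
    """Option (c): leave out jobs without a startTime, then sort ascending."""
    return sorted(t for t in times if t is not None)
```

Calling sort_start_times(times, nulls_first=False) gives strategy (a), nulls_first=True gives strategy (b), and drop_nulls(times) gives strategy (c), which can return fewer entries than were asked for.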
Additional question: I consider jobs in the ERROR or ABORTED phase as finished. In this sense, the startTime and endTime should be set to something other than NULL (possibly to the same date-time if the job failed or was cancelled either immediately or very quickly). So, for me, only jobs in the PENDING phase have startTime and endTime set to NULL. Is that wrong?
Maybe a clarification about when these two attributes should be set could also be added to UWS 1.1... but maybe it is too late for that...
Cheers,
Grégory
On 10/13/2015 12:31 PM, Kristin Riebe wrote:
Dear Grid members,
sorry to bother the list again with issues on UWS 1.1, but I think this
may be of interest for a number of people:
How are NULL values in datetimes to be treated when sorting?
(Is there an agreement on this within the IVOA?)
UWS 1.1 introduces filtering for the most recent jobs (LAST-keyword) or
for jobs started after a certain date-time (AFTER keyword). This refers
to the startTime for each job, see 2.2.2.1 Job list in
http://www.ivoa.net/documents/UWS/20150907/PR-UWS-1.1-20150907.pdf (page
13).
For pending jobs, in our UWS service at AIP, we set the startTime to
NULL or Nil in xml:
<uws:startTime xsi:nil="true"/>
(And some of our jobs in ERROR/ABORTED phase can also have no startTime
since they never started.)
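For illustration, a client could map such a nil element to a missing value along these lines. This is only a sketch: read_start_time is a hypothetical helper, the UWS namespace URI is assumed to be the usual v1.0 one, and timestamps are assumed to be ISO 8601 with an optional trailing "Z".

```python
import xml.etree.ElementTree as ET
from datetime import datetime
from typing import Optional

UWS_NS = "http://www.ivoa.net/xml/UWS/v1.0"  # assumed UWS namespace URI
XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"


def read_start_time(job_xml: str) -> Optional[datetime]:
    """Return the startTime of a job document, or None if it is nil or absent."""
    root = ET.fromstring(job_xml)
    elem = root.find(f"{{{UWS_NS}}}startTime")
    if elem is None:
        return None
    # xsi:nil="true" marks the element as explicitly empty.
    if elem.get(f"{{{XSI_NS}}}nil") in ("true", "1"):
        return None
    text = (elem.text or "").strip()
    if not text:
        return None
    # Drop a trailing "Z" so that older Python versions can parse the value.
    if text.endswith("Z"):
        text = text[:-1]
    return datetime.fromisoformat(text)
```

A job that never started then yields None, which the sorting code has to handle explicitly.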
Now, how are these to be treated when sorting? Should they be added at
the beginning or the end when sorting by increasing startTime? Or would
it make more sense to introduce a "createTime" and sort by this?
Has anyone finished including AFTER and LAST in their UWS 1.1
implementations yet? How are you handling this?
Cheers,
Kristin
Dr. Paul Harrison
JBO, Manchester University
http://www.manchester.ac.uk/jodrellbank