Question about VOSpace and cold storage

Dave Morris dave.morris at metagrid.co.uk
Wed Dec 16 02:18:45 CET 2020


Hi Sonia, Brian

The VOSpace specification was designed to act as an abstract interface 
layer, hiding the details of the underlying implementation from the 
client.

A useful way to look at questions like this is to ask how an external 
user would expect it to work, and what they would want to know.

My own guess is that given a status of [PENDING|RUNNING|COMPLETED|ERROR] 
for a transfer job, most people would expect COMPLETED to mean all parts 
of the job have completed and all of the bytes were transferred 
successfully.

If we set the status to COMPLETED as soon as the negotiation is done, 
then there is no way for the service to change that to ERROR later on if 
something goes wrong with the byte transfer (state change rules for UWS 
jobs would exclude a transition from COMPLETED to ERROR).
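
As a rough illustration of that constraint, here is a minimal Python 
sketch of a job whose terminal phases have no outgoing transitions. The 
transition table is a simplified assumption for illustration, not the 
full UWS state diagram, and TransferJob and try_transition are 
hypothetical names rather than part of any VOSpace implementation.

```python
from enum import Enum


class Phase(Enum):
    PENDING = "PENDING"
    QUEUED = "QUEUED"
    EXECUTING = "EXECUTING"
    COMPLETED = "COMPLETED"
    ERROR = "ERROR"
    ABORTED = "ABORTED"


# Simplified transition table (an assumption, not the full UWS diagram).
# The terminal phases map to an empty set: nothing may follow them.
ALLOWED = {
    Phase.PENDING: {Phase.QUEUED, Phase.ABORTED, Phase.ERROR},
    Phase.QUEUED: {Phase.EXECUTING, Phase.ABORTED, Phase.ERROR},
    Phase.EXECUTING: {Phase.COMPLETED, Phase.ERROR, Phase.ABORTED},
    Phase.COMPLETED: set(),
    Phase.ERROR: set(),
    Phase.ABORTED: set(),
}


class TransferJob:
    """Hypothetical transfer job that enforces the table above."""

    def __init__(self):
        self.phase = Phase.PENDING

    def set_phase(self, new_phase):
        if new_phase not in ALLOWED[self.phase]:
            raise ValueError(
                f"illegal transition {self.phase.name} -> {new_phase.name}"
            )
        self.phase = new_phase


def try_transition(job, new_phase):
    """Return True if the phase change was accepted, False otherwise."""
    try:
        job.set_phase(new_phase)
        return True
    except ValueError:
        return False
```

If the service marks the job COMPLETED when negotiation finishes, a 
later try_transition(job, Phase.ERROR) is rejected, so a failed byte 
transfer can never be reported through the job.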

Checking the job status is the only way that a client who is not 
directly involved in the transfer can tell if the bytes have been 
transferred successfully or not. Knowing that the negotiation has been 
completed but not being able to tell anything about the actual byte 
transfer is less useful to a client.
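
From the point of view of such a third-party client, the check might 
look something like the sketch below. It assumes the client already has 
some way to read the job phase as a string; get_phase is a hypothetical 
callable standing in for that, and wait_for_transfer is an illustrative 
name, not part of any VOSpace client library.

```python
import time

# Phases after which the job will not change again (simplified assumption).
TERMINAL = {"COMPLETED", "ERROR", "ABORTED"}


def wait_for_transfer(get_phase, poll_interval=1.0, max_polls=60):
    """Poll the job phase until it is terminal.

    get_phase is a hypothetical callable returning the current phase
    as a string. Returns True only if the job ended in COMPLETED.
    """
    for _ in range(max_polls):
        phase = get_phase()
        if phase in TERMINAL:
            return phase == "COMPLETED"
        time.sleep(poll_interval)
    raise TimeoutError("transfer job did not reach a terminal phase")
```

The True result is only meaningful if COMPLETED is set after the bytes 
have actually been transferred, which is the point of the argument 
above.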

If you clicked a [download link] on a website and your browser 
immediately set the status to [done], you would be impressed at how fast 
it seemed to be, but you would be disappointed if you looked in your 
downloads directory and the data wasn't there.

At the moment I'm not convinced we should change the way that this 
works.

Hope this helps,
-- Dave

--------
Dave Morris
Research Software Engineer
Wide Field Astronomy Unit
Institute for Astronomy
University of Edinburgh
--------

On 2020-12-15 08:30, Zorba, Sonia wrote:
> 
>> Regarding the 'phase' in the asynchronous pullFromVoSpace operation:  the
>> specification is probably not clear enough about how the jobs phases
>> reflect transfer negotiation and the resulting byte transfer.  With our
>> 'vault' implementation (the one based on distributed object storage) we
>> took a different approach:  our jobs are in the EXECUTING phase during byte
>> transfer, and then are set to COMPLETE or ERROR when that's done.  However,
>> this is problematic as the service handling the byte transfer must make a
>> callback to vospace to set the final phase.  This introduced an undesirable
>> coupling between vospace and the service handling the byte transfer.  Also,
>> since that callback can fail, we need to monitor and correct those failures
>> in an out-of-band process.  If we get a chance to refactor this we will
>> probably choose to set the phase to COMPLETE or ERROR immediately after
>> transfer negotiation is complete, and consider the ensuing byte transfer an
>> operation that is outside the scope of the VOSpace protocol.
>> 
> 
> In the recommendation document, the first pullFromVoSpace example ends with
> "On successful file transfer completion the job will be COMPLETED", so it
> seems that your implementation is following the recommendation. Do you
> think that this behaviour could be redefined in newer versions of the
> document?
> 

