Proposals for VOSpace (paginated response)
Dave Morris
dave.morris at metagrid.co.uk
Sun Jun 20 05:30:29 CEST 2021
Apologies, I forgot this was in the specification.
The text describing how this works is buried in the Response section of
the getNode method, which makes it easy to miss.
As a start, I've added an issue to revise the text to make this clearer.
https://github.com/ivoa-std/VOSpace/issues/4
This wouldn't change the technical definition, just promote the
description of how pagination works into a separate sub-section clearly
labelled 'pagination'.
If we want to take it further, possibly by adding the exception that Pat
proposes below, then that would be a new issue.
-- Dave
On 2021-06-18 22:00, Patrick Dowler wrote:
> The current spec does support pagination when listing child nodes of a
> container (*uri* and *limit* params), but implementation is complex. We
> have two VOSpace implementations that illustrate this quite well.
>
> Impl 1: relational database + object store
> Here, it is easy enough to implement pagination because it is just a couple
> extra things injected into the SQL query to the DB. The server picks the
> default order, but we also added support for a custom optional param so the
> client could control the order: name, lastModified date, or contentLength.
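To make the database-backed case concrete, here is a minimal sketch of the kind of query Pat describes, using sqlite3. The `node` table, its columns, and treating the start name as inclusive are all assumptions for illustration, not taken from either implementation or from the spec text.

```python
import sqlite3

def make_demo_db():
    # Hypothetical schema: one row per child node, keyed by parent and name.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE node (parent TEXT, name TEXT, PRIMARY KEY (parent, name))"
    )
    conn.executemany(
        "INSERT INTO node VALUES (?, ?)",
        [("/data", f"file{i:03d}") for i in range(10)],
    )
    return conn

def list_children(conn, parent, start_name=None, limit=None):
    """List child names in name order, optionally from start_name (inclusive)."""
    sql = "SELECT name FROM node WHERE parent = ?"
    params = [parent]
    if start_name is not None:
        # Stand-in for the *uri* param: resume from a known child.
        sql += " AND name >= ?"
        params.append(start_name)
    sql += " ORDER BY name"
    if limit is not None:
        # Stand-in for the *limit* param.
        sql += " LIMIT ?"
        params.append(limit)
    return [row[0] for row in conn.execute(sql, params)]
```

Because the query resumes from a key rather than a row offset, the database can use the index to find the start of each page, so later pages should not cost more than earlier ones.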
>
> Impl 2: only a posix file system
> Here, it is really hard to implement pagination because the posix directory
> listing APIs don't have any concept of order (iirc, I determined it lists
> in inode order so you could get some strangeness if an inode is re-used --
> rename? -- during listing). It also looks more or less impossible to scale
> paginated listing with many children: with each request, you have to start
> at the beginning of the list and skip over previously seen entries so it
> gets slower and slower with each "page" of children. This service cannot
> support the custom sorting on the server side either.
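A rough sketch of why the file system case scales badly. This is not Pat's implementation; `list_page` is a hypothetical helper, and sorting by name on every request is just one assumed way to get a stable order out of an unordered directory listing.

```python
import os

def list_page(directory, start_name=None, limit=None):
    """Return up to `limit` child names, starting at `start_name` (inclusive)."""
    # The directory API gives no ordering, so every request has to read
    # the whole directory and sort it before a page can be cut out.
    with os.scandir(directory) as entries:
        names = sorted(entry.name for entry in entries)
    if start_name is not None:
        # Skip everything the client has already seen.
        names = [name for name in names if name >= start_name]
    return names if limit is None else names[:limit]
```

Each page costs a full directory read regardless of how small `limit` is, so listing a container with n children in pages of size k reads roughly n * (n / k) entries in total.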
>
> So, I would also like to improve the spec here but would like to see
> something where a service that cannot support pagination (just stream
> output) can be effectively used: clients will need to be able to figure out
> which to expect or at least if they got all the rows or not. That really
> means support for the *uri* parameter would be optional and maybe just
> responding with an error with a specified "fault" term would suffice. The
> *limit* param is easy enough to implement (like MAXREC in DAL standards) in
> both cases.
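One way a client might cope with both kinds of service, sketched under several assumptions: `fetch(start, limit)` is a hypothetical callable wrapping the getNode request, `PaginationNotSupported` stands in for whatever "fault" term the spec might define, and the start value is treated as inclusive.

```python
class PaginationNotSupported(Exception):
    """Hypothetical stand-in for a specified 'fault' response."""

def list_all(fetch, page_size):
    """Collect all child names, paging if the service allows it."""
    names, start = [], None
    try:
        while True:
            # Ask for one extra row: if it comes back, there is more to
            # fetch, and the extra row is where the next page starts.
            page = fetch(start, page_size + 1)
            names.extend(page[:page_size])
            if len(page) <= page_size:
                return names
            start = page[page_size]
    except PaginationNotSupported:
        # The service can only stream: fall back to one unpaged request.
        return fetch(None, None)
```

The extra row is only a client-side trick for telling a full listing from a truncated one; an explicit overflow indicator in the response would do the same job.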
>
> --
> Patrick Dowler
> Canadian Astronomy Data Centre
> Victoria, BC, Canada
>
>
> On Wed, 16 Jun 2021 at 22:31, Dave Morris <dave.morris at metagrid.co.uk>
> wrote:
>
>> Hi Sonia,
>>
>> You raised several good suggestions in your email. To avoid confusion
>> I'll reply to each one in a separate email thread.
>>
>> On 2021-06-11 13:31, Zorba, Sonia wrote:
>> > 7. On the getNode endpoint add parameters to perform paginated
>> > requests.
>> > Useful for nodes having too many children.
>>
>> Paginated response sounds simple, but it turns out to be complicated to
>> implement.
>>
>> We would need to define a design that does not put a heavy load on the
>> server, can reliably handle the insertion or deletion of nodes between
>> requests without producing duplicate rows in the results, and does not
>> require the use of a relational database to implement it.
>>
>> As far as I know, everyone who has looked at this has decided that it is
>> easier to do it on the client side than on the server side. Perhaps
>> someone would like to look at this again and propose a definition for
>> how a paginated response could work?
>>
>> For me, I see pagination as a client side display function rather than a
>> server side data access function. Is there a strong use case for doing
>> this on the server side?
>>
>> Bear in mind that even if we did define a new property for pagination,
>> existing version 2.1 services would not understand it. So unless we make
>> the new property mandatory, everyone adopts the new standard, and we
>> deprecate the version 2.1 standard, clients would still have to cope
>> with large responses from version 2.1 services.
>>
>> Cheers
>> -- Dave
>>
>> --------
>> Dave Morris
>> Research Software Engineer
>> Wide Field Astronomy Unit
>> Institute for Astronomy
>> University of Edinburgh
>> --------
>>
>>