<div dir="ltr"><div class="gmail_default" style="font-size:small"><br></div><div class="gmail_default" style="font-size:small">The current spec does support pagination when listing child nodes of a container (<b>uri</b> and <b>limit</b> params), but implementing it is complex. We have two VOSpace implementations that illustrate this quite well.</div><div class="gmail_default" style="font-size:small"><br></div><div class="gmail_default" style="font-size:small">Impl 1: relational database + object store</div><div class="gmail_default" style="font-size:small">Here, it is easy enough to implement pagination because it is just a couple of extra things injected into the SQL query sent to the DB. The server picks the default order, but we also added support for a custom optional param so the client can control the order: name, lastModified date, or contentLength.</div><div class="gmail_default" style="font-size:small"><br></div><div class="gmail_default" style="font-size:small">Impl 2: only a POSIX file system</div><div class="gmail_default" style="font-size:small">Here, it is really hard to implement pagination because the POSIX directory listing APIs don't have any concept of order (iirc, I determined that entries come back in inode order, so you could get some strangeness if an inode is re-used -- rename? -- during listing). It also looks more or less impossible to scale paginated listing to containers with many children: with each request, you have to start at the beginning of the list and skip over previously seen entries, so it gets slower and slower with each "page" of children. 
This service cannot support the custom sorting on the server side either.<br></div><div class="gmail_default" style="font-size:small"><br></div><div class="gmail_default" style="font-size:small">So, I would also like to improve the spec here, but I would like to see something where a service that cannot support pagination (and can only stream its output) can still be used effectively: clients will need to be able to figure out which behaviour to expect, or at least whether they got all the rows. That really means support for the <b>uri</b> parameter would be optional, and maybe just responding with an error carrying a specified "fault" term would suffice. The <b>limit</b> param is easy enough to implement (like MAXREC in DAL standards) in both cases. <br></div><div class="gmail_default" style="font-size:small"><br clear="all"></div><div><div dir="ltr" class="gmail_signature" data-smartmail="gmail_signature"><div dir="ltr"><div><div dir="ltr"><div><div>--<br></div><div>Patrick Dowler<br></div>Canadian Astronomy Data Centre<br></div>Victoria, BC, Canada<br></div></div></div></div></div><br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Wed, 16 Jun 2021 at 22:31, Dave Morris &lt;<a href="mailto:dave.morris@metagrid.co.uk">dave.morris@metagrid.co.uk</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Hi Sonia,<br>
<br>
You raised several good suggestions in your email. To avoid confusion <br>
I'll reply to each one in a separate email thread.<br>
<br>
On 2021-06-11 13:31, Zorba, Sonia wrote:<br>
> 7. On the getNode endpoint add parameters to perform paginated <br>
> requests.<br>
> Useful for nodes having too many children.<br>
<br>
Paginated response sounds simple, but it turns out to be complicated to <br>
implement.<br>
<br>
We would need to define a design that does not put a heavy load on the <br>
server, can reliably handle the insertion or deletion of nodes between <br>
requests without producing duplicate rows in the results, and does not <br>
require the use of a relational database to implement it.<br>
<br>
As far as I know, everyone who has looked at this has decided that it is <br>
easier to do it on the client side than on the server side. Perhaps <br>
someone would like to look at this again and propose a definition for <br>
how a paginated response could work?<br>
<br>
For me, I see pagination as a client-side display function rather than a <br>
server-side data access function. Is there a strong use case for doing <br>
this on the server side?<br>
<br>
Bear in mind that even if we did define a new property for pagination, <br>
existing version 2.1 services would not understand it. So unless we make <br>
the new property mandatory, everyone adopts the new standard, and we <br>
deprecate the version 2.1 standard, clients would still have to cope <br>
with large responses from version 2.1 services.<br>
<br>
Cheers<br>
-- Dave<br>
<br>
--------<br>
Dave Morris<br>
Research Software Engineer<br>
Wide Field Astronomy Unit<br>
Institute for Astronomy<br>
University of Edinburgh<br>
--------<br>
<br>
</blockquote></div>