Persistent TAP uploads

Markus Demleitner msdemlei at ari.uni-heidelberg.de
Mon Oct 14 16:59:04 CEST 2024


Dear Colleagues,

I have lobbied for persistent uploads in TAP for a long while now --
i.e., you upload a table as with UPLOAD, but the table remains
server-side after the request, which it doesn't for DALI UPLOADs --
but I have always procrastinated on it because it *somehow* needs
federated auth to be any fun.

Well... by now I figured *something* is better than nothing at all,
and so I went ahead and put a (functional) sketch of persistent
uploads into DaCHS.  In that scheme, somewhat inspired by Pat's
2018 youcat talk
(https://wiki.ivoa.net/internal/IVOA/InterOpNov2018DAL/tap-youcat.pdf;
but that's more a publication interface than the user facility I have
in mind), there is a sibling endpoint to /capabilities, /tables,
/sync, and /async, and you can PUT or DELETE tables which then appear
in or disappear from a magic schema TAP_USER; GET requests there
return VOSI-style table metadata, and the proposal would be that both
/tables and TAP_SCHEMA are unaffected by user uploads.  I think I
have good reasons for that, such as: Clients need *some* changes to
support persistent uploads anyway; keeping your user tables out of
the potentially long public table listings is probably a nice
service; you don't want to make people reload /tables too often,
because it's a fairly large beast on several interesting services.
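
As a rough illustration of the scheme described above, here is a
minimal Python sketch that builds (but does not send) PUT, DELETE and
GET requests for a user table.  The service URL, the endpoint name
`user_tables`, the table name, the credentials and the use of HTTP
Basic auth are all assumptions made up for this example; none of them
is given in this message.

```python
import base64
import urllib.request

# Hypothetical values: neither the service URL nor the endpoint name
# appears in this message.
TAP_BASE = "http://example.org/tap"
ENDPOINT = "user_tables"  # assumed sibling of /sync, /async, /tables


def table_request(method, table_name, user, password, votable=None):
    """Build (but do not send) a request for one table in TAP_USER."""
    if method not in ("PUT", "DELETE", "GET"):
        raise ValueError("unsupported method: " + method)
    req = urllib.request.Request(
        "{}/{}/{}".format(TAP_BASE, ENDPOINT, table_name),
        data=votable if method == "PUT" else None,
        method=method)
    # Uploads belong to a user, so some authentication is needed; HTTP
    # Basic is only the simplest thing to show (an assumption).
    token = base64.b64encode(
        "{}:{}".format(user, password).encode("utf-8"))
    req.add_header("Authorization", "Basic " + token.decode("ascii"))
    if method == "PUT":
        # The table is assumed to be sent as a VOTable document.
        req.add_header("Content-Type", "application/x-votable+xml")
    return req


put = table_request("PUT", "my_upload", "user", "secret", b"<VOTABLE/>")
drop = table_request("DELETE", "my_upload", "user", "secret")
meta = table_request("GET", "my_upload", "user", "secret")
# Actually sending one of these would be: urllib.request.urlopen(put)
```

In this sketch, a GET on the same URL would be the request that
returns the VOSI-style metadata for the uploaded table.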

The implementation also features an ADQL extension: you can prefix a
query with CREATE TABLE <name> AS and create a table in TAP_USER,
too.  Whether that should become a standard facility I can't judge; that
particular thing was relatively messy in implementation, because
suddenly the ADQL parser needs to know the requesting user, which in
DaCHS it so far has not.
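
To make the ADQL extension concrete, here is a small Python sketch
that prefixes an ordinary query with CREATE TABLE <name> AS and
form-encodes it as TAP query parameters.  The table name and the
SELECT statement are hypothetical, and whether the name has to be
qualified with the TAP_USER schema is not stated in this message, so
the sketch passes the name through unchanged.

```python
from urllib.parse import urlencode


def create_table_params(table_name, select_query):
    """Return form-encoded TAP parameters for CREATE TABLE ... AS."""
    adql = "CREATE TABLE {} AS {}".format(table_name, select_query)
    # LANG and QUERY are the standard TAP query parameters.
    return urlencode({"LANG": "ADQL", "QUERY": adql})


# Hypothetical table name and query, for illustration only.
body = create_table_params(
    "my_bright",
    "SELECT * FROM some_schema.some_table WHERE mag < 10")
```

The resulting string would be POSTed by an authenticated user to the
service's query endpoint.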

Anyway, I have prepared a blog post with a few more details; I might
build that into an IVOA note after Malta depending on feedback there.
Meanwhile:
<https://blog.g-vo.org/a-proposal-for-persistent-tap-uploads.html>;
there's also a Jupyter notebook in there that you can use to play around
with the existing uploads (note, however, that the server has a
planned downtime on Wednesday morning CEST).

In the post, I also name the two points where I see the most
pressing open issues:

(a) management interfaces: extending table lifetimes (probably easy);
letting users create indexes (probably pretty hard).

(b) authz: Should we mandate a facility through which users can share
their tables with other users?

On (b), my assessment is that this is so much trouble -- including
GDPR trouble in case we let people discover who else has accounts
where -- that I'd tell people: "Well, do a select * from your tap_user
table and mail your colleague the result; or just send them your
credentials."

But then I admit I tend to have little patience for that kind of
thing, and so I'd be grateful for proposals on authz.  But I'd be
much more grateful for a good interface for requesting indexes.  Does
anyone perhaps even have such an interface already?

Thanks,

            Markus


