thoughts on TAP-1.2

Markus Demleitner msdemlei at ari.uni-heidelberg.de
Mon Mar 28 11:03:44 CEST 2022


Hi Pat,

On Fri, Mar 25, 2022 at 02:09:17PM -0700, Patrick Dowler wrote:
> astronomical catalogues they create and curate. To that end, the tables are
> added to the tap_schema and visible in the /tables endpoint. There is

All tables for all users?

I'm mainly asking because I'd like to do this to let "normal" people
do rather ephemeral (but still persistent for a few days or weeks)
uploads, and I'm 100% sure nobody else should or would want to see them.

That's not *much* of a problem for /tables, but for tap_schema it is
*quite* a complication, and hence I'm curious what you do and how it
works for you.
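
To make the complication concrete, here is a minimal sketch of
per-user filtering of the table metadata. It is an illustration under
loud assumptions, not how any existing service does it: the `owner`
column is hypothetical (standard tap_schema has no such column), the
table is called `tap_schema_tables` only because SQLite has no
schemas, and the table names are made up.

```python
import sqlite3


def make_demo_db():
    """Build an in-memory stand-in for tap_schema.tables."""
    conn = sqlite3.connect(":memory:")
    # `owner` is a hypothetical extension column: NULL means "public".
    conn.execute(
        "CREATE TABLE tap_schema_tables ("
        " table_name TEXT PRIMARY KEY,"
        " owner TEXT)"
    )
    conn.executemany(
        "INSERT INTO tap_schema_tables VALUES (?, ?)",
        [
            ("public.catalogue", None),
            ("user_alice.targets", "alice"),
            ("user_bob.scratch", "bob"),
        ],
    )
    conn.commit()
    return conn


def visible_tables(conn, user=None):
    """Return the table names a given (possibly anonymous) user may see."""
    # For user=None the comparison `owner = NULL` is never true,
    # so anonymous clients only get the public rows.
    rows = conn.execute(
        "SELECT table_name FROM tap_schema_tables"
        " WHERE owner IS NULL OR owner = ?"
        " ORDER BY table_name",
        (user,),
    )
    return [name for (name,) in rows]
```

The catch is that every query against tap_schema, not just /tables,
would have to go through a filter like this one.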

> access control that the users/projects control so they can control who can
> the group can see/query them) until the project publishes a paper, at which
> point they would make the table publicly queryable.

Rephrasing my query above: Is it just the data that's hidden or the
metadata, too?

> We do not (yet) put table metadata into the registry so I haven't thought
> that bit through, but probably only public tables should go there and I'd
> probably make it an additional manual step to "publish" (to registry) and

At this point I'm a lot less worried about the registry than about
what clients get from /tables and tap-schema.

However, with my Registry hat on let me briefly state that for me it
sucks if your tap_schema is different from what you give the
registry, as that will give everyone a lot of headaches when, one day,
we want to move from GloTS (which harvests tap_schema if it can) to a
proper Registry approach.

> If you look at the details of the bulk loading,  you see that it is a
> streaming operation that directly inserts rows into the database. There's a

Out of curiosity (not closely related to much of anything): You're not
batching these inserts?  And that's performing well?
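
By batching I mean something like the following minimal sketch, which
consumes a row stream in chunks and hands each chunk to the database
in one call. It uses SQLite purely as a stand-in; the `upload` table,
its two columns and the batch size are assumptions for illustration
and say nothing about what your loader actually does.

```python
import itertools
import sqlite3


def batches(rows, size):
    """Yield lists of at most `size` rows from any iterable."""
    it = iter(rows)
    while True:
        chunk = list(itertools.islice(it, size))
        if not chunk:
            return
        yield chunk


def insert_batched(conn, rows, size=1000):
    """Insert a row stream in batches; return the number of batches sent."""
    n_batches = 0
    for chunk in batches(rows, size):
        # One executemany call per chunk instead of one execute per row.
        conn.executemany("INSERT INTO upload (id, ra) VALUES (?, ?)", chunk)
        n_batches += 1
    conn.commit()
    return n_batches


def make_upload_db():
    """Create an empty in-memory database with the hypothetical table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE upload (id INTEGER PRIMARY KEY, ra REAL)")
    conn.commit()
    return conn
```

Since `rows` can be a generator, this still works as a streaming
operation; only one chunk is held in memory at a time.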

> clients could automatically recover from content failures. It's hard to
> push 500e6 rows into a database table without failures, but that's what

I find it remarkable that you seem to spend quite a bit of effort
on defeating transactionality -- that's really what your users
wanted?  Half-uploaded tables?  How does that work technically?  Are
you really inserting these things outside of database transactions?
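
To illustrate the two behaviours I have in mind, here is a minimal
sketch contrasting one big transaction with a commit after every
batch. SQLite stands in for the real database, and the `upload` table,
the demo rows and the NOT NULL constraint that provokes the failure
are all made up for the example.

```python
import sqlite3

# Row 5 violates the NOT NULL constraint and makes the load fail.
DEMO_ROWS = [(1, 10.0), (2, 20.0), (3, 30.0), (4, 40.0), (5, None), (6, 60.0)]


def fresh_db():
    """Create an empty in-memory database with the hypothetical table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE upload (id INTEGER PRIMARY KEY, ra REAL NOT NULL)"
    )
    conn.commit()
    return conn


def load(conn, rows, batch_size, atomic):
    """Insert rows in batches; return how many rows survive in the table."""
    try:
        for start in range(0, len(rows), batch_size):
            conn.executemany(
                "INSERT INTO upload (id, ra) VALUES (?, ?)",
                rows[start:start + batch_size],
            )
            if not atomic:
                # Per-batch commit: earlier batches survive a later failure.
                conn.commit()
        conn.commit()
    except sqlite3.Error:
        # Discards everything since the last commit.
        conn.rollback()
    return conn.execute("SELECT count(*) FROM upload").fetchone()[0]
```

With a batch size of 2, the atomic load leaves the table empty after
the failure, while the per-batch variant leaves the first four rows in
place, that is, a half-uploaded table a client could resume from.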

> -- Definitely interested in more use cases for user-generated database
> content...

Well, as hinted above what I'm really after is

  SELECT ra, dec, foo, bar
  INTO my_schema.result_table
  FROM some.tap_table
  WHERE...

That is, people shouldn't need to download their results if they'd
like to reuse them later within my database.
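
As a rough sketch of the server-side effect, here is the same idea
emulated with SQLite's CREATE TABLE ... AS SELECT. Everything in it is
an assumption for illustration: the `select_into` helper is
hypothetical, `my_schema` is just an attached in-memory database, and
a real service would have to check that the target schema belongs to
the authenticated user.

```python
import sqlite3


def make_demo_db():
    """Set up a source table and an attached per-user schema."""
    conn = sqlite3.connect(":memory:")
    # The attached database plays the role of the user's schema.
    conn.execute("ATTACH DATABASE ':memory:' AS my_schema")
    conn.execute("CREATE TABLE tap_table (ra REAL, dec REAL, foo TEXT)")
    conn.executemany(
        "INSERT INTO tap_table VALUES (?, ?, ?)",
        [(10.0, -5.0, "a"), (200.0, 45.0, "b"), (15.0, 3.0, "c")],
    )
    conn.commit()
    return conn


def select_into(conn, target, query):
    """Materialise `query` as table `target`; return the row count."""
    schema, _, table = target.partition(".")
    # The target is interpolated as an identifier, so validate it strictly.
    if not (schema.isidentifier() and table.isidentifier()):
        raise ValueError(f"bad target table name: {target!r}")
    conn.execute(f"CREATE TABLE {schema}.{table} AS {query}")
    conn.commit()
    count = conn.execute(f"SELECT count(*) FROM {schema}.{table}")
    return count.fetchone()[0]
```

A later query can then read from my_schema.result_table without the
result ever having left the server.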

Letting users upload their favourite objects or whatever for multiple
use would be another rather attractive thing.  I think both cases
have been considered quite early on, and the real reason I'm rather
keen on having optional and interoperable auth has always been that
I'd like to make these happen.

Well, except I'm still not sure how this will work with /tables and
tap_schema, which is why I'm so curious about how you do it.

> approach that can be immediately queried via the TAP API. We do have a
> complete vospace service (vault) that could accept/stage catalogue content
> and we did look at those heady ideas but it is at least as complex or maybe
> more so. That's the primary road block for the "vospace" ideas and as far
> as I am aware, no one has ever made it work. We stopped thinking about that

As someone who's never really touched VOSpace except to review the
spec, I feel no particular attachment to it; but as usual I'd like to
make my "think of the clients" pitch: If it's just "as complex"
server-side, or perhaps "just a little more complex", we really ought
to avoid introducing another API for something that, at least to
clients, appears to do the same thing (uploading data and managing
access rights).  It's not nice to ask VO library authors to implement
two APIs where one might *reasonably* do.

Now, having reviewed the spec, I'll certainly not complain if I don't
have to implement VOSpace, so if you say you've established it won't
work, I'll take your word for it.  I'd still advocate that whatever
VOSpace features we replicate here (rights management comes to mind)
should at least look to clients as much like VOSpace as possible.

           -- Markus

