Authentication and DataLink

Fri Jan 17 19:09:04 CET 2020

I don't know if this helps you much, but I think it's probably worth referencing the handling of realms ("protection spaces") in RFC 7235:

https://tools.ietf.org/html/rfc7235#page-5

   The "realm" authentication parameter is reserved for use by
   authentication schemes that wish to indicate a scope of protection.

   A protection space is defined by the canonical root URI (the scheme
   and authority components of the effective request URI; see Section
   5.5 of [RFC7230]) of the server being accessed, in combination with
   the realm value if present.  These realms allow the protected
   resources on a server to be partitioned into a set of protection
   spaces, each with its own authentication scheme and/or authorization
   database.  The realm value is a string, generally assigned by the
   origin server, that can have additional semantics specific to the
   authentication scheme.  Note that a response can have multiple
   challenges with the same auth-scheme but with different realms.

   The protection space determines the domain over which credentials can
   be automatically applied.  If a prior request has been authorized,
   the user agent MAY reuse the same credentials for all other requests
   within that protection space for a period of time determined by the
   authentication scheme, parameters, and/or user preferences (such as a
   configurable inactivity timeout).  Unless specifically allowed by the
   authentication scheme, a single protection space cannot extend
   outside the scope of its server.

   For historical reasons, a sender MUST only generate the quoted-string
   syntax.  Recipients might have to support both token and
   quoted-string syntax for maximum interoperability with existing
   clients that have been accepting both notations for a long time.
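As a rough illustration of the last quoted paragraph, a client might extract the realm from a challenge while accepting both the quoted-string and the token form. This is a simplified sketch, not a full WWW-Authenticate parser: the function name `extract_realm` is hypothetical, and it only returns the first realm it finds, even though a response may carry several challenges.

```python
import re
from typing import Optional

# Matches realm="quoted string" (with backslash escapes) or realm=token.
# Simplified: a real parser must handle multiple challenges per header.
_REALM_RE = re.compile(
    r'(?<![\w-])realm\s*=\s*(?:"((?:[^"\\]|\\.)*)"|([^\s,"]+))',
    re.IGNORECASE,
)

def extract_realm(www_authenticate: str) -> Optional[str]:
    """Return the first realm value in a WWW-Authenticate header, or None."""
    match = _REALM_RE.search(www_authenticate)
    if match is None:
        return None
    quoted, token = match.group(1), match.group(2)
    if quoted is not None:
        # Undo quoted-pair escaping such as \" inside the quoted-string.
        return re.sub(r"\\(.)", r"\1", quoted)
    return token
```

For example, `extract_realm('Basic realm="Gaia Archive"')` and `extract_realm('Basic realm=archive')` both yield a realm string; the header values here are made up for illustration.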

In addition to that, I believe it's also the case that if you first log in at https://example.com/protected/ in a browser, the browser will by default forward the Authorization header for everything under that path after that. But if you then go to https://example.com/protected (without the trailing "/"), it won't forward that information on the first request, nor would it for https://example.com/. I believe it _should_ forward it for those URLs if those requests returned a "401" with the same realm, although I haven't verified this.
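The forwarding behaviour described above is consistent with the credential-reuse rule for the Basic scheme in RFC 7617, section 2.2: the authentication scope is the authenticated URI with everything after the last "/" of the path removed, and URIs with that prefix are assumed to be in the same protection space. A minimal sketch of that prefix check follows; the function names are hypothetical, and real browsers may apply different or additional heuristics.

```python
from urllib.parse import urlsplit

def path_prefix(url: str) -> str:
    """Return the URL's path up to and including its last '/'."""
    path = urlsplit(url).path or "/"
    return path[: path.rfind("/") + 1]

def may_send_preemptively(authenticated_url: str, new_url: str) -> bool:
    """Guess whether credentials used for authenticated_url can be sent
    to new_url without waiting for a 401 challenge first."""
    a, b = urlsplit(authenticated_url), urlsplit(new_url)
    # Scheme and authority must match; note this does not normalize
    # default ports (an assumption made to keep the sketch short).
    same_root = (a.scheme, a.netloc.lower()) == (b.scheme, b.netloc.lower())
    return same_root and (b.path or "/").startswith(path_prefix(authenticated_url))
```

With https://example.com/protected/ as the authenticated URL, this accepts https://example.com/protected/data but rejects both https://example.com/protected and https://example.com/, since neither starts with the "/protected/" prefix.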

I would also note that I think this means that authentication for https://example.com would not be expected to be valid for a subdomain such as https://protected.example.com, and would not be forwarded to it.
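One way to picture this is to treat the protection space from the RFC 7235 text quoted above as a lookup key made of scheme, authority and realm. The sketch below assumes a hypothetical `ProtectionSpace` type and `protection_space` helper; it lower-cases the authority but does not normalize default ports.

```python
from typing import NamedTuple, Optional
from urllib.parse import urlsplit

class ProtectionSpace(NamedTuple):
    scheme: str
    authority: str
    realm: Optional[str]

def protection_space(url: str, realm: Optional[str] = None) -> ProtectionSpace:
    """Build a key from the canonical root URI plus the realm, if any."""
    parts = urlsplit(url)
    return ProtectionSpace(parts.scheme, parts.netloc.lower(), realm)

# A client could key stored credentials on this value, e.g. a dict
# mapping ProtectionSpace to whatever credential object it uses.
```

Two URLs on the same host with the same realm map to the same key whatever their paths, while https://example.com and https://protected.example.com produce different keys because their authority components differ.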

It's probably also worth referencing the HTTP State Management RFC, with respect to how cookies are handled (and some of the semantics in there around subdomains):
https://tools.ietf.org/html/rfc6265#section-4.1.2.3
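For contrast with the HTTP authentication case, cookies that carry a Domain attribute do extend to subdomains. The sketch below follows the domain-matching rule in RFC 6265, section 5.1.3; the function name `domain_match` is hypothetical, and the sketch ignores public-suffix checks that real user agents also apply.

```python
import ipaddress

def domain_match(host: str, cookie_domain: str) -> bool:
    """Return True if host domain-matches the cookie's Domain value."""
    host = host.lower()
    cookie_domain = cookie_domain.lower()
    # A single leading dot in the Domain attribute is ignored.
    if cookie_domain.startswith("."):
        cookie_domain = cookie_domain[1:]
    if host == cookie_domain:
        return True
    # Suffix matching applies only to host names, never to IP addresses.
    try:
        ipaddress.ip_address(host)
        return False
    except ValueError:
        pass
    return host.endswith("." + cookie_domain)
```

So a cookie set with Domain=example.com would be sent to protected.example.com, whereas a cookie scoped to protected.example.com would not be sent to example.com.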

Brian

On Jan 17, 2020, at 5:55 AM, Mark Taylor <M.B.Taylor at bristol.ac.uk> wrote:

Hi GWS (and maybe lurking DAL people),

I have a question about how authentication is supposed to work
with DataLink (and possibly similar services), related to some
experimentation I'm doing with the Gaia archive.

In Gaia's case there is an authenticated TAP service, which returns tables
that may have a datalink_url column pointing at DataLink resources.
The DataLink resources themselves also require authenticated access.
As currently implemented, the Gaia service requires *different*
credentials (separate cookies) for the TAP and DataLink services,
though even if the authentication were the same I see difficulties.

My prototype auth-capable TOPCAT negotiates authentication when
the user chooses a TAP service: it finds out what auth methods
are available from the tap/capabilities file, offers that choice
to the user, and asks for credentials as appropriate.  It then
takes care to use these credentials for subsequent interactions
with that TAP service.  There are a few things to iron out still,
but the basic model can be made to work.

However, DataLink, at least as used from TOPCAT, isn't like that.
The user doesn't select a DataLink service from a list and then
declare that they want to start interacting with it.
Rather a URL that points at a DataLink service gets used as a
source of tables in some other context.  Typical usage:
the user configures an "activation action" that causes the
table referenced by the datalink_url column to get loaded into
TOPCAT when a table row is selected
(http://www.starlink.ac.uk/topcat/sun253/LoadTableActivationType.html).
In this case, as far as TOPCAT's concerned, this is just a URL pointing
at a table, and it doesn't know either that it's from a DataLink service
or that it's associated with a given TAP service (with particular
authentication).  So it doesn't know what authentication to use,
or even that it is supposed to retrieve it using authenticated access
(until it gets an access error).

This problem has only recently occurred to me.  I have some half-baked
ideas about how to tackle it, but they all seem problematic.
I might be missing something obvious.  Is there somebody with a clear
idea of how they would expect this to work, in particular from a
user experience point of view?

Thanks

Mark

--
Mark Taylor   Astronomical Programmer   Physics, Bristol University, UK
m.b.taylor at bris.ac.uk +44-117-9288776  http://www.star.bris.ac.uk/~mbt/
