GWS II discussion topics - Security

Paul Harrison paul.harrison at manchester.ac.uk
Wed May 24 13:47:25 CEST 2023


Hi Russ, 

I have expanded on a few points below, but have chopped out a fair bit of the intervening text to make it easier to follow.

> On 23 May 2023, at 17:28, Russ Allbery <eagle at eyrie.org> wrote:
> 
> Paul Harrison <paul.harrison at manchester.ac.uk> writes:
> 
>> We were encouraged not to discuss the specifics of
>> https://sqr-063.lsst.io/ at the interop, but there was one general topic
>> that should have been discussed, because as a result of that note there
>> is a perception that the “VO is insecure” - this is a dangerous
>> reputational message to be circulating that I believe is based on an
>> incorrect analysis and is only true in the sense that “the Internet is
>> insecure”.
> 
> I want to be very clear what SQR-063 is: it's a collection of informal
> notes from someone (me) new to the IVOA protocols of things I noticed
> while writing an implementation.  It's not anything more than that, and in
> particular it was never intended to be a formal security analysis.  The
> heading in SQR-063 is "Security concerns" quite intentionally; these are
> just things I was concerned about over a year ago when I wrote a SODA
> implementation.  Concerns are not vulnerabilities and may not even be
> correct (as indeed one was not).
> 
> I am in absolutely no way asserting that "VO is insecure" or anything like
> that.  I have serious professional objections as a security person to that
> sort of statement about any protocol. 

I can see that you get the subtleties here, but unfortunately I think there is a danger that others
are interpreting your document in a more absolutist fashion. I am pretty sure that I heard the phrase
“the VO is insecure” during the https://wiki.ivoa.net/twiki/bin/view/IVOA/InterOpMay2023GWS session, though it might only have been in a conditional clause.
> 
>> * Secondly the whole point of the VO was to make data interoperable in a
>>  public way.
> 
> A quick aside on this, since Rubin has a couple of constraints here.  My
> background isn't in astronomy, so I don't know how unusual they are.
> 
> * The US taxpayers via the US government have decided that our data can't
>  be public for some time after it's gathered.  

Having a proprietary period on data is a very common (majority?) situation amongst observatories. My comment was that VO protocols were originally designed for the “public archive” situation after that initial period. I do think that developments like science platforms and FAIR mean it should be a principle that astronomers do not have to work differently to access data in the “proprietary” and “archive” situations. Having said that, I think that my first principle, that security is orthogonal to the VO protocol specifications, should still be an aim, and I believe it is achievable for the technical reason I gave about the parts of the HTTP protocol in which security concerns are implemented.
> 
> 
>> 3.1 - This whole argument is made from a very 'browser client only'
>> perspective. As stated above in the second general principle of course
>> VO protocols can be called from anywhere, so they will be inherently
>> susceptible to “CSRF attacks” - that is normal usage. Of course it would
>> be a bit surprising if when clicking on a picture of a cute cat an
>> astronomer ran a TAP query, but when using Topcat they do want to be
>> able to query multiple servers in different locations.
> 
> I'm not sure that I successfully communicated the point of this section if
> you're thinking of Topcat requests as cross-site requests.  The discussion
> is about browsers because cross-site requests are a concept specific to
> web browsers.  They don't normally apply to non-browser clients; no Topcat
> request would be a cross-site request in the normal HTTP sense, at least
> unless I'm wildly wrong about how Topcat works.

I was being facetious with the language, but the real point is that a VO service that is co-hosted with
a web portal for an observatory should not only be expecting calls from the portal.


> 
> The background here for both 3.1 and 3.2 is that we plan to enable, via
> IVOA protocols or extensions that are faithful to the protocols, various
> operations that may be quite expensive (eight-hour TAP queries, large
> batch image processing) or destructive (user table deletion).  I don't
> want it to be possible to trigger such actions via cross-site requests,
> mostly because of the risk of denial of service attacks.  Those don't have
> to be malicious and in fact often aren't; think of, for instance, web
> crawlers that don't honor robots.txt (sadly more common than any of us
> would like), or some JavaScript-triggered request that gets into a refresh
> loop and spams requests.

I think that the only foolproof way to prevent this is to always require authentication -
that might be becoming more acceptable to the VO world nowadays.

> 
> The most common way that web services disallow cross-site requests in
> these situations is that the protocol uses PUT, PATCH, or DELETE, which
> inherently force a JavaScript single-origin policy, or uses POST with body
> content type that isn't one of the white-listed simple request types.
> However, the IVOA protocols use GET or POST with
> application/x-www-form-urlencoded, both of which are classified as simple
> requests, so that method of preventing cross-site requests isn't
> available.

There is quite a history to UWS. My preference was not to have the GET or application/x-www-form-urlencoded parameters; however, the consensus was to include them. I still think that security is not a UWS concern, though, basically because there is no acceptable CSRF prevention mechanism that alters the POST body (as you set out below).
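
To make the "simple request" distinction concrete, here is a rough sketch of the classification a browser applies before deciding whether a cross-site request needs a CORS preflight. It is a simplified approximation of the rules in the Fetch standard (it ignores the Range header and the limits on header values, for example), and the function name is purely illustrative.

```python
# Simplified approximation of the browser's "simple request" test.
# Anything that is NOT simple triggers a CORS preflight, which a server
# can decline, and that is what blocks the cross-site call.
SIMPLE_METHODS = {"GET", "HEAD", "POST"}
SIMPLE_CONTENT_TYPES = {
    "text/plain",
    "multipart/form-data",
    "application/x-www-form-urlencoded",
}
# Header names a page may set without triggering a preflight (subset).
SAFELISTED_HEADERS = {"accept", "accept-language", "content-language", "content-type"}


def is_simple_request(method, headers):
    """Return True if the request would be sent cross-site without a preflight."""
    if method.upper() not in SIMPLE_METHODS:
        return False
    for name, value in headers.items():
        lname = name.lower()
        if lname not in SAFELISTED_HEADERS:
            # e.g. Authorization or any custom X-... header
            return False
        if lname == "content-type":
            # Strip parameters such as "; charset=utf-8" before comparing.
            media_type = value.split(";", 1)[0].strip().lower()
            if media_type not in SIMPLE_CONTENT_TYPES:
                return False
    return True
```

Under these rules a UWS job creation sent as a POST with application/x-www-form-urlencoded is simple, whereas the same POST carrying an Authorization header, or a DELETE, is not.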

> 
> The next most common way to disallow cross-site requests is to require
> that all such requests contain some random token in both the form body and
> a cookie.  (OWASP calls this the "double submit cookie" technique, which I
> am not fond of as a name, but standard terminology is good.)  This works
> great for things intended to be used via a browser, but as you point out
> IVOA protocols mostly aren't, so naively using this would break Topcat and
> similar clients that wouldn't know how to set the cookie (nor should
> they).  Similarly, the "synchronizer token pattern" requires reading a
> token from the server and reflecting it in the form submission, but the
> IVOA UWS protocol has no way to tell a client that's happening and
> existing clients don't know how to do it.
> 
> The variant that's often used for protocols where most requests are
> expected to be from non-browser clients is the custom request header
> approach, where all state-changing requests are required to include a
> header containing a token that was obtained from an earlier API call.
> However, this requires the client understand this protocol well enough to
> include that header, so again, we're concerned about breaking existing
> clients.
> 
> One simple variation of the custom request header approach that we could
> use here is to require every request contain an Authorization header.
> This forces single-origin policy and thus prevents CSRF in exactly the
> same way the custom header approach does, and it works fine with Topcat,
> which already knows how to send Authorization headers and will be sending
> them in the normal case for Rubin during the period where we have to
> require authentication for data access.  However, the drawback of this
> approach is that it prohibits using simple forms to make UWS requests; you
> *have* to use a client like Topcat (or at least curl).  This makes it
> harder to make ad hoc UIs, although that may be an acceptable price to
> pay.
> 
> I'm not sure the best way to tackle this, but I can think of a few
> possibilities.  Some of these conflict with each other, so we'd also need
> a way to clearly communicate (presumably via the registry) which is in
> use.
> 
> * Obviously protocols using bodies in any format other than text/plain,
>  multipart/form-data, or application/x-www-form-urlencoded force the
>  JavaScript single-origin policy and avoid nearly all CSRF problems as
>  long as one enforces, on the server, the presence of a correct
>  Content-Type header in the request.  But of course that's a significant
>  protocol change, and while it may have merits for other reasons, I
>  wouldn't advocate a change of request serialization protocol solely for
>  CSRF protection.  (This also prohibits using simple forms.)
> 
> * In many cases, it may make sense for a service to require an
>  Authorization header and not allow cookie-authenticated browser access
>  at all.  I believe this case may already be covered by IVOA SSO
>  protocols, and the only thing preventing us from using that approach was
>  having an SSO profile for bearer tokens, which I believe is being worked
>  on.

I think that the conclusion is that the only measures that can be used to try to
mitigate CSRF attacks are header-based tokens, and that generating and transporting these securely is basically equivalent to implementing SSO protocols.

However, I think it is unlikely to be acceptable for a long time for an astronomer to be required to log on to the VO, mainly because a globally acceptable identity federator is difficult to agree on.

I think that an observatory that wants to use VO protocols to distribute its proprietary data will have to do that on a separate, authenticated endpoint from the one serving the “open” data. This still leaves the problem of trying to mitigate CSRF on the open data endpoints.
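
As a rough illustration of the header-based approach on the authenticated endpoint, the check could look something like the sketch below. The function name and the status codes chosen are assumptions for illustration only, and validating the token itself is deliberately left out. It does nothing for the open endpoints, which is the remaining problem.

```python
def authorize(method, headers):
    """Return an HTTP status code for a request to a hypothetical
    authenticated endpoint that accepts only bearer-token requests."""
    # HTTP header names are case-insensitive, so normalise them first.
    names = {k.lower(): v for k, v in headers.items()}

    if method.upper() == "OPTIONS":
        # Decline CORS preflights: a non-2xx answer with no
        # Access-Control-Allow-* headers makes the browser abandon
        # the cross-site request.
        return 403

    scheme, _, credentials = names.get("authorization", "").partition(" ")
    if scheme.lower() != "bearer" or not credentials.strip():
        # No bearer token (cookie-only or anonymous request): refuse.
        return 401

    # Token validation itself is out of scope for this sketch.
    return 200
```

A plain HTML form or a cross-site script cannot add the Authorization header without triggering a preflight, so such requests end up as 401 or 403, while a client like Topcat that sends the header gets through.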


> 
> * Currently, my understanding is that DALI *requires* a compliant server
>  to support both GET and POST.  While both GET and POST requests have the
>  same theoretical security properties, in practice GET requests are much
>  more likely to produce those unintentional denial of service attacks I'm
>  the most worried about, and there are some cases where I don't think we
>  will have any obvious need to support GET.  In those cases, it would be
>  nice to have a way to say "this service is POST-only" so that clients
>  will know to not attempt GET.  (This was the point of 3.2; the entire
>  request there is just that servers not be *required* to support GET to
>  be standard-compliant.)

This might be the easiest area to try to push on, though it should be noted that there are consequences for other things like DataLink. There are mechanisms there for passing parameters outside the URL query parameters, though, which might actually lower the desire for everything to be captured in the URL query parameters. The DataLink could point at the equivalent POST.

It might even be acceptable to return a DataLink response for any IVOA protocol….
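
For the POST-only idea, the server-side behaviour could be as small as the sketch below. This would not be compliant with DALI as it currently stands (the point under discussion being that both GET and POST are required), and the names are hypothetical. The 405 status with an Allow header is the standard HTTP way of telling a client which methods a resource accepts.

```python
ALLOWED_METHODS = ("POST",)


def dispatch(method):
    """Return (status, headers) for a hypothetical POST-only service endpoint."""
    if method.upper() not in ALLOWED_METHODS:
        # 405 Method Not Allowed, listing what the client should use instead.
        return 405, {"Allow": ", ".join(ALLOWED_METHODS)}
    # Placeholder for actually running the request.
    return 200, {}
```

A client that tried GET would at least get a machine-readable hint to retry with POST, although advertising the restriction up front (for example via the registry) would be better than discovering it by failure.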

Paul.

