x-www-form-urlencoded prohibition

Russ Allbery eagle at eyrie.org
Thu May 23 19:02:20 CEST 2024


Paul Harrison via grid <grid at ivoa.net> writes:

> In the recent P3T session, someone (I think Alberto Micol) pointed out
> that the proposed deprecation of x-www-form-urlencoded POSTs was a big
> change - and I agree it is probably the most disruptive change to
> protocols that is being suggested

In our last discussion, I suggested that one of the encodings that we
could standardize as part of the new framework could be
x-www-form-urlencoded POST.  It sounds like there was a lot of interest in
this specific use case, which makes me further convinced that this would
be a good idea.  That also solves one of the questions we had, namely what
second useful network encoding to define alongside RESTful JSON to ensure
that the framework for multiple encodings worked correctly.

That encoding, as discussed below, has different security properties than
RESTful JSON that would need to be discussed, and some sites may choose
not to implement it.  This is fine; that's exactly the point of having a
layered standard.  It gives us a clear place to discuss general properties
of the encoding that apply to any service that uses that encoding, and it
will provide a way for sites to advertise exactly what they implement.

> It should be noted that the main design principles behind most of the
> current protocols were

> * it should be possible to give someone a URL string that will run the
>   service
> * the service should be controllable/runnable using a web browser
>   without javascript

The second use case is unfortunately not compatible with using JSON as
either the input or the output because web browsers without JavaScript do
not really handle JSON -- not at all for input without browser extensions,
and only awkwardly and with quite poor UI for output.  It would be nice if
there were some way to have our cake and eat it too, but if one wants the
improved structure of JSON input and output for the web API, the cost is
that one cannot provide a nice UI to drive the API entirely from a
browser without JavaScript.  A browser without JavaScript only truly
supports x-www-form-urlencoded API input, and HTML, plain text, or file
download as output.
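
To make the contrast concrete, here is a rough sketch in Python (the
requests library and both endpoints are purely illustrative; the
TAP-style parameter names are just for flavor):

    import requests

    # What a plain HTML form (no JavaScript) can produce: key=value
    # pairs with Content-Type: application/x-www-form-urlencoded.
    requests.post(
        "https://archive.example.org/tap/sync",    # hypothetical URL
        data={"LANG": "ADQL", "QUERY": "SELECT TOP 5 * FROM ivoa.obscore"},
    )

    # The JSON equivalent, with Content-Type: application/json.  No HTML
    # form can generate this request; it takes a programmatic client or
    # client-side JavaScript.
    requests.post(
        "https://archive.example.org/api/query",   # hypothetical URL
        json={"lang": "ADQL", "query": "SELECT TOP 5 * FROM ivoa.obscore"},
    )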

The second use case will therefore require a separate protocol encoding
that is not RESTful JSON.  I think this is a good idea and should be
considered in our scope, and we should try to keep that protocol encoding
as close to the current standard API as we can make it while still being
able to specify it with OpenAPI.

It's probably also worth noting that, depending on how one interprets the
first requirement, it's very challenging for authenticated archives, since
the whole point of requiring authentication is that one needs to know who
"someone" is before allowing them to run the service.  My understanding of
how this requirement is interpreted in the authenticated archive context
is that the reply from the service should contain enough information for
the client to be able to authenticate, but I think people may be
underestimating the difficulty of the general case of this problem and
thereby trying to promise more than we can really deliver.

It's quite difficult to correctly do identity discovery and authentication
starting from *nothing* but a URL string.  There are whole international
foundations devoted to this problem that have achieved only limited
success, and that still usually rely on more input from the user than just
the URL string before they're able to run the service.

> So how did we get to 

> "Simple POSTs with x-www-form-urlencoded— no preflight checks—
> vulnerable to CSRF” being eliminated as bad behaviour.

This has come up in previous discussions and I want to reiterate that
that this is not my position.  My concern was never that the IVOA
standards *allow* x-www-form-urlencoded POST.  This is a perfectly
reasonable thing to support in some situations.  My concern was that the
standards *require* it and arguably rule out, or at least do not specify,
any security mitigations if the security issues are of concern.

I would like to be able to offer a service that implements an
IVOA-standardized protocol, in the specific sense that all of the
information (apart from obtaining authentication credentials) required to
run the service can be determined from the protocol specification, that is
not vulnerable to CSRF.  I don't believe this is currently possible; one
currently has to use mitigations outside the scope of the standard that a
client written purely to the standard would not know how to navigate.
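
As a concrete illustration of the kind of out-of-band mitigation I mean
(a sketch only, using Flask, with a hypothetical endpoint): one common
technique is to reject POSTs that lack a custom header, since a
cross-site HTML form cannot set custom headers, and a cross-site fetch()
that sets one triggers a CORS preflight.

    from flask import Flask, request, abort

    app = Flask(__name__)

    @app.route("/async", methods=["POST"])    # hypothetical endpoint
    def create_job():
        # A cross-site HTML form cannot add custom headers, so requiring
        # one blocks blind cross-site POSTs.  But a client written
        # purely to the current standards has no way to know it must
        # send this header, so standard interoperability is lost.
        if request.headers.get("X-Requested-With") != "XMLHttpRequest":
            abort(403)
        return "job created", 201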

I *don't* want to require that everyone else care about CSRF.  Cross-site
POST is only a concern for a service if two things are true:

1. The service is authenticated.  Unauthenticated services are inherently
   open to requests from anywhere on the Internet so none of this is
   applicable.

2. The service provides some operation that can be triggered by blind POST
   that poses security concerns if anyone on the Internet can do it (see
   the sketch below).
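
A deliberately simplified sketch (Flask; the endpoint, names, and job
store are all hypothetical) of a service where both conditions hold:

    from flask import Flask, request, session, abort

    app = Flask(__name__)
    app.secret_key = "not-a-real-key"    # placeholder for illustration

    JOBS = {}    # job_id -> owner; stands in for a real job store

    @app.route("/jobs/delete", methods=["POST"])  # hypothetical endpoint
    def delete_job():
        # (1) Authenticated: the session rides on a browser cookie, and
        # the browser attaches that cookie to *any* POST to this origin,
        # including one auto-submitted by a hostile page.
        if "user" not in session:
            abort(401)
        # (2) Dangerous when triggered blind: a simple form-encoded
        # POST, which requires no preflight, destroys the user's data.
        job_id = request.form["job_id"]
        if JOBS.get(job_id) == session["user"]:
            del JOBS[job_id]
        return "deleted"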

I think part of the reason why we keep coming back to this discussion is
that, for most archives, either (1) or (2) doesn't hold.  Specifically,
for a lot of services, one can walk through all the operations possible
with POST and convince oneself that none of them are particularly
dangerous.  I don't disagree with this!  The problem, however, is that if
one *does* find a service for which (1) and (2) both hold, one now has a
serious problem because the current IVOA standards framework doesn't
provide a solution that's interoperable with standard clients, and I feel
comfortable claiming that there will be cases where both (1) and (2) hold.

Part of the problem in having a detailed discussion about this is that the
most obvious examples where (1) and (2) both hold involve denial of
service attacks (overwhelming services with bogus requests, deleting
people's data, etc.), which assume someone out there on the Internet is
sufficiently malicious to want to hurt an astronomical archive for no
particular gain.  People find these examples implausible because, well,
they're fundamentally good people and can't imagine wanting to do such a
thing.  Unfortunately, the history of web security is full of attacks that
were implausible right up until a sufficiently high-profile web site
attracted someone who was sufficiently malicious.  The lesson we took away
from that is to try to make web sites secure against CSRF by default and
only relax those protections when someone has done an analysis and
determined that the benefits of a relaxed policy outweigh the risk for
that site.

> Once you have JSON, you need JavaScript and you open up a whole extra
> attack surface for your web services however compared with the
> pre-javascript era.

I know this is a bit beside the point you're trying to make, but I think
it's important to point out that this attack surface generally already
exists regardless of what the IVOA standards say because the vast majority
of web browsers out there have JavaScript enabled.

There are a few people who disable JavaScript by default and only enable
it when required (I personally do a version of this on my laptop, although
I enable first-party JavaScript by default), and that practice has a lot
to recommend it.  However, I suspect the number of people who go to the
trouble is comfortably below 1% (particularly if you include phone
browsers in the analysis), and I doubt it's going to increase
substantially in the future.  The increased risk therefore exists for most
users regardless, and can be abused by malicious entities even if the
archive carefully avoids anything other than basic HTML.

> However, in the IVOA context services cannot predict where they might be
> called from, so the only sensible CORS response is to allow all origins
> - i.e. adding no extra security - but adding an extra burden on the
> service provider in responding to CORS properly.

I think you're making a lot of assumptions here that are only true in a
world in which the typical API client is a web page loaded in a web
browser.

CORS does not require that you be able to predict where services may be
called from in the general case.  CORS only requires that you be able to
predict which *web sites* drive your service *with client-side
JavaScript*.  Clients that are not web browsers are entirely unaffected by
CORS; the standard is specific to JavaScript running inside a web browser
because the attacks are specific to that environment.  So, for example, we
offer API services to any authenticated user via token authentication from
anywhere in the world, with absolutely no CORS policy other than the
defaults that prohibit all cross-site requests, and this works fine for
all API clients other than client-side JavaScript in a browser.
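
For example (a sketch; the URL and token are placeholders), a library or
command-line client never even looks at CORS headers:

    import requests

    # CORS is enforced by browsers, not by servers or by HTTP itself.
    # This works against a service that sends no CORS headers at all,
    # i.e. whose effective policy is the default denial of cross-site
    # browser requests, because nothing here is a browser.
    r = requests.get(
        "https://archive.example.org/api/tables",   # hypothetical URL
        headers={"Authorization": "Bearer PLACEHOLDER-TOKEN"},
    )
    print(r.status_code)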

The scenario that you describe comes into play only if you want to allow a
*web site* on a different domain to make direct API requests to your
service *from the web browser* using client-side JavaScript.  If the
requests are made by the web service underlying the web site, those would
be allowed by a default CORS policy.  Only browser requests are affected.

Now, to be clear, this probably is a use case that many archives want to
support!  We should therefore spell out precisely what CORS policy is
required to support it, but also what security considerations one should
take into account before opening up your API services to be driven by
random untrusted web sites.  But I question whether this is such a
universally desired use case that opening API services in this way would be
the *only* sensible policy.  I think there are other possible sensible
policies.
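
One such policy is an explicit allow-list rather than a wildcard.  A
sketch (Flask again; the origin is hypothetical):

    from flask import Flask, request

    app = Flask(__name__)

    # Web sites whose client-side JavaScript may call this API directly
    # from the browser; everything else gets the browser default of
    # denying cross-site requests.
    ALLOWED_ORIGINS = {"https://portal.example.org"}    # hypothetical

    @app.after_request
    def add_cors_headers(response):
        origin = request.headers.get("Origin")
        if origin in ALLOWED_ORIGINS:
            response.headers["Access-Control-Allow-Origin"] = origin
            # Tell caches that the response varies by requesting origin.
            response.headers["Vary"] = "Origin"
        return response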

Therefore, I would argue that this statement...

> however, we learned above that, in the IVOA service context, even when
> the pre-flight checks are done, there is no way that CORS can be used to
> restrict where the services are invoked from, so there is no CSRF
> mitigation from that alone

...is incorrect.  I don't agree that CORS cannot be used for CSRF
mitigation in the IVOA service context.

> 1. Do we want to try to standardise on one of the other mitigation
> techniques?

For the x-www-form-urlencoded network encoding, yes, absolutely.  That
would make it possible to provide a POST-based API that can be driven by
simple HTML forms in more situations than is currently possible, by
standardizing what the client needs to do to navigate the CSRF protection.
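
As one candidate shape for such a mitigation (a sketch only, not a
proposal for the exact mechanism; Flask, with hypothetical endpoints):
the familiar synchronizer-token pattern, where the service hands out a
token that the client must echo back as a form field.

    import secrets
    from flask import Flask, request, session, abort

    app = Flask(__name__)
    app.secret_key = "not-a-real-key"    # placeholder for illustration

    @app.route("/csrf-token")    # hypothetical discovery endpoint
    def get_token():
        # A conformant client fetches a token first...
        session["csrf_token"] = secrets.token_urlsafe(32)
        return session["csrf_token"]

    @app.route("/sync", methods=["POST"])    # hypothetical endpoint
    def sync_query():
        # ...and echoes it back as a form field.  A hostile cross-site
        # page cannot read the token, so its blind POST fails here.
        if request.form.get("csrf_token") != session.get("csrf_token"):
            abort(403)
        return "query accepted"

The point is less this specific mechanism than that, once it is written
into the standard, a generic client knows to fetch the token before
POSTing.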

For a RESTful JSON network encoding, my starting position is no, CORS
should be used as the CSRF mitigation where CSRF protection is required.
I am unconvinced that there is a use case here that cannot be handled
correctly with CORS.  But I'm open to being convinced if someone can
describe a detailed use case where this is an issue.

> 2. Do we still want to remove the x-www-form-urlencoded possibility as
> CORS does not provide any CSRF mitigation in our case?

I think we should specify x-www-form-urlencoded as one of the available
network encoding options for services, and document its limitations and
potential security concerns.

-- 
Russ Allbery (eagle at eyrie.org)             <https://www.eyrie.org/~eagle/>

