x-www-form-urlencoded prohibition

Russ Allbery eagle at eyrie.org
Sat May 25 02:14:33 CEST 2024


Mark Taylor <m.b.taylor at bristol.ac.uk> writes:
> On Thu, 23 May 2024, Russ Allbery via grid wrote:

>> That encoding, as discussed below, has different security properties
>> than RESTful JSON that would need to be discussed, and some sites may
>> choose not to implement it.  This is fine; that's exactly the point of
>> having a layered standard.  It gives us a clear place to discuss
>> general properties of the encoding that apply to any service that uses
>> that encoding, and it will provide a way for sites to advertise exactly
>> what they implement.

> I do not support the idea of drawing up a menu of optional APIs from
> which different sites can choose which options to offer.

> The situation at the moment is that any service offering a cone search,
> and any client wanting to execute a cone search, have exactly one way to
> talk to each other.  This is how you get interoperability.  The effect
> of defining API options which "some sites may choose not to implement"
> is that low-effort clients will just choose the option they find easiest
> to work with and be unable to work with some fraction of the services,
> while high-effort clients will have to implement all the different
> options to support all available services, and would require protocol
> negotiation of some sort to boot.

In general, I agree with you that having more than one way to do something
is not ideal and poses interoperability and implementation problems.
However, sometimes it's the right way to serve multiple needs, and this
pattern already exists in IVOA protocols today.

Consider, for instance, sync and async.  The SODA standard, as one
example, explicitly allows sites to pick and choose which of sync and
async to offer.  These are two ways of doing the same thing, but there are
two options because they're useful in different situations.  Async with
UWS provides considerably more control and status information about jobs,
history, etc., but it cannot be driven with a simple GET and requires
interpreting XML to understand the job status.  Sync provides a simple API
to the same basic operation that can return a single result in a
convenient way, usually as a file download.

I think this is a similar situation.  Paul stated previously in this
discussion that some IVOA standards have as a goal that it be possible to
invoke the API from a simple web form without JavaScript.  This
essentially requires x-www-form-urlencoded POST and a file download (or a
plain text or HTML page) as a response, since that's all that such a
browser understands.  But there are many operations for which that
protocol is overly restrictive or unsuitable: it imposes significant
restrictions on the structure of the input and output, it is difficult to
use in situations where the operation takes longer than a browser or web
gateway timeout, etc.
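
To make the restriction on input structure concrete, here is a minimal
sketch using only the Python standard library.  RA, DEC, and SR are the
existing cone search parameter names; the nested "position" layout is a
hypothetical example of structured input, not something taken from any
IVOA standard.

```python
import json
from urllib.parse import parse_qs, urlencode

# A flat request: fits both encodings.
flat_request = {"RA": 180.0, "DEC": -30.5, "SR": 0.1}

form_body = urlencode(flat_request)
json_body = json.dumps(flat_request)

# Form decoding returns every value as a list of strings; the receiver
# has to know out of band which fields are numbers.
decoded_form = parse_qs(form_body)

# JSON decoding preserves the numeric types.
decoded_json = json.loads(json_body)

# A hypothetical nested request (assumption, for illustration only).
nested_request = {"position": {"ra": 180.0, "dec": -30.5}, "radius": 0.1}

# JSON round-trips the nesting unchanged.
nested_json = json.loads(json.dumps(nested_request))

# urlencode has no convention for nesting: it falls back to str() on the
# inner dict, so the structure arrives as one opaque string.
nested_form = parse_qs(urlencode(nested_request))
```

Any convention for flattening nested input into form fields would have to
be invented and documented separately, which is part of why the form
encoding is restrictive for richer operations.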

We should not add options blindly or excessively.  I completely agree with
that caution.  But I think saying that there should always be only one way
to do something is too strict.

> If a decision is made to ditch x-www-form-urlencoded POSTs in favour of
> JSON as the required API for TAP et al., I see no benefit, and some
> harm, in standardising a set of additional optional APIs that
> communicate the same thing in different ways to sit alongside that.

There is a broader reason why I think we should lay the groundwork for
this now, which is that I think it will make change easier in the future.

The various folks working on the protocol changes are doing so for a wide
variety of reasons, so I am very much not speaking for anyone other than
myself here.  But one of the primary things that I'm trying to accomplish
is to make it easier to change IVOA standards to follow prevalent patterns
in web service design outside of astronomy, whatever those may be in the
future.  And the reason why I care so much about that problem is resource
constraints.

Programmers who also know astronomy are a very scarce resource.  I want
to be as respectful of their time as possible and allow them to focus on
astronomy-specific problems, since there are already way more of those
than there are resources to work on them.  To me, this implies two things:

1. Try not to solve non-astronomy problems with astronomy resources.
   There are orders of magnitude more people working on general web
   service frameworks, clients, service platforms, logging
   infrastructures, authentication mechanisms, etc. than there are
   astronomical programmers.  Whenever possible, I want to build services
   on top of those general frameworks, tools, and platforms and thus take
   advantage of all of that existing energy and resources.

2. Where problems do need to be solved with astronomy resources, still try
   to limit the number of problems that require programmers with
   astronomy-specific knowledge.  Those folks are much harder to find and
   hire and most of them are already very busy.  If the problems that
   require astronomy-specific knowledge can be separated from the more
   generic problems, the more generic problems can be solved by
   programmers who either don't need astronomy-specific knowledge or only
   need to know a smaller set of things that we can easily teach them.

The implication for web services of those two points, I believe, is that
IVOA standards should try to separate the portions that are
astronomy-specific, such as standardization of semantics, data formats,
and type systems for interoperability of astronomy data, from the portions
that are "just" astronomy's version of generic web service problems.  That
web service protocol layer should then be as much like "what everyone else
does" as possible, allowing us to use all of those generic tools and
frameworks, into which many other people are investing far more time and
energy than we can muster.

The problem this then poses for both standardization and software
maintenance is that "what everyone else does" is continuously changing.
When many of the IVOA services were originally designed, that was XML and
simple GET and POST.  Now, it's REST and JSON, there is much less energy
and activity going into XML-based web frameworks, and it's already become
more difficult to find programmers who have XML expertise because it's no
longer as widely used a skill.  But REST and JSON will not be the last
iteration of "what everyone else does"; there will be something after REST
and JSON, and REST and JSON will become less-widely-used skills without
actively maintained frameworks.  And then there will be something else
after that.  And something else after that.

I believe the implication is that we should plan for ongoing change of the
network encoding of IVOA web protocols, the part that web frameworks and
standard clients and so forth help with.  Right now, detailed knowledge of
the underlying network protocol is embedded into nearly every web service
document.  Changing the network encoding therefore requires a revision of
every service standard, and since many service standards have their own
quirks and differences, also changing the client and server
implementations of each standard separately.  If that's the cost we pay
each time "what everyone else does" changes, we will drown.  We do not
have the resources to do that.

Therefore, I think we need to try to separate the astronomy semantics of
the protocol from the method du jour of encoding those semantics in a
protocol, and allow archives and clients to transition *all* of their
services from one network encoding to the next in as painless a way as
possible.  This, in turn, requires teasing apart the encoding issues from
the service definition in the standards so that the encoding layer can be
generic across services.  Then, we can similarly separate the network
encoding from the astronomy semantics in code.  When "what everyone else does"
starts to change, we can write a new standard for the new encoding without
changing any of the service standards, update the glue from the (ideally
unchanged) astronomy code to the network encoding by utilizing new
publicly available frameworks and libraries that we didn't have to spend
scarce astronomy resources on, and thus stay in the "boring center" of the
broader surrounding ecosystem.  The goal is to be able to move off of the
old encoding before the support ecosystem for that encoding fades and we
have to start expending scarce astronomy resources to keep it running.
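
As a rough sketch of what that separation could look like in code,
assuming a cone search as the example service: the request type and the
search function know nothing about the wire format, and each encoding is
a small decoder registered by media type.  All of the names here
(ConeSearch, DECODERS, handle, and the lowercase JSON field names) are
hypothetical and chosen only for illustration.

```python
import json
from dataclasses import dataclass
from urllib.parse import parse_qs


@dataclass(frozen=True)
class ConeSearch:
    """Encoding-independent request: the astronomy semantics."""

    ra: float
    dec: float
    radius: float


def decode_form(body: str) -> ConeSearch:
    """Decode an x-www-form-urlencoded body (RA, DEC, SR fields)."""
    fields = parse_qs(body)
    return ConeSearch(
        ra=float(fields["RA"][0]),
        dec=float(fields["DEC"][0]),
        radius=float(fields["SR"][0]),
    )


def decode_json(body: str) -> ConeSearch:
    """Decode a JSON body (hypothetical field names)."""
    data = json.loads(body)
    return ConeSearch(
        ra=float(data["ra"]),
        dec=float(data["dec"]),
        radius=float(data["radius"]),
    )


# The only place that knows which network encodings exist.
DECODERS = {
    "application/x-www-form-urlencoded": decode_form,
    "application/json": decode_json,
}


def run_cone_search(request: ConeSearch) -> str:
    """Placeholder for the astronomy code; never sees the wire format."""
    return f"ra={request.ra} dec={request.dec} radius={request.radius}"


def handle(content_type: str, body: str) -> str:
    """Glue layer: pick a decoder by media type, then run the search."""
    return run_cone_search(DECODERS[content_type](body))
```

Under this layout, supporting a future encoding means adding one decoder
and one DECODERS entry; run_cone_search stays untouched.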

Quite a lot of that separation already exists in the IVOA standards today.
We already have separate data format definitions, for example.  But the
web service protocols have some distance to go, and to get there will
require a considerable amount of work.  The type of work, though, is
roughly the work that would be required anyway for any change to RESTful
JSON, so we can hopefully tackle both problems at the same time and set
ourselves up for an easier time of it in the future.

What I very much do not want to happen is for us to publish new versions
of each protocol that embed JSON into the service protocols the way that
XML is embedded now.  That would mean that the next network encoding
change would require as much work again as this one would.

To avoid that, we need to test, as early as possible, as many of the
mechanisms for future change as we can.  One of those will be allowing
sites to run multiple different encodings of the same service and
advertise which they're running, because that's one of the pieces needed
for protocol transitions.  I would not encourage a separate
x-www-form-urlencoded POST encoding if I didn't think it also solved a
real problem that people have indicated they care about, but given that it
does, it is also a valuable test mechanism for the separation of service
semantics from network encoding, having multiple possible network
encodings for a service, and being able to advertise this properly so that
clients can do the right thing.
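
On the client side, "doing the right thing" could be as simple as the
following sketch.  It assumes the service advertises a list of media
types for the encodings it runs; how that list is published (registry
record, capabilities document, or something else) is left open, and
choose_encoding is a hypothetical helper, not part of any existing
library.

```python
from typing import Optional, Sequence


def choose_encoding(
    advertised: Sequence[str], preferred: Sequence[str]
) -> Optional[str]:
    """Return the client's most preferred encoding that the service offers.

    `preferred` is ordered from most to least preferred.  Returns None
    when the client and service have no encoding in common.
    """
    offered = set(advertised)
    for encoding in preferred:
        if encoding in offered:
            return encoding
    return None


# A client that prefers JSON but can fall back to form POST.
client_preferences = ["application/json", "application/x-www-form-urlencoded"]

# A site partway through a transition, running both encodings.
both = choose_encoding(
    ["application/x-www-form-urlencoded", "application/json"],
    client_preferences,
)

# A site that has not yet deployed the newer encoding.
legacy_only = choose_encoding(
    ["application/x-www-form-urlencoded"], client_preferences
)
```

During a transition, a site would advertise both encodings until enough
clients had moved, then drop the older one from the list.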

-- 
Russ Allbery (eagle at eyrie.org)             <https://www.eyrie.org/~eagle/>

