x-www-form-urlencoded prohibition

Mark Taylor m.b.taylor at bristol.ac.uk
Thu Jun 6 10:02:37 CEST 2024


Russ,

thank you for this detailed explanation of your thinking about the
future for VO protocols.

Continuously changing API technology to lower the bar for staff
recruitment is a coherent programme, and makes sense in a dynamic
software environment where all the important clients and servers
are regularly renewed or diligently maintained, or at least where
all the {client,server} pairs that want to talk to each other are
upgraded in step.  But that's not the case for the VO, and I
therefore don't believe that avoidable technology changes are a
good idea in our context.

I've already voiced my concern that rewriting a standard like TAP
in a backwardly incompatible way will fragment the client/server
landscape into old software and new software which to at least
some extent will not interoperate.  Doing that once is bad,
doing it multiple times is worse.  If we absolutely have to change
API technology now (I'm not convinced that is the case, but perhaps for
security reasons it is) then so be it, but making a virtue of a
rolling programme of technology change is another matter.

(That doesn't mean BTW that we shouldn't try to write standards in which
the semantics and the encodings are easily separable; I'd say that
is generally good practice where it can be done without violence
to the rest of the definitions, and if we do have to move protocol
again at some point then as you say we're in a better position).
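
As a rough illustration of that kind of separation, here is a minimal
sketch in which one encoding-independent request is serialised in two
ways.  RA, DEC and SR are the usual Cone Search parameter names; the
JSON field names, the ConeSearch class and the function names are
invented for this example and are not taken from any IVOA document.

```python
import json
from dataclasses import dataclass, asdict
from urllib.parse import urlencode, parse_qs


@dataclass(frozen=True)
class ConeSearch:
    """Encoding-independent semantics: position and radius, in degrees."""
    ra: float
    dec: float
    sr: float


# Encoding 1: x-www-form-urlencoded body with the Cone Search names.
def to_form(req: ConeSearch) -> str:
    return urlencode({"RA": req.ra, "DEC": req.dec, "SR": req.sr})


def from_form(body: str) -> ConeSearch:
    fields = parse_qs(body)
    return ConeSearch(float(fields["RA"][0]),
                      float(fields["DEC"][0]),
                      float(fields["SR"][0]))


# Encoding 2: JSON body.  The field names here are hypothetical.
def to_json(req: ConeSearch) -> str:
    return json.dumps(asdict(req), sort_keys=True)


def from_json(body: str) -> ConeSearch:
    data = json.loads(body)
    return ConeSearch(float(data["ra"]), float(data["dec"]), float(data["sr"]))
```

Code that works with ConeSearch objects does not need to know which
encoding was on the wire, but a client and a server still have to share
at least one encoding in order to talk to each other at all.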

On Thu, 30 May 2024, Russ Allbery via grid wrote:

> Markus Demleitner via grid <grid at ivoa.net> writes:
>
> > In a system like the VO, where old services literally keep running for
> > decades and (sometimes grossly cobbled-together) clients hang around on
> > legacy systems for about as long, I don't think such a mechanism exists.
>
> I think unmaintained legacy clients talking to new services is the use
> case that is the most incompatible with a protocol transition.

New clients are going to be trouble too.  Much as I'd like to think
that topcat will be a client of choice for desktop interaction with
the VO indefinitely, it is likely that new ones will arise, and they
may not be motivated to interoperate with previous iterations of
the protocol stack (the ease of writing clients that talk to the
new services was one of the P3T/OpenAPI selling points).
If you're talking about continuous revolution of service protocols
then likely any client will only be able to talk to the services
developed around the same time as it, and will be isolated from
earlier and maybe later services.

> It is not possible to do a protocol transition without eventually breaking
> something, almost by definition.

This may look like an acceptable cost to the large well-funded projects,
developing new software, that are the loudest voices in the P3T.
But legacy data providers, having fewer resources to think about these
things, are necessarily less active in the discussion, and science
users of the VO won't know about such changes until things stop working.
If the VO decides to go in this direction, I hope that the impact on
those stakeholders will receive adequate consideration.

Mark


On Sat, 25 May 2024, Russ Allbery wrote:

> Mark Taylor <m.b.taylor at bristol.ac.uk> writes:
> > On Thu, 23 May 2024, Russ Allbery via grid wrote:
> 
> >> That encoding, as discussed below, has different security properties
> >> than RESTful JSON that would need to be discussed, and some sites may
> >> choose not to implement it.  This is fine; that's exactly the point of
> >> having a layered standard.  It gives us a clear place to discuss
> >> general properties of the encoding that apply to any service that uses
> >> that encoding, and it will provide a way for sites to advertise exactly
> >> what they implement.
> 
> > I do not support the idea of drawing up a menu of optional APIs from
> > which different sites can choose which options to offer.
> 
> > The situation at the moment is that any service offering a cone search,
> > and any client wanting to execute a cone search, have exactly one way to
> > talk to each other.  This is how you get interoperability.  The effect
> > of defining API options which "some sites may choose not to implement"
> > is that low-effort clients will just choose the option they find easiest
> > to work with and be unable to work with some fraction of the services,
> > while high-effort clients will have to implement all the different
> > options to support all available services, and would require protocol
> > negotiation of some sort to boot.
> 
> In general, I agree with you that having more than one way to do something
> is not ideal and poses interoperability and implementation problems.
> However, sometimes it's the right way to serve multiple needs, and this
> pattern already exists in IVOA protocols today.
> 
> Consider, for instance, sync and async.  The SODA standard, as one
> example, explicitly allows sites to pick and choose which of sync and
> async to offer.  These are two ways of doing the same thing, but there are
> two options because they're useful in different situations.  Async with
> UWS provides considerably more control and status information about jobs,
> history, etc., but it cannot be driven with a simple GET and requires
> interpreting XML to understand the job status.  Sync provides a simple API
> to the same basic operation that can return a single result in a
> convenient way, usually as a file download.
> 
> I think this is a similar situation.  Paul stated previously in this
> discussion that some IVOA standards have as a goal that it be possible to
> invoke the API from a simple web form without JavaScript.  This
> essentially requires x-www-form-urlencoded POST and a file download (or a
> plain text or HTML page) as a response, since that's all that such a
> browser understands.  But there are many operations for which that
> protocol is overly restrictive or unsuitable: it imposes significant
> restrictions on the structure of the input and output, it is difficult to
> use in situations where the operation takes longer than a browser or web
> gateway timeout, etc.
> 
> We should not add options blindly or excessively.  I completely agree with
> that caution.  But I think saying that there should always be only one way
> to do something is too strict.
> 
> > If a decision is made to ditch x-www-form-urlencoded POSTs in favour of
> > JSON as the required API for TAP et al., I see no benefit, and some
> > harm, in standardising a set of additional optional APIs that
> > communicate the same thing in different ways to sit alongside that.
> 
> There is a broader reason for why I think we should lay the groundwork for
> this now, which is that I think it will make change easier in the future.
> 
> The various folks working on the protocol changes are doing so for a wide
> variety of reasons, so I am very much not speaking for anyone other than
> myself here.  But one of the primary things that I'm trying to accomplish
> is to make it easier to change IVOA standards to follow prevalent patterns
> in web service design outside of astronomy, whatever those may be in the
> future.  And the reason why I care so much about that problem is resource
> constraints.
> 
> Programmers who also know astronomy are a very scarce resource.  I want
> to be as respectful of their time as possible and allow them to focus on
> astronomy-specific problems, since there are already way more of those
> than there are resources to work on.  To me, this implies two things:
> 
> 1. Try not to solve non-astronomy problems with astronomy resources.
>    There are orders of magnitude more people working on general web
>    service frameworks, clients, service platforms, logging
>    infrastructures, authentication mechanisms, etc. than there are
>    astronomical programmers.  Whenever possible, I want to build services
>    on top of those general frameworks, tools, and platforms and thus take
>    advantage of all of that existing energy and resources.
> 
> 2. Where problems do need to be solved with astronomy resources, still try
>    to limit the number of problems that require programmers with
>    astronomy-specific knowledge.  Those folks are much harder to find and
>    hire and most of them are already very busy.  If the problems that
>    require astronomy-specific knowledge can be separated from the more
>    generic problems, the more generic problems can be solved by
>    programmers who either don't need astronomy-specific knowledge or only
>    need to know a smaller set of things that we can easily teach them.
> 
> The implication for web services of those two points, I believe, is that
> IVOA standards should try to separate the portions that are
> astronomy-specific, such as standardization of semantics, data formats,
> and type systems for interoperability of astronomy data, from the portions
> that are "just" astronomy's version of generic web service problems.  That
> web service protocol layer should then be as much like "what everyone else
> does" as possible, allowing us to use all of those generic tools and
> frameworks that many other people are investing amounts of time and energy
> into that are far vaster than what we can muster.
> 
> The problem this then poses for both standardization and software
> maintenance is that "what everyone else does" is continuously changing.
> When many of the IVOA services were originally designed, that was XML and
> simple GET and POST.  Now, it's REST and JSON, there is much less energy
> and activity going into XML-based web frameworks, and it's already become
> more difficult to find programmers who have XML expertise because it's not
> as widely used a skill.  But REST and JSON will not be the last
> iteration of "what everyone else does"; there will be something after REST
> and JSON, and REST and JSON will become less-widely-used skills without
> actively maintained frameworks.  And then there will be something else
> after that.  And something else after that.
> 
> I believe the implication is that we should plan for ongoing change of the
> network encoding of IVOA web protocols, the part that web frameworks and
> standard clients and so forth help with.  Right now, detailed knowledge of
> the underlying network protocol is embedded into nearly every web service
> document.  Changing the network encoding therefore requires a revision of
> every service standard, and since many service standards have their own
> quirks and differences, also changing the client and server
> implementations of each standard separately.  If that's the cost we pay
> each time "what everyone else does" changes, we will drown.  We do not
> have the resources to do that.
> 
> Therefore, I think we need to try to separate the astronomy semantics of
> the protocol from the method du jour of encoding those semantics in a
> protocol, and allow archives and clients to transition *all* of their
> services from one network encoding to the next in as painless a way as
> possible.  This, in turn, requires teasing apart the encoding issues from
> the service definition in the standards so that the encoding layer can be
> generic across services.  Then, we can similarly separate the network
> encoding from the astronomy semantics in code.  When "what everyone does"
> starts to change, we can write a new standard for the new encoding without
> changing any of the service standards, update the glue from the (ideally
> unchanged) astronomy code to the network encoding by utilizing new
> publicly available frameworks and libraries that we didn't have to spend
> scarce astronomy resources on, and thus stay in the "boring center" of the
> broader surrounding ecosystem.  The goal is to be able to move off of the
> old encoding before the support ecosystem for that encoding fades and we
> have to start expending scarce astronomy resources to keep it running.
> 
> Quite a lot of that separation already exists in the IVOA standards today.
> We already have separate data format definitions, for example.  But the
> web service protocols have some distance to go, and to get there will
> require a considerable amount of work.  The type of work, though, is
> roughly the work that would be required anyway for any change to RESTful
> JSON, so we can hopefully tackle both problems at the same time and set
> ourselves up for an easier time of it in the future.
> 
> What I very much do not want to happen is for us to publish new versions
> of each protocol that embed JSON into the service protocols the way that
> XML is embedded now.  That would mean that the next network encoding
> change would require as much work again as this one would.
> 
> To avoid that, we need to test, as early as possible, as many of the
> mechanisms for future change as we can.  One of those will be allowing
> sites to run multiple different encodings of the same service and
> advertise which they're running, because that's one of the pieces needed
> for protocol transitions.  I would not encourage a separate
> x-www-form-urlencoded POST encoding if I didn't think it also solved a
> real problem that people have indicated they care about, but given that it
> does, it is also a valuable test mechanism for the separation of service
> semantics from network encoding, having multiple possible network
> encodings for a service, and being able to advertise this properly so that
> clients can do the right thing.
> 
> -- 
> Russ Allbery (eagle at eyrie.org)             <https://www.eyrie.org/~eagle/>
> 

--
Mark Taylor  Astronomical Programmer  Physics, Bristol University, UK
m.b.taylor at bristol.ac.uk          https://www.star.bristol.ac.uk/mbt/

