[p3t] Supporting protocol migrations in standards

Thu Apr 18 01:04:14 CEST 2024

As promised, here are some starting thoughts on a protocol design pattern
that separates transport from semantics in a way that hopefully makes it
easier to adjust wire protocols in the future. I think this will also go
farther down the path created by DAL and UWS towards providing a framework
that can be used as the basis for site-specific or preliminary protocols
that aren't ready for IVOA standardization.

I'm not wedded to the profile/protocol terminology used here (indeed, I
think it's fairly bad, and would be delighted for someone to come up with
something better). I just needed some terms to try to make the discussion
clearer.

Problem statement
=================

Preferred Internet protocols, encoding formats, and RPC mechanisms evolve
over time. Newer programming languages and frameworks often focus on
supporting only the most recent ones. Even if older approaches are
supported, that support may be awkward or unfamiliar to the typical
implementor and therefore prone to errors. New Internet protocols and
encodings may offer desirable features such as improved performance or
security.

IVOA protocol specifications currently partly entangle the Internet
protocol and encoding with the semantics of an operation. Revising the
encoding thus requires reviewing and revising the entire IVOA protocol.
IVOA protocols also offer only partial building blocks for defining ad
hoc, local, or experimental services that could otherwise take advantage
of generic IVOA clients.

By decoupling the Internet protocol and encoding representation (the
"profile") of a protocol from the specification of the semantics of the
protocol (the "protocol"), we could address both problems. The work
required to adopt a new Internet protocol or encoding is limited to
defining a new profile and the higher-level protocol document can be left
(in most cases) untouched. Non-standardized protocols can still use the
standardized profile, and thus benefit from generic clients and
frameworks. As a side benefit, standards written using this layered
approach will more clearly separate encoding from semantics.

Layering design
===============

The layering can be thought of as an API between the protocol and the
profile. That API consists of the following details, specified by the
protocol:

1. The input parameters and their types.
2. The result or results and their types. There are two types of results:
   ones that combine simple types (bool, int, float, string) with lists
   and dicts, and ones that return a "file" with a MIME type.
3. The standardized error codes for this protocol.

This is in addition to the normal things that a protocol would specify,
such as protocol semantics, registration, etc.
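
To make the protocol side of that API concrete, here is a minimal Python
sketch of how a protocol might declare one operation. Every name in it
(Operation, Parameter, the "cutout" operation, and its error codes) is
hypothetical and invented for illustration; none of it comes from an IVOA
standard.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum


class SimpleType(Enum):
    """The simple types a protocol may use for parameters and results."""
    BOOL = "bool"
    INT = "int"
    FLOAT = "float"
    STRING = "string"


@dataclass(frozen=True)
class Parameter:
    name: str
    kind: SimpleType
    required: bool = True


@dataclass(frozen=True)
class StructuredResult:
    """A result built from simple types (lists and dicts omitted here)."""
    fields: dict[str, SimpleType]


@dataclass(frozen=True)
class FileResult:
    """A result returned as a "file" with a MIME type."""
    mime_type: str


@dataclass(frozen=True)
class Operation:
    """What a protocol hands to a profile: inputs, result, error codes."""
    name: str
    parameters: tuple[Parameter, ...]
    result: StructuredResult | FileResult
    error_codes: frozenset[str] = frozenset()


# Hypothetical operation, not taken from any existing IVOA protocol.
CUTOUT = Operation(
    name="cutout",
    parameters=(
        Parameter("ra", SimpleType.FLOAT),
        Parameter("dec", SimpleType.FLOAT),
        Parameter("radius", SimpleType.FLOAT),
    ),
    result=FileResult("application/fits"),
    error_codes=frozenset({"UsageError", "NotFound"}),
)
```

Nothing in this declaration mentions HTTP, query strings, XML, or JSON;
those choices would all be left to the profile.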

A profile defines the following:

1. How to encode simple input parameters (bool, int, float, string).
2. How to encode complex input parameters (list, dict), if that's even
   supported in this profile.
3. How to upload files as input parameters for protocols that need that.
4. How to send a request and receive a response (the transport layer,
   currently always HTTP).
5. How to encode output composed of simple types. We'll use this for
   things like availability or other types of replies that are mostly
   key/value pairs with simple structure. This is also the basis for the
   UWS protocol.
6. How to encode "file" output with its own data type, specified as a MIME
   type. This is how things such as FITS files and VOTables are handled.
7. How transport/infrastructure/low-level errors are reported and what
   clients should expect from them, such as HTTP 500 or 401 errors. At
   present all of our profiles are HTTP-based, so this would probably be
   the same for all of our initial profiles.
8. How higher-level errors are reported and encoded.

The idea is that the combination of any profile document and any protocol
document gives you a complete protocol that you can implement. Porting
IVOA protocols to a new Internet protocol or encoding format is ideally
only a matter of writing a new profile document, and all the protocol
documents can then be reused unmodified. In practice, this won't be
completely true -- there is always some wrinkle -- but it would be a lot
closer to true than we are now.
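
As a rough illustration of "any profile plus any protocol", the following
Python sketch encodes the same abstract input parameters under two
different profiles. The Profile interface, the class names, and the
example URL are all assumptions made up for this sketch, and it covers
only request encoding, not responses or errors.

```python
import json
from abc import ABC, abstractmethod
from urllib.parse import urlencode


def _as_text(value):
    """Render a simple value as text (bool handled before str())."""
    if isinstance(value, bool):
        return "true" if value else "false"
    return str(value)


class Profile(ABC):
    """Hypothetical profile interface: abstract input -> wire request."""

    @abstractmethod
    def encode_request(self, url, params):
        """Return (method, url, headers, body) for the given parameters."""


class FormPostProfile(Profile):
    """Form-encoded POST, roughly the existing REST POST style."""

    def encode_request(self, url, params):
        pairs = {name: _as_text(value) for name, value in params.items()}
        body = urlencode(pairs).encode("ascii")
        headers = {"Content-Type": "application/x-www-form-urlencoded"}
        return ("POST", url, headers, body)


class JsonPostProfile(Profile):
    """JSON request body, roughly the proposed REST JSON POST style."""

    def encode_request(self, url, params):
        body = json.dumps(params).encode("utf-8")
        headers = {"Content-Type": "application/json"}
        return ("POST", url, headers, body)


# The protocol supplies only these abstract parameters.
params = {"ra": 10.5, "dec": -3.25, "verbose": True}
url = "https://example.org/cutout"  # placeholder URL

form_request = FormPostProfile().encode_request(url, params)
json_request = JsonPostProfile().encode_request(url, params)
```

The calling code is identical for both profiles; only the bytes on the
wire and the Content-Type header differ.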

I've omitted a mention of authentication here, but for a fully generic
profile layer, a large part of the authentication would need to be
specified in the profile. Currently, IVOA authentication is very
HTTP-specific and I *don't* think we should tackle that any time soon (or
at all until someone really needs something different). Instead, we can
probably start by assuming all profiles are HTTP-based. But the full
technical vision here would move authentication into the profile to allow
for something like gRPC.

UWS
===

UWS is built on top of the profile and uses its simple type encoding for
its protocol responses. As it does now, it wraps "file" responses in
another layer of indirection as part of the job result, so it can support
multiple responses for a single job.
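
A minimal Python sketch of that indirection, assuming a JSON-based
profile: the job description itself contains only simple types, lists,
and dicts, and each "file" result is referenced by URL and MIME type
rather than embedded. The field names and URLs are hypothetical and are
not the UWS schema.

```python
from __future__ import annotations

import json
from dataclasses import asdict, dataclass, field


@dataclass
class JobResult:
    """Pointer to one "file" result; the file itself is fetched separately."""
    id: str
    url: str
    mime_type: str


@dataclass
class Job:
    """Job description built only from simple types, lists, and dicts."""
    job_id: str
    phase: str
    results: list[JobResult] = field(default_factory=list)


job = Job(
    job_id="123",
    phase="COMPLETED",
    results=[
        JobResult("image", "https://example.org/jobs/123/results/image",
                  "application/fits"),
        JobResult("table", "https://example.org/jobs/123/results/table",
                  "application/x-votable+xml"),
    ],
)

# The profile's simple-type encoding (JSON here) handles the whole job.
encoded = json.dumps(asdict(job))
```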

I think it makes the most sense to write UWS as a protocol built on top of
profiles, but I did want to note that there is a counter-argument that UWS
behavior should be defined in the profile to leave design space open in
the future for the profile to take advantage of native underlying support
for async operations. For example, gRPC has built-in async request
support, so a gRPC profile may want to use that for certain types of async
operations that would otherwise use UWS. That said, I think there's a bit
of a semantic mismatch: gRPC's async request model is intended for
requests with a much shorter lifetime than UWS jobs.

Starting profiles
=================

One can (although not trivially because they weren't meant for this) think
of existing IVOA protocols as being built on two profiles:

- REST GET, which uses HTTP and defines an encoding of input parameters to
  query parameters, an output encoding to XML, and an error encoding to
  either text/plain or VOTable errors (this is messy because existing IVOA
  protocols vary).

- REST POST, which uses HTTP and defines an encoding of input parameters
  to application/x-www-form-urlencoded, an output encoding to XML, and an
  error encoding to text/plain or VOTable errors.

The new JSON-based protocol we've discussed so far would define two
additional profiles:

- REST JSON GET, which uses HTTP and defines an encoding of input
  parameters to query parameters and an output and error encoding to JSON.

- REST JSON POST, which uses HTTP and defines an encoding of input
  parameters, output, and errors to JSON.

gRPC would look something like this:

- gRPC, which uses the gRPC RPC mechanism (which is HTTP/2 under the hood
  but with a complex protocol layer on top, so servers and clients use a
  gRPC library directly and the HTTP/2 part is hidden), and would define
  input, output, and error encodings as protobufs.

This wouldn't quite work right now, as discussed above, since gRPC doesn't
use HTTP authentication mechanisms.

I think it's an interesting question whether we want to define a way to
encode complex input parameters with REST JSON GET or simply say that you
have to use REST JSON POST if you have complex input. The latter would be
my preference just for simplicity.
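
A small Python sketch of that simpler option, in which a GET-style
profile encodes only simple types as query parameters and rejects
anything else. The function and exception names are hypothetical.

```python
from urllib.parse import urlencode

SIMPLE_TYPES = (bool, int, float, str)


class ComplexInputError(ValueError):
    """Raised when a parameter cannot be expressed as a query parameter."""


def encode_get_query(params):
    """Encode simple parameters as a query string; reject lists and dicts."""
    pairs = []
    for name, value in params.items():
        if not isinstance(value, SIMPLE_TYPES):
            raise ComplexInputError(
                f"parameter {name!r} is not a simple type; "
                "use the POST profile instead"
            )
        # bool must be checked explicitly so it is not rendered as "True".
        if isinstance(value, bool):
            text = "true" if value else "false"
        else:
            text = str(value)
        pairs.append((name, text))
    return urlencode(pairs)


query = encode_get_query({"id": "abc", "limit": 10, "verbose": False})
```

Passing a list or dict, such as {"ids": [1, 2]}, raises
ComplexInputError, which is the point at which a client would switch to
the POST profile.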

Another possible approach that may be less awkward is to have each profile
define a simple request and a full request encoding to capture that GET
vs. POST distinction, but I fear that may be too specific to simple HTTP
requests and get in the way down the road, which is why I like the idea of
separating the GET and POST profiles. However, the way I laid it out above
means that everything is technically defined under the POST versions of
the profiles, even when that would be silly (availability request, for
example).

VOSI
====

Like UWS, I think VOSI can be thought of as an intermediate layer that
sits between the profile and the protocols, and protocols just specify
their VOSI information in much the same way that they specify that they
can be used with UWS.

I would use the encoding layer of the profile for the information returned
by the availability endpoint, so it would be XML under the REST GET
profile and JSON under the REST JSON GET profile.
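
As a sketch of what that could look like, the Python below encodes one
availability result twice, once as XML and once as JSON. The XML is
deliberately simplified (no namespace, no schema) and is not the actual
VOSI availability document; the function names are hypothetical.

```python
import json
import xml.etree.ElementTree as ET


def _as_text(value):
    """Render a simple value as text (bool handled before str())."""
    if isinstance(value, bool):
        return "true" if value else "false"
    return str(value)


def encode_xml(root_name, data):
    """Encode a flat dict of simple values as a simplified XML document."""
    root = ET.Element(root_name)
    for key, value in data.items():
        child = ET.SubElement(root, key)
        child.text = _as_text(value)
    return ET.tostring(root, encoding="unicode")


def encode_json(data):
    """Encode the same dict as JSON."""
    return json.dumps(data)


# The protocol layer produces one result; each profile picks the encoding.
availability = {"available": True, "note": "service is accepting queries"}

as_xml = encode_xml("availability", availability)
as_json = encode_json(availability)
```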

Whether to do that with capabilities or define capabilities as always
returning an XML document is a more complex question, since that XML
response is tied into the VOResource definition and has fairly complex
semantics. On one hand, reinventing that in JSON is kind of annoying; on
the other hand, requiring that essentially every protocol implement this
one thing in XML, even if the rest of the protocol uses a profile based on
JSON or protobufs or Cap'n Proto or something else, is a bit annoying from
a tooling and mental complexity perspective. (I suspect we will have
VOTable for a very long time, but there are a lot of IVOA protocols and
potential local or ad hoc protocols that don't need to use VOTable.)

-- 
Russ Allbery (eagle at eyrie.org)             <https://www.eyrie.org/~eagle/>