[p3t] Updated draft documents

Thu Sep 5 02:01:17 CEST 2024

I've pushed an updated version of all three draft documents to include
most of the additional work I had been talking about.

As with the initial drafts, these documents exist primarily to create a
concrete starting point for discussion about how new protocol documents
could be laid out.

As a secondary goal, I also sketched out a possible UWS API that follows a
hypothetical new API and data model a bit more cleanly (at least in my
opinion).  This takes advantage of the fact that any new protocol
necessarily requires client changes, so some other changes could be made
at the same time.  I also added some half-formed thoughts about various
protocol elements that we've talked about so that I have something written
down for future discussions.

Links to the documents and a summary of changes:

https://sqr-091.lsst.io/

    Added a generic UWS protocol that individual web services can use by
    reference.  Some possibly interesting points here: move waiting for a
    phase change to a separate endpoint to avoid the need for magic
    sentinel values in parameters, prefer PATCH and returning the whole
    object over modifying individual attributes with separate routes,
    allow individual results to actually be errors that are flagged as
    such, and use the data type model so that the parameters are a generic
    object defined by each web service.

    Restructured the initial list of types to be (hopefully) a bit more
    readable.

    Disallowed empty request bodies for some operations.  This is a
    somewhat awkward point due to the layering of protocols (only some
    protocols will need this for security or protocol reasons, but as a
    result we force it on all of them).  More about layering in the notes
    below.
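
As a rough illustration of two of the UWS points above (PATCH returning
the whole object, and waiting for a phase change as its own operation),
here is a minimal in-memory sketch in Python.  It is not taken from
SQR-091: the class names, field names, phase strings, and method
signatures are all assumptions made for the example, and a real service
would expose these operations through whatever network encoding it uses.

```python
import threading
from dataclasses import asdict, dataclass, field
from typing import Optional


@dataclass
class Job:
    """Hypothetical job record; the field names are illustrative only."""

    job_id: str
    phase: str = "PENDING"
    parameters: dict = field(default_factory=dict)
    destruction: Optional[str] = None


class JobStore:
    """In-memory stand-in for the service side of a UWS-like API."""

    # Attributes a client may change through the PATCH-style operation.
    PATCHABLE = {"phase", "destruction"}

    def __init__(self):
        self._jobs = {}
        self._cond = threading.Condition()

    def create(self, job_id, parameters):
        """Create a job and return the whole job object."""
        with self._cond:
            job = Job(job_id, parameters=dict(parameters))
            self._jobs[job_id] = job
            return asdict(job)

    def patch(self, job_id, changes):
        """Apply a partial update and return the whole job object."""
        unknown = set(changes) - self.PATCHABLE
        if unknown:
            raise ValueError(f"cannot modify: {sorted(unknown)}")
        with self._cond:
            job = self._jobs[job_id]
            for name, value in changes.items():
                setattr(job, name, value)
            # Wake up any callers blocked in wait().
            self._cond.notify_all()
            return asdict(job)

    def wait(self, job_id, last_phase, timeout):
        """Block until the phase differs from last_phase or timeout expires.

        Either way, return the whole job object, so the caller compares
        phases itself and no sentinel parameter value is needed.
        """
        with self._cond:
            self._cond.wait_for(
                lambda: self._jobs[job_id].phase != last_phase,
                timeout=timeout,
            )
            return asdict(self._jobs[job_id])
```

The wait operation takes the phase the client last saw and returns the
full job whether the phase changed or the timeout expired, which mirrors
what patch returns.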

https://sqr-092.lsst.io/

    Added a very basic specification for using content negotiation to
    determine the type of a data response.

    In writing this, an obvious gap occurred to me: UWS jobs.  The client
    may want to specify a different data format for the individual
    results, but HTTP content negotiation doesn't really provide a way to
    do that unless the client does content negotiation when retrieving the
    results from the completed job.  My intuition is that this is later
    than we may want this information when writing the service, so we may
    still need a method other than HTTP content negotiation for choosing
    result formats for UWS jobs.
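
For reference, the server side of HTTP content negotiation can be sketched
in a few lines of Python.  This is a simplified illustration, not the
procedure from SQR-092: it only looks at the q parameter of the Accept
header, ignores other media type parameters, and the function names and
media types are assumptions chosen for the example.

```python
def parse_accept(header):
    """Return a list of (media_range, q) pairs from an Accept header."""
    entries = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_range = fields[0].lower()
        if not media_range:
            continue
        q = 1.0
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        entries.append((media_range, q))
    return entries


def matches(media_range, media_type):
    """Check whether a media range such as text/* covers a media type."""
    if media_range == "*/*":
        return True
    range_type, _, range_subtype = media_range.partition("/")
    if range_subtype == "*":
        return range_type == media_type.partition("/")[0]
    return media_range == media_type


def specificity(media_range):
    """Rank exact types above type/* above */*."""
    if media_range == "*/*":
        return 0
    if media_range.endswith("/*"):
        return 1
    return 2


def choose_format(accept, supported):
    """Pick the supported type with the highest q, or None if none fit.

    A missing or empty Accept header is treated as accepting anything.
    Ties go to the type listed first in supported.
    """
    entries = parse_accept(accept) if accept else [("*/*", 1.0)]
    best, best_q = None, 0.0
    for media_type in supported:
        candidates = [
            (specificity(r), q) for r, q in entries if matches(r, media_type)
        ]
        if not candidates:
            continue
        # The most specific matching range decides the q value.
        _, q = max(candidates)
        if q > best_q:
            best, best_q = media_type, q
    return best
```

For the UWS gap described above, one possibility (purely an assumption
here) would be to carry an Accept-style string as a job parameter at
creation time and run it through the same selection logic before the job
starts.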

https://sqr-093.lsst.io/

    Also specified a UWS-based cutout protocol that can take multiple IDs
    and multiple stencils and return all of the desired results.  This
    exists mostly to be an example of how to reference the UWS
    specification.

In writing this all out, I think it's become obvious that there are some
tricky layering issues that we need to sort out and decide how to specify:

* How to name the operations that a service can perform in a way that lets
  a client know how to request that operation.  Relative URLs are a very
  HTTP-specific concept and are not a natural fit for a lot of possible
  network protocols (gRPC just has named APIs that are not partial URLs;
  something pub/sub-based like Kafka would not use this idea of paths at
  all and would probably have separate request message types per
  operation; etc.).  Right now, I have written down the HTTP paths at the
  service specification level, but they're really specific to the
  combination of the service specification and the network encoding, and
  other network encodings will not use them.

* Some protocol-specific stuff leaks up to the service specification or at
  least the instructions for how to write service specifications: the type
  of request (affects HTTP verbs, may be irrelevant to other protocols),
  whether GET is safe to use for queries, whether certain types of
  requests need to have non-empty bodies, etc.  This isn't necessarily a
  problem, but it can feel a bit awkward.

* Linkages between APIs are very entangled with the network protocol
  encoding.  For example, when retrieving a job list, it is natural for
  HTTP REST-based protocols to return a list of URLs to the API for
  retrieving each job in full.  But this would be a meaningless concept
  for a Kafka-based protocol, and a gRPC-based protocol should use job IDs
  instead.  Another, less-obvious example is the results from a UWS job:
  currently, this is a URL to retrieve the result, but that assumes you go
  out of whatever network protocol you're using and use HTTP to get one of
  the critical elements of the result, which may require a completely
  different authentication scheme.  (We have already had numerous problems
  with this in the existing UWS protocol.)
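
One way to picture the first and third issues is to keep operation names
and job IDs at the service specification level and confine paths and URLs
to an HTTP-specific binding.  The Python sketch below is only an
illustration of that separation, not something from the draft documents:
the operation names, paths, response shape, and function names are all
made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HttpBinding:
    """How one abstract operation maps onto HTTP (hypothetical)."""

    method: str
    path_template: str


# The operation names would live in the service specification.  Only this
# table is specific to the HTTP encoding; a gRPC or Kafka encoding would
# have its own table (or none).  Names and paths are invented here.
HTTP_BINDINGS = {
    "list-jobs": HttpBinding("GET", "/jobs"),
    "get-job": HttpBinding("GET", "/jobs/{job_id}"),
}


def job_list_neutral(job_ids):
    """Encoding-neutral job list: jobs are referenced by ID only."""
    return {"jobs": [{"id": job_id} for job_id in job_ids]}


def job_list_http(base_url, job_ids):
    """HTTP encoding of the job list: add a URL for retrieving each job."""
    binding = HTTP_BINDINGS["get-job"]
    base = base_url.rstrip("/")
    body = job_list_neutral(job_ids)
    for entry in body["jobs"]:
        path = binding.path_template.format(job_id=entry["id"])
        entry["url"] = base + path
    return body
```

With this split, the URL becomes a convenience added by one encoding
rather than the only way to identify a job.  It does not address the
harder case of result URLs that require a different authentication scheme.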

-- 
Russ Allbery (eagle at eyrie.org)             <https://www.eyrie.org/~eagle/>