Proof-of-concept experimental SODA API

Sat May 27 02:30:51 CEST 2023

Hi everyone,

At the last interop, Frossie talked about protocol evolution and mentioned
that I had written a proof-of-concept alternate implementation of our
image cutout SODA service.  Since the sentiment in the room seemed fairly
positive towards thinking about ways that the VO protocols could evolve, I
wanted to expand on that point a bit for those who might be interested.

The background here is that after writing a standards-compliant image
cutout SODA service, I then went back and imagined I was inventing a
protocol to perform the same operations with roughly the same semantics
from scratch, and wrote down the protocol that would have been my first
draft.  You can browse that protocol here:

    https://sqr-063.lsst.io/poc-schema.html

This is reDoc-generated documentation from an automatically generated
OpenAPI specification, based on the source of that proof-of-concept.  You
can expand a lot of the protocol elements and responses to get more
detailed documentation.

The corresponding source code from which this was generated is at:

    https://github.com/lsst-sqre/ivoa-cutout-poc/

This comes with some substantial caveats:

* This is in no way a proposal for a future IVOA protocol.  I'm sure there
  is critical functionality missing, and many choices here that wouldn't
  be the ones we'd make after a full discussion.  It doesn't represent
  much investment of effort (just a few days of programming) or any deep
  analysis of the problem space.  I think the utility is as a conceptual
  reference point: it provides an idea of what an alternate protocol
  *could* look like, so that there's something concrete to react to.

* The VOSI-availability and VOSI-capabilities APIs, particularly the
  latter, are only placeholders.  I stuffed a few things in there so that
  something would exist but gave them essentially no thought.  Changes to
  those protocols would doubtless not look like this in practice.

* This doesn't provide a GET implementation, even for sync, for reasons
  discussed below in the gory details.

In this experiment, I tried to stick closely to the following design
principles:

* I wanted to make a completely generic UWS library possible.  In this
  implementation, essentially all the application has to provide is
  Pydantic models for the input parameters and an implementation of the
  backend worker.  Everything else is handled generically by the library,
  including all of the route handlers.  (This was also a goal of the UWS
  library we use for our VO-compliant implementation; it's not a new
  property of this experiment, although I was able to get closer in this
  experiment.)

* As in our VO-compliant service, this separates the web service envelope
  from the image manipulation code using a work queue.
  At Rubin, the image manipulation code runs on an entirely different base
  Docker image that contains our scientific stack, and the web frontend
  uses a much lighter minimal container with only the necessary Python
  dependencies installed.  We're big fans of this separation pattern; it
  allows service experts to maintain the service code and astronomy
  experts to maintain the astronomy code, with a very clear division of
  responsibilities between them.

* All data with internal structure is represented and passed around in
  structured form.  So, for example, rather than having a cutout stencil
  like CIRCLE 0.45 0.12 0.044 (parameters all completely made up, I am not
  an astronomer), it looks like:

      {
          "type": "circle",
          "center": {"ra": 0.45, "dec": 0.12},
          "radius": 0.044
      }

  I tried to carry this through everything else, so (for example) errors
  have separate error codes and messages in structured data.

* The API uses REST verbs, so modifications are done with PATCH to the job
  object, deleting a job is done with DELETE, etc.  (I still used POST for
  sync and async job creation because the input parameters don't match the
  created job object, which made POST feel more appropriate.)
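
As a rough illustration of the structured-data principle above, here is a
minimal sketch of parsing the circle stencil into typed objects.  It uses
only the Python standard library as a stand-in for the Pydantic models the
proof-of-concept actually relies on, and the names Point, CircleStencil,
and parse_stencil are hypothetical, not taken from the source code.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Point:
    """Sky position (hypothetical name, for illustration only)."""

    ra: float
    dec: float


@dataclass(frozen=True)
class CircleStencil:
    """Typed form of the structured circle stencil (hypothetical name)."""

    center: Point
    radius: float


def parse_stencil(data: dict[str, Any]) -> CircleStencil:
    """Parse a structured stencil, rejecting unknown stencil types."""
    if data.get("type") != "circle":
        raise ValueError(f"unsupported stencil type: {data.get('type')!r}")
    center = data["center"]
    return CircleStencil(
        center=Point(ra=float(center["ra"]), dec=float(center["dec"])),
        radius=float(data["radius"]),
    )


stencil = parse_stencil(
    {"type": "circle", "center": {"ra": 0.45, "dec": 0.12}, "radius": 0.044}
)
```

Because each field arrives already separated and named, no
service-specific string parsing is needed before validation.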

Hopefully this will be interesting to look at as a discussion starting
point.

Implementation details for specific language frameworks are somewhat less
interesting since protocols shouldn't be designed to one specific service
framework, but for anyone else on the list who is also using FastAPI, or
anyone who is interested in the impedance mismatches between standards and
implementations, below are some additional gory details.

The generic UWS library relies on Pydantic GenericModels, but by default
FastAPI doesn't do the right thing with input verification when a generic
model is in use.  It's most of the way there, but the last step of
embedding the parameter type into the model doesn't properly carry through
into the verification and OpenAPI schema.  Thankfully, this is fixable
with a bit of a hack: the call to set up the UWS handlers takes the types
of the sync and async job parameters and replaces the generic types in the
type annotation structure with the concrete types before FastAPI sees
them.  This works quite well.  I wrote up this technique here:

https://github.com/tiangolo/fastapi/issues/5874#issuecomment-1402225571
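
The general shape of that hack can be sketched with nothing but the
standard typing module.  This is an illustration of the idea, not the code
from the proof-of-concept or from the linked comment: JobRequest,
CutoutParameters, and make_create_handler are hypothetical names, and a
plain dataclass stands in for a Pydantic generic model.  The point is only
that the handler's annotation is rewritten to a concrete type before any
framework inspects it.

```python
from dataclasses import dataclass
from typing import Generic, TypeVar, get_args, get_type_hints

T = TypeVar("T")


@dataclass
class JobRequest(Generic[T]):
    """Generic request envelope (hypothetical stand-in for a generic model)."""

    parameters: T


@dataclass
class CutoutParameters:
    """Application-specific parameters (hypothetical)."""

    dataset_id: str


def make_create_handler(params_type: type):
    """Build a handler whose annotation names the concrete parameter type."""

    def create_job(request: JobRequest[T]) -> str:
        return f"queued {type(request.parameters).__name__}"

    # Swap the generic annotation for the concrete one.  A framework that
    # inspects the handler's annotations afterwards sees only
    # JobRequest[CutoutParameters], never the unbound type variable.
    create_job.__annotations__["request"] = JobRequest[params_type]
    return create_job


handler = make_create_handler(CutoutParameters)
concrete = get_args(get_type_hints(handler)["request"])[0]
```

Here concrete is the CutoutParameters class, which is what a framework
would use to build validation and schema for the route.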

The other challenge for a generic library is GET.  GET parameters in
FastAPI are configured using arguments to the route handler, and I
couldn't see a good way of translating the parameter type into a GET-based
route.  For this experiment, I avoided the problem and said that
everything has to use POST or other verbs that take a body.  To support
GET, I think one would have to partly break the minimal-code model for the
application and make the application define the GET route itself, calling
underlying UWS library functions.  Making the UWS library build the GET
route seems to be more than FastAPI plus Python can manage.

This isn't a huge drawback, though; our existing VO-compliant cutout
service uses this approach.  It just means you have to write more,
somewhat duplicative code when defining a service that accepts GET.  And
to some extent this matches the protocol problem: POST bodies and GET
parameters are two different encodings of the same data, so to support
both you have to define both encodings, rather than having the data have
only one encoding.  (Unless you want to use GET parameters that take encoded
data, such as JSON blobs, but I dislike APIs that do that; the escaping is
ugly and complicated, and I think you lose most of the benefits of
supporting GET in the first place.)
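
To make the "two encodings" point concrete, here is a small sketch using
only the standard library.  The flat query parameter name "circle" and the
function circle_from_query are assumptions made up for this illustration;
they are not part of the proof-of-concept, which does not support GET at
all.  The application has to write the GET decoder by hand, while the POST
body already has the structured shape.

```python
import json
from urllib.parse import parse_qs


def circle_from_query(query: str) -> dict:
    """Decode a hypothetical flat GET encoding into the structured form."""
    params = parse_qs(query)
    # The three numbers are packed into one space-separated string, so the
    # application must know their order and meaning.
    ra, dec, radius = (float(v) for v in params["circle"][0].split())
    return {
        "type": "circle",
        "center": {"ra": ra, "dec": dec},
        "radius": radius,
    }


# GET encoding: one flat parameter, decoded by application-specific code.
from_get = circle_from_query("circle=0.45+0.12+0.044")

# POST encoding: the JSON body is already in the structured form.
from_post = json.loads(
    '{"type": "circle", "center": {"ra": 0.45, "dec": 0.12}, "radius": 0.044}'
)
```

Both paths end at the same structure, but only the POST path gets there
without code that is specific to this one parameter.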

The database that the UWS implementation uses to store job state also
has a completely generic schema.  It does this by encoding the job
parameters in JSON and storing them in a column of type JSONB in the
database.  (We're using PostgreSQL for our database layer.  This technique
would probably need modifications with other database types.)
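
A minimal sketch of the idea, with loud caveats: this uses SQLite from the
Python standard library with a TEXT column as a stand-in for PostgreSQL and
JSONB, and the table layout and the create_job and get_parameters helpers
are hypothetical, not the schema of the proof-of-concept.  It shows only
that the job table needs no columns specific to any one service.

```python
import json
import sqlite3

# In-memory SQLite database standing in for PostgreSQL.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE job ("
    " id INTEGER PRIMARY KEY,"
    " phase TEXT NOT NULL,"
    " parameters TEXT NOT NULL"  # would be a JSONB column in PostgreSQL
    ")"
)


def create_job(parameters: dict) -> int:
    """Store a job with its parameters serialized as JSON."""
    cursor = conn.execute(
        "INSERT INTO job (phase, parameters) VALUES (?, ?)",
        ("PENDING", json.dumps(parameters)),
    )
    conn.commit()
    return cursor.lastrowid


def get_parameters(job_id: int) -> dict:
    """Load and decode the parameters of a stored job."""
    row = conn.execute(
        "SELECT parameters FROM job WHERE id = ?", (job_id,)
    ).fetchone()
    return json.loads(row[0])


job_id = create_job(
    {"type": "circle", "center": {"ra": 0.45, "dec": 0.12}, "radius": 0.044}
)
stored = get_parameters(job_id)
```

The database layer never needs to know what the parameters mean; the
application's own model is responsible for validating them on the way in
and out.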

The job results are written to an object store by the worker
implementation.  This implementation is tied to Google Cloud Storage
because it uses Google's libraries to make signed URLs to return to the
client, but it could be made more generic or implemented for a different
object store, or one could add routes to return result objects directly.

For the purposes of this proof-of-concept, I didn't implement the database
housekeeping code to, for example, age out old completed jobs, but it
should also be possible to make it generic.

-- 
Russ Allbery (eagle at eyrie.org)             <https://www.eyrie.org/~eagle/>