[p3t] P3T mtg - Mon, Jun 24 @20 UTC

Sat Jun 15 04:21:50 CEST 2024

"Dubois-Felsmann, Gregory P." <gpdf at ipac.caltech.edu> writes:

> For those of us who aren't familiar with the specific Rubin SODA
> service, can you clarify whether your refactored service will include an
> async/UWS endpoint?  It's implicit in what you wrote, I think, but it
> wasn't stated directly.  (You did say you'd write a UWS spec.)

Oh, yes, sorry: our SODA service supports both sync and async, with sync
implemented on top of an async UWS job.

This is a bit in the weeds and probably only of interest to FastAPI Python
developers, but in the current design, which I think I now have working,
the service itself provides only:

* A Pydantic model for the job parameters with an alternate constructor
  that takes a list of UWS job parameters that match the semantics of the
  current XML protocol.

* A model (which may be the same or different) for communicating the
  parameters to a backend worker, and a method to do the conversion.

* FastAPI dependencies, one for POST and one for GET, that specify the
  input parameters to the job in the normal FastAPI way (for good OpenAPI
  spec generation) and convert them into lists of UWS parameters.

* A backend worker function that does the actual work of the service,
  however it chooses to do so.

* An implementation of /availability and /capabilities if needed.
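
To make the first three items concrete, here is a minimal sketch. It is not
the actual vo-cutouts code. Every name in it (UWSJobParameter, WorkerCutout,
CutoutParameters, from_job_parameters, to_worker_parameters,
query_to_job_parameters) is hypothetical. Standard-library dataclasses stand
in for Pydantic models so that the example runs without dependencies, and it
assumes only the SODA ID and CIRCLE parameters.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class UWSJobParameter:
    """One UWS job parameter: a string key and a string value."""

    parameter_id: str
    value: str


@dataclass(frozen=True)
class WorkerCutout:
    """What the backend worker receives for a single cutout (assumed shape)."""

    dataset_id: str
    ra: float
    dec: float
    radius: float


@dataclass(frozen=True)
class CutoutParameters:
    """Job parameters; a real service would make this a Pydantic model."""

    ids: tuple[str, ...]
    circles: tuple[tuple[float, float, float], ...]

    @classmethod
    def from_job_parameters(
        cls, params: list[UWSJobParameter]
    ) -> CutoutParameters:
        """Alternate constructor taking a flat list of UWS parameters."""
        ids: list[str] = []
        circles: list[tuple[float, float, float]] = []
        for param in params:
            key = param.parameter_id.lower()
            if key == "id":
                ids.append(param.value)
            elif key == "circle":
                parts = param.value.split()
                if len(parts) != 3:
                    raise ValueError(f"Invalid CIRCLE value: {param.value!r}")
                circles.append((float(parts[0]), float(parts[1]), float(parts[2])))
            else:
                raise ValueError(f"Unknown parameter: {param.parameter_id}")
        if not ids or not circles:
            raise ValueError("At least one ID and one CIRCLE are required")
        return cls(ids=tuple(ids), circles=tuple(circles))

    def to_worker_parameters(self) -> list[WorkerCutout]:
        """Convert to the model sent to the backend worker."""
        return [
            WorkerCutout(dataset_id=i, ra=ra, dec=dec, radius=radius)
            for i in self.ids
            for (ra, dec, radius) in self.circles
        ]


def query_to_job_parameters(
    ids: list[str], circles: list[str]
) -> list[UWSJobParameter]:
    """Body of a GET dependency: typed inputs in, UWS parameter list out.

    In FastAPI the arguments would be declared with Query() so that they
    show up in the generated OpenAPI spec.
    """
    result = [UWSJobParameter("id", i) for i in ids]
    result.extend(UWSJobParameter("circle", c) for c in circles)
    return result
```

The point of the split is that the generic UWS code only ever sees the flat
list of key/value parameters, while all knowledge of what ID or CIRCLE means
stays in the service's own model.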

The UWS library then provides everything else: route handlers for both
sync and async plus all of the UWS interface, a library to track UWS job
status, all of the glue to dispatch jobs via the queuing library that we
use (arq currently) and collect their results, error handling, input
validation, etc.  The library is very opinionated and therefore might not
be that useful outside of Rubin, since it does some rather strange things
to handle some Rubin-specific issues (such as allowing the backend worker
to run on an entirely different Docker image than all of the other service
components).
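
As an illustration of sync implemented on top of an async UWS job, the
following sketch keeps jobs in memory and runs the worker as an asyncio task.
It is only a model of the idea, not the library's implementation: the real
code dispatches through arq and stores job state persistently. Job,
JobService, run_sync, and cutout_worker are hypothetical names; the phase
strings are the standard UWS phases.

```python
from __future__ import annotations

import asyncio
from dataclasses import dataclass, field
from typing import Awaitable, Callable, Optional


@dataclass
class Job:
    """In-memory stand-in for a UWS job record."""

    job_id: str
    params: dict[str, str]
    phase: str = "PENDING"
    result: Optional[str] = None
    finished: asyncio.Event = field(default_factory=asyncio.Event)


class JobService:
    """Toy job tracker: create, start, and wait for jobs."""

    def __init__(
        self, worker: Callable[[dict[str, str]], Awaitable[str]]
    ) -> None:
        self._worker = worker
        self._jobs: dict[str, Job] = {}
        self._tasks: set[asyncio.Task[None]] = set()

    async def create(self, params: dict[str, str]) -> Job:
        job = Job(job_id=str(len(self._jobs) + 1), params=params)
        self._jobs[job.job_id] = job
        return job

    async def start(self, job_id: str) -> None:
        job = self._jobs[job_id]
        job.phase = "QUEUED"
        # Keep a reference so the task is not garbage-collected early.
        task = asyncio.create_task(self._run(job))
        self._tasks.add(task)
        task.add_done_callback(self._tasks.discard)

    async def _run(self, job: Job) -> None:
        job.phase = "EXECUTING"
        try:
            job.result = await self._worker(job.params)
            job.phase = "COMPLETED"
        except Exception as exc:
            job.result = str(exc)
            job.phase = "ERROR"
        finally:
            job.finished.set()

    async def run_sync(self, params: dict[str, str], timeout: float) -> Job:
        """Sync request: create an async job, start it, wait for the end."""
        job = await self.create(params)
        await self.start(job.job_id)
        await asyncio.wait_for(job.finished.wait(), timeout)
        return job


async def cutout_worker(params: dict[str, str]) -> str:
    """Placeholder for the real backend worker."""
    await asyncio.sleep(0.01)
    return f"cutout of {params['id']}"


async def main() -> Job:
    service = JobService(cutout_worker)
    return await service.run_sync({"id": "dataset-1"}, timeout=1.0)


if __name__ == "__main__":
    job = asyncio.run(main())
    print(job.phase, job.result)
```

Note that when the wait in run_sync times out in this sketch, the job itself
keeps running; only the sync request gives up on it.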

(Unfortunately the architecture that we're currently using has the very
serious limitation that it doesn't support stopping or timing out the
backend worker once it's started unless the backend worker is written as
async.  I wasn't expecting this problem, which comes from combining sync
Python code with arq.  I'm not sure if
there is a good Python queuing system that supports an async interface but
can run sync jobs with stop and timeout support.  I have some ideas for
how to solve this problem specifically at Rubin, but they're very
Rubin-centric.)
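
The underlying Python behavior can be shown with the standard library alone.
The sketch below does not use arq; it assumes, as is usual for sync code
called from an async framework, that the sync function runs in a thread pool.
The names blocking_job and run_with_timeout are hypothetical. The timeout
fires and the awaiting coroutine gives up, but the thread cannot be
interrupted and runs to completion anyway.

```python
from __future__ import annotations

import asyncio
import threading
import time


def blocking_job(done: threading.Event) -> None:
    """A sync worker: blocks for a while, then records that it finished."""
    time.sleep(0.3)
    done.set()


async def run_with_timeout(timeout: float) -> tuple[bool, bool]:
    """Run blocking_job in a thread and try to enforce a timeout on it."""
    done = threading.Event()
    loop = asyncio.get_running_loop()
    future = loop.run_in_executor(None, blocking_job, done)
    timed_out = False
    try:
        await asyncio.wait_for(future, timeout=timeout)
    except asyncio.TimeoutError:
        # The await was abandoned, but the thread itself is still running.
        timed_out = True
    # Wait longer than the job takes, then check whether it completed.
    await asyncio.sleep(0.5)
    return timed_out, done.is_set()


timed_out, finished_anyway = asyncio.run(run_with_timeout(0.05))
print(timed_out, finished_anyway)  # prints "True True"
```

A worker written as a coroutine does not have this problem, because
cancellation is delivered at its next await.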

Work in progress is at <https://github.com/lsst-sqre/vo-cutouts>.  The
work on the model API is not yet complete; that's the last major piece I
need to separate.  The UWS code has not yet been extracted into a
standalone library and can be found in the src/vocutouts/uws directory for
the time being.  Lifting it out is the first thing I will be working on
when I get back from vacation.  The UWS library also has
some protocol compliance bugs that I still need to fix.

-- 
Russ Allbery (eagle at eyrie.org)             <https://www.eyrie.org/~eagle/>