UWS as a REST protocol
Norman Gray
norman at astro.gla.ac.uk
Tue Mar 6 14:27:53 PST 2007
Greetings,
On 2007 Feb 26 , at 15.43, Guy Rixon wrote:
> there has been a lot of debate recently in the industry about SOAP vs.
> REST as the basis of web services. I've written an IVOA Note on how UWS
> might be presented as a REST service: please see
I'm catching up on this discussion a little late, so please forgive
the compendium response. I've tried, with I think varying success,
not to overlap too much with the replies already posted.
I've been pushing REST for a little while, and have written a couple
of RESTful applications[1], so what's below is a mixture of reading
and personal experience.
Matthew said:
> The main point of REST is having resources which can be accessed
> through a standard CRUD interface that maps onto the HTTP methods:
>
> CREATE = POST
> RETRIEVE = GET
> UPDATE = PUT
> DELETE = DELETE
I would tend to think of PUT as create/replace (RFC 2616: `the PUT
method requests that the enclosed entity be stored under the supplied
Request-URI'), and POST as update, since it to some extent mutates or
(RFC 2616 again) `creates a new subordinate' of the Request-URI.
That said, I hadn't thought of this mapping between REST and CRUD,
but that's really useful!
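To make the PUT/POST distinction concrete, here is a toy in-memory
sketch in Python. The Store class and the /jobs URIs are invented for
illustration; the status codes follow my reading of RFC 2616 (201 when
something is created, 200 when an existing entity is replaced).

```python
import itertools


class Store:
    """Toy resource store; illustrative only, not a real HTTP server."""

    def __init__(self):
        self.resources = {}
        self._ids = itertools.count(1)

    def put(self, uri, entity):
        # PUT: store the entity under the supplied Request-URI.
        # The client chooses the name; repeating the call is harmless.
        status = 200 if uri in self.resources else 201
        self.resources[uri] = entity
        return status, uri

    def post(self, uri, entity):
        # POST: create a new subordinate of the Request-URI.
        # The server chooses the name; repeating the call creates another.
        child = f"{uri.rstrip('/')}/{next(self._ids)}"
        self.resources[child] = entity
        return 201, child


store = Store()
first = store.put("/jobs/alpha", "v1")      # creates /jobs/alpha
second = store.put("/jobs/alpha", "v2")     # replaces it in place
posted_a = store.post("/jobs", "payload")   # server mints a new name
posted_b = store.post("/jobs", "payload")   # and another one
```

The point of the sketch is that two identical PUTs leave one resource,
while two identical POSTs leave two.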
Matthew again, responding to John:
> You're right about the HTTP Accept header but this is difficult to
> set from a browser.
While it's true that you can't control Accept headers from an (HTML)
browser, would you want to? What a browser can handle is text/html
or at least text/* -- any other representations of a resource would
naturally be handled by different types of client.
Indeed, perhaps the canonical REST client is 'curl', not Mozilla.
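For a non-browser client, asking for a particular representation is a
one-liner. A minimal sketch using Python's standard library; the URI
and the media type are placeholders I've made up, and nothing is
actually sent over the network here.

```python
from urllib.request import Request


def request_for(uri, media_type):
    """Build (but do not send) a GET that asks for one representation."""
    return Request(uri, headers={"Accept": media_type}, method="GET")


# Hypothetical resource name and media type, for illustration only.
req = request_for("http://example.org/weather/glasgow", "application/rdf+xml")
```

With curl the equivalent would be roughly
curl -H 'Accept: application/rdf+xml' <uri>.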
Guy (but overlapping points in Dave Morris's previous message):
> However, if the identifier for a VOSpace node is
> http://mumble.mumble/... then it's tied to some specific site-name. The
> vos:// notation lets us move the node transparently to a different host
> (or, at least it lets us move the _space_; maybe not individual nodes);
True, but the mumble.mumble host needn't have much to do with where
the bitbucket of the resource is. HTTP defines a range of 3xx
responses, and if the mumble.mumble server's job is to return one of
those responses, then it's not architecturally hugely different from
a registry, except with an easy-to-use lookup protocol. Indeed, when
an HTTP server returns a redirection like this, I imagine it'd be
defensible for the Location to use a protocol other than 'http',
perhaps one such as 'gridftp' (this, by the way, seems to at least
look towards addressing the VOSpace requirements 0 and 1 in a message
from Paul late in the thread).
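As a sketch of what I mean by the server's job being to return a 3xx
response: the lookup table, the paths and the storage host below are
all hypothetical, and 307 is just one plausible choice of redirection
code.

```python
# Hypothetical table mapping stable names to wherever the bits live today.
LOCATIONS = {
    "/vospace/node17": "gridftp://storage.example.org/data/node17",
}


def resolve(path):
    """Return (status, headers) for a GET on a stable name."""
    target = LOCATIONS.get(path)
    if target is None:
        return 404, {}
    # 307 (Temporary Redirect): the name stays put, the location may move.
    return 307, {"Location": target}


moved = resolve("/vospace/node17")
missing = resolve("/vospace/no-such-node")
```

Moving the data then means editing the table, not changing the name
that clients hold.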
> we can't do this with http:// URIs except by DNS trickery
One person's `trickery' is another person's `full use of the protocol
stack'. I'd imagine that with a combination of DNS round-robining
and 300 (Multiple Choices) you could create a very flexible
distributed system without making anyone too queasy. But the proof
would be in the implementation.
To build on Dave's example of an asynchronous third-party transfer,
consider the following sequence of actions:
1. POST to http://orchestrator.org/transfer some suitable payload
specifying the transfer (with a suitable Content-Type); the response
is 303 (See Other) with Location: http://orchestrator.org/transfer/<id>/status.
The payload you could think of as a message, but you could also view
it as a description of the transfer you want, or the result you want
achieved (nouns rather than verbs, again).
2. GET http://orchestrator.org/transfer/<id>/status; response is 503
(Service Unavailable) with a Retry-After: containing the server's
estimate of when it's worth asking again
3. Wait and retry, until the status is 200, 204, or perhaps 502 (Bad
Gateway = `it's not my fault').
4. If you get bored waiting, then DELETE
http://orchestrator.org/transfer/<id>/status.
...or something like that. You can probably do most or all of this
transaction using 'curl' (I think, but wouldn't swear, that it can
grok Retry-After headers).
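The polling part of that sequence (steps 2 and 3) might look something
like the following in Python. This is a sketch under assumptions: the
fetch callable is a hypothetical transport returning a status code and
a header dictionary, the transfer id 42 is made up, and Retry-After is
assumed to be given in seconds (the header may also carry an HTTP-date,
which this does not handle).

```python
import time


def wait_for_transfer(fetch, status_uri, sleep=time.sleep, max_polls=20):
    """Poll status_uri until it stops answering 503; return the final status.

    fetch(uri) is a hypothetical transport returning (status_code, headers).
    """
    for _ in range(max_polls):
        status, headers = fetch(status_uri)
        if status != 503:
            return status  # e.g. 200, 204, or 502
        # Honour the server's own estimate of when to ask again.
        delay = int(headers.get("Retry-After", "1"))
        sleep(delay)
    raise TimeoutError(f"gave up on {status_uri} after {max_polls} polls")


# Scripted stand-in for the orchestrator: two 503s, then success.
script = iter([
    (503, {"Retry-After": "5"}),
    (503, {"Retry-After": "2"}),
    (200, {}),
])
waits = []  # records the requested delays instead of really sleeping
final = wait_for_transfer(
    lambda uri: next(script),
    "http://orchestrator.org/transfer/42/status",
    sleep=waits.append,
)
```

Passing the transport and the sleep function in as arguments is only
there so the loop can be exercised without a real server.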
Roy:
> What is the theological basis of RESTfulness?
The claim is that it's a better impedance match to the actual web,
and that it's worked so far (it just didn't have a fancy name).
Compare HTTP and CORBA: which protocol's endpoint identifiers do you
see written on the sides of busses?
> Why is it better than VERBishness?
I don't think there's a one-line answer to that. I think, however,
that most answers would be elaborations of `the web is a messy
heterogeneous place, and needs a small(ish) universal protocol, which
HTTP hits the sweet-spot of'. See the previous answer.
> Why is it bad for a URL to have side-effects?
It's not bad for `a URL' to have side-effects -- indeed the PUT and
DELETE actions on the URL necessarily have side-effects, and POST may
have. The point is that if you define GETting a URL to have no side
effects (or at least none that the requestor is accountable for),
then you can reason about the properties of proxies, caches,
security, and so on, and so make the web work. Optimisations are a
function of the strength of the assertions you can make about a system.
> Why can't VERBish things be cached just as well?
It's not just about caching. The REST thing is not just motiveless
`URLs should be nouns not verbs'. My summary is:
* URIs name things (possibly abstract things like the weather in
Glasgow, or relatively concrete things like the status of a
transfer). As such, they can be passed around straightforwardly on
busses (diesel or memory) with less chance of everyone getting
confused. You can't do this so easily with a verb/message: whom can
I send this message to?, when?, can I replay it?, are _you_ allowed
to?, can I store/duplicate/discard it? A name's just a name.
* There are a _few_ CRUD things you can do with a name (GET/PUT/...),
and they are orthogonal to the representations of the thing
(Content-Type, Accept), and orthogonal to the name itself. If you've chosen
your set of names skillfully (not trivial, of course), then you can
probably use those few methods to do all that you want with the
names. Since HTTP will change on a vastly slower timescale than your
SOAP protocol spec, that's a vast amount of confusion and brittleness
you've taken out of the world.
* HTTP is a damn clever protocol, when you look at it closely. And
RFC 2616 is possibly larger than you think, with 'curl' implementing
quite a lot of it.
Paul:
> In the pure RESTful alternative what if the underlying data change
> (e.g. improved calibration) then really by the RESTful theology the
> response should still return the original data not the improved data?
No, the name is the same, but its state, and thus the representation
of that state, is now different (better calibrated), and not
necessarily as the result of a PUT. You can have a URL which names
`the current weather in Glasgow' -- that's going to change on a
minute-by-minute basis. It's the Expires and If-Modified-Since
class of headers that handles the fact that while names might well be
long-lived, representations come and go, and might need to be
re-retrieved on any timescale between minutes and years.
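A rough server-side sketch of how If-Modified-Since plays out when the
data behind a name are recalibrated. The function name, the dates and
the body are invented for the example; only the header names and the
200/304 status codes come from HTTP itself.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def conditional_get(last_modified, body, request_headers):
    """Return (status, headers, body) for a GET, honouring If-Modified-Since."""
    headers = {"Last-Modified": format_datetime(last_modified, usegmt=True)}
    since = request_headers.get("If-Modified-Since")
    if since is not None and last_modified <= parsedate_to_datetime(since):
        # The client's copy is still current: same name, nothing new to send.
        return 304, headers, b""
    # The state behind the name has changed (or no copy is held): send it.
    return 200, headers, body


# Hypothetical recalibration time for the data behind one URL.
recalibrated = datetime(2007, 3, 1, 12, 0, 0, tzinfo=timezone.utc)

stale = conditional_get(
    recalibrated, b"better calibrated data",
    {"If-Modified-Since": "Mon, 26 Feb 2007 15:43:00 GMT"},
)
fresh = conditional_get(
    recalibrated, b"better calibrated data",
    {"If-Modified-Since": "Tue, 06 Mar 2007 14:27:53 GMT"},
)
```

A client holding a copy from before the recalibration gets the new
representation; one holding a later copy is told nothing has changed.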
> Maybe what we want is a mixture.....REST for the stateful, job
> management part of services, and SOAP (or REST HTTP GET with URL
> parameters) for the initial job creation - most astronomers do
> simply want to think of the action that they want to perform in a
> procedural fashion.
...and if the named Thing is a `processing element', then PUTting a
job onto it (creating it) and GETting the results back sound pretty
procedural.
In my experience, the hard thing about designing a RESTful service
is deciding just what set of Things you're going to name
and therefore expose as the conceptual state of your service.
This isn't massively different from the work you do deciding on
classes and methods in an O-O design, but because (in effect) the
method names are chosen for you, and because the set of names is (in
effect) your API, it becomes a weightier design decision. The upside
is that this forces you to ask yourself some very useful questions
about what it is you're designing, and pushes you towards a design
that's simple and powerful.
One of the stronger arguments in Fielding's thesis (he who named the
`REST' notion) is the discussion of Unix pipes and character
streams. Representing everything as a character stream, and thus
having successive tools parsing and re-serialising, seems clumsy when
you first look at it, but the discipline of fitting in with that
pattern pushes you towards designing Unix tools in a way which
encourages orthogonal, simple, and robust components, which is
powerful because it matches the ecology/environment you find in
Unix. That is, pipes and streams go with the flow in Unix, in the
way that (say) more heavyweight file objects chimed with VMS, and (it
is plausibly claimed) the way that HTTP simply chimes with the web.
All the best,
Norman
[1] Currently, temporarily, at <http://thor.roe.ac.uk/quaestor/> and
<http://thor.roe.ac.uk/utype-resolver/>. The first of the two
implements a generic reasoner, which allows you to upload RDF, and
submit elaborate queries against it, retrieving the results in
multiple formats. I wouldn't swear to it being canonical REST style,
but it's surely non-trivial, and it seems to work OK.
--
----------------------------------------------------------------------------
Norman Gray / http://nxg.me.uk
eurovotech.org / University of Leicester, UK