RESTful Web services and DAL

Doug Tody dtody at nrao.edu
Tue Mar 13 07:59:52 PDT 2007


Hi All -

On Mon, 5 Mar 2007, Paul Harrison wrote:

> On 04.03.2007, at 20:29, Matthew Graham wrote:
> > Exactly and we are going to have to reiterate and reemphasize a lot
> > to colleagues precisely what this discussion is about: REST instead
> > of SOAP for *web* services and not SIAP or similar services.
>
> But actually the problem lies precisely with S*AP services - these
> are the ones that really would benefit from a UWS/CEA model, they
> need the asynchrony.

I agree with this; for SIA V2 (where we plan to introduce support for
asynchrony) we should be looking at things like REST vs HTTP vs SOAP,
UWS, and VOSpace integration.  A major use-case for VOSpace and UWS is
integration with the DAL protocols so that we can support long-running
service operations, and more flexible data management and transport.

Having finally gotten through all this mail, here is my attempt to sort
out the concepts.

I suggest (based on e.g., the Wikipedia and W3C descriptions) that a
*Web service* is any service used by two computers to talk to each other
via Web protocols.  Such a service will define an API which clients use
to talk to the service.  (By this definition SIAP, WMS, etc. represent
a type of Web service.)  Individual Web services may or may not be
stateful, but a comprehensive Web service mechanism needs to be able to
support stateful services.

A *SOAP Web service* is a Web service which interfaces with clients
via SOAP protocols running over the Web.  The main characteristic of
SOAP is that it defines an XML-based mechanism for doing RPC calls.
The main advantage of SOAP comes when one needs to bind a complex API,
with multiple method calls, into multiple target languages; here the
automatic binding made possible by SOAP and WSDL can be very useful.
However, SOAP is complex and relatively inefficient, and there is more
to distributed computing than low level synchronous RPC calls.  CORBA is
similar, and also makes the mistake of over-emphasizing RPC.

Wikipedia defines a *RESTful Web service* as a Web service which "attempts
to emulate HTTP and similar protocols by constraining the interface to
a set of well-known, standard operations (e.g., GET, PUT, DELETE)".

The OpenGIS community introduces the concept of a "distributed computing
platform (DCP)", used to implement the interface to a service.  HTTP,
SOAP, CORBA, etc., are examples of such platforms.  For a well defined
service, it should be possible, in principle at least, to execute the
same service operations via any such DCP.

A key characteristic of RESTful services appears to be an attempt to
represent resources and state as "file" references, i.e., URLs in the case
of the Web.  In Unix file system terms, /proc is an analogous mechanism.

This would seem to work well for state, but a strict path-oriented
approach (as in http://blabla.edu/cutout/301/22/bombayduck) does not
work well for service operations.  Operations are hard to define without
some sort of parameter or argument mechanism.

Hence we see actual HTTP-based Web services using GET and POST to
implement service operations.  REST suggests that GET should only be
used for idempotent operations (the same operation can be performed
multiple times with the same results, and does not visibly change the
service state).  POST should be used otherwise.

Although an idempotent GET with parameters, used to invoke a service
operation which returns some result, may be mapped to many distinct URLs,
it is still RESTful in many ways.  In particular it is consistent with
the REST model of a fixed URL string which points to some "resource"
and returns a file or document.  The parameter syntax departs from the
simple file path seen in strict REST examples, but this does not appear
to be an important distinction so long as a fixed URL returns consistent
results.
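
As a rough illustration of the point about fixed URLs (a sketch only:
the endpoint and the parameter names POS and SIZE are made up for the
example and are not taken from any service specification), a GET with
parameters still yields one fixed URL string per logical request if
the client encodes the parameters in a canonical order:

```python
from urllib.parse import urlencode


def query_url(base, params):
    """Build a GET URL with parameters sorted by name, so that the same
    logical request always maps to the same URL string."""
    return base + "?" + urlencode(sorted(params.items()))


# Hypothetical endpoint and parameters, for illustration only.
BASE = "http://example.org/sia/query"

a = query_url(BASE, {"POS": "180.0,-30.0", "SIZE": "0.2"})
b = query_url(BASE, {"SIZE": "0.2", "POS": "180.0,-30.0"})

print(a)
print(a == b)  # the two requests differ only in parameter order
```

Sorting is just one possible convention; the point is only that a
parameter syntax need not prevent a URL from behaving like a fixed
reference to a resource.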

An SSAP queryData operation to a single service, for example, is just a
GET which returns a document.  Within a given period of time, the same
GET will return the same document; a number of such GETs may execute
simultaneously on the same service, and each will return the same result.
The same is true for an access reference, and in this case the URL is
generated by the service and is a fixed string, hence the URL-based
access reference is an even better example of a REST-compliant GET.

What we need to look at next is how to handle stateful services.
I haven't had time to look at Guy's example of this for the UWS pattern,
but will do so when I get a chance.  In terms of the SIA concept of the
stageData operation (which triggers the stateful/asynchronous part of the
service), this would presumably mean a POST.  Queries for state could be
done with GET using some session ID; it is not clear to me that this could
not be done using a parameter syntax and still have a RESTful interface.
What may not fit into the REST model well is asynchronous messaging.
Perhaps this is just a GET where we hold the HTTP connection open and
stream messages periodically?
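
To make the POST/GET split above a bit more concrete, here is a toy
in-memory model (a sketch under my own assumptions, not the UWS pattern
or any SIA interface: the class, the method names, the "/jobs?id=" URL
form and the PENDING/COMPLETED phases are all hypothetical).  The POST
creates new state each time it is called; the GET only reads state,
identifies the session by a parameter, and can be repeated freely:

```python
import itertools


class StagingService:
    """Toy model of a stateful service; no real network traffic."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._jobs = {}

    def post_stage_data(self, dataset):
        """POST: not idempotent, every call creates a new job."""
        job_id = str(next(self._ids))
        self._jobs[job_id] = {"dataset": dataset, "phase": "PENDING"}
        # The session ID is carried as a parameter rather than a path.
        return "/jobs?id=" + job_id

    def get_status(self, job_id):
        """GET: only reads state, so repeating it changes nothing."""
        return self._jobs[job_id]["phase"]

    def finish(self, job_id):
        """Stands in for the service completing the work on its own."""
        self._jobs[job_id]["phase"] = "COMPLETED"


svc = StagingService()
job_url = svc.post_stage_data("some-image")
job_id = job_url.split("id=")[1]

before = [svc.get_status(job_id), svc.get_status(job_id)]
svc.finish(job_id)
after = svc.get_status(job_id)
print(job_url, before, after)
```

Note that the status GET is repeatable without side effects, but its
result changes once the job completes, which is why "within a given
period of time" matters when we say a GET returns consistent results.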

	- Doug


