TAP Implementation Issues: Final Comment: TAP and UWS, sync and async
Guy Rixon
gtr at ast.cam.ac.uk
Fri Nov 6 09:47:15 PST 2009
Tom,
your suggested implementation has massive, inherent problems: it loses
most of the benefits of an asynchronous interface!
By depending on a synchronous HTTP endpoint to run the query, your
implementation breaks any time that synchronous thing times out, and
it breaks if the network connection with the synchronous thing drops.
In those cases, your UWS has no way to regain control of the query,
even if the DB part is still running it. The UWS has to resubmit the
query.
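
To make the failure mode concrete, here is a minimal sketch in Python. It
is an illustration only, not anyone's actual implementation: SyncEndpoint
and run_proxy_job are hypothetical names, and the dropped connection is
simulated rather than real. The point it shows is that the proxy holds no
handle on the query it started, so its only recovery is to start a second
one.

```python
# Hypothetical sketch: these names are illustrative and come from neither
# the TAP nor the UWS document.


class SyncEndpoint:
    """Stand-in for a synchronous query endpoint backed by a database."""

    def __init__(self, fail_first_call):
        self.fail_first_call = fail_first_call
        self.queries_started = 0

    def run(self, params):
        # Each call starts a fresh query in the database.
        self.queries_started += 1
        if self.fail_first_call and self.queries_started == 1:
            # Simulated connection drop: the database may still be working,
            # but the caller holds no identifier for that query.
            raise TimeoutError("connection to sync endpoint lost")
        return "rows for " + params["QUERY"]


def run_proxy_job(endpoint, params, max_attempts=2):
    """Run a job by calling the sync endpoint, resubmitting on failure."""
    for attempt in range(1, max_attempts + 1):
        try:
            result = endpoint.run(params)
        except TimeoutError:
            # Nothing to reattach to: the only option is to start over.
            continue
        return {"phase": "COMPLETED", "result": result, "attempts": attempt}
    return {"phase": "ERROR", "result": None, "attempts": max_attempts}
```

With SyncEndpoint(fail_first_call=True), the job completes only on the
second attempt and the endpoint has started two queries, the first of
which is orphaned.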
The whole point of UWS is not to depend on a synchronous HTTP
connection for long-running jobs.
You can build synchronous TAP as a wrapper around a UWS (Pat D. has
done so) but it doesn't work the other way around.
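
The workable direction can be sketched roughly as follows, again in
Python. This is an assumption-laden illustration and not Pat's code:
JobService is a hypothetical in-memory stand-in for a UWS job list, and
sync_query is a hypothetical wrapper name. The phase names (PENDING,
QUEUED, EXECUTING, COMPLETED, ERROR) are the ones UWS uses.

```python
import threading
import time


class JobService:
    """Hypothetical in-memory stand-in for a UWS-style job list."""

    def __init__(self, work):
        self._work = work  # function that actually runs the query
        self._jobs = {}
        self._lock = threading.Lock()

    def create(self, params):
        with self._lock:
            job_id = str(len(self._jobs) + 1)
            self._jobs[job_id] = {
                "phase": "PENDING",
                "params": dict(params),
                "result": None,
            }
        return job_id

    def run(self, job_id):
        job = self._jobs[job_id]
        job["phase"] = "EXECUTING"

        def target():
            try:
                job["result"] = self._work(job["params"])
                job["phase"] = "COMPLETED"  # set after the result is stored
            except Exception as exc:
                job["result"] = str(exc)
                job["phase"] = "ERROR"

        threading.Thread(target=target, daemon=True).start()

    def phase(self, job_id):
        return self._jobs[job_id]["phase"]

    def result(self, job_id):
        return self._jobs[job_id]["result"]


def sync_query(service, params, poll_interval=0.01, timeout=5.0):
    """Synchronous call built on the job service: create, run, poll, fetch."""
    job_id = service.create(params)
    service.run(job_id)
    deadline = time.monotonic() + timeout
    while service.phase(job_id) in ("PENDING", "QUEUED", "EXECUTING"):
        if time.monotonic() > deadline:
            # The job keeps running and stays addressable by its id.
            raise TimeoutError("job " + job_id + " still running")
        time.sleep(poll_interval)
    if service.phase(job_id) != "COMPLETED":
        raise RuntimeError(service.result(job_id))
    return service.result(job_id)
```

If the synchronous caller gives up, the job is still there under its id
and can be polled later; that is the property the reverse arrangement
cannot offer.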
Guy
On 6 Nov 2009, at 17:33, Tom McGlynn wrote:
> Hi Guy,
>
> I'm not sure there is a big area of disagreement here. In terms of
> the text that users write and get back in the TAP asynchronous
> interface I'm not suggesting that a single byte needs to be
> changed. It's all in what tasks are doing the processing. I've put
> in some text that I hope clarifies what I was saying below in context.
>
> Tom
>
> Guy Rixon wrote:
>> Tom,
>> whenever you use UWS in a service definition, you have to say what
>> parameters it takes when setting up a job and the work done by
>> that job. That's the "application of the UWS pattern" to use the
>> terms from the UWS standard.
>
> I'm not sure I understand this. While there is talk of JDL and such
> in the UWS standard, I don't see any requirements that show it
> actually being used in any way. So while a given UWS implementation
> might restrict parameters being used, I don't see how that is done
> within the UWS protocol itself.
>
>> The UWS specification is supposed to be reusable between
>> applications; hence the U in the title. Therefore, it can't
>> specify the application-specific parameters.
>>
>
> Right. I'm not suggesting that. What I'm saying is that it's easy
> to write a UWS that can handle any parameters -- as indeed you
> suggest you have done already below.
>
>> It's possible to specify a UWS-conforming service for more than
>> one application. CEA does this. The modern interface of this kind
>> is called UWS-PA ("UWS for parameterized applications") and its
>> forerunner (which is SOAPy) is the Common Execution Connector. In
>> these kinds of services, the applications are pluggable.
> Sounds like the kind of thing I was looking at. I suggested in the
> original message that this has likely come up in earlier discussions.
>
>> AstroGrid DSA/Catalogue has had a CEC interface for years. It uses
>> a generic CEC implementation and passes the requests through to an
>> ADQL-query application plugged inside it.
>> The downside of generalizing a job-control service in this way is
>> complication and divergence from the synchronous case. TAP/UWS is
>> quite like synchronous TAP: you do an asynchronous query by
>> POSTing the same parameters you could use for a synchronous query.
>> If you try to use CEC or UWS-PA to start a TAP query then you have
>> a different interface. Because that interface is more general,
>> it's not as simple, either to implement in a service or to call
>> from a client.
> Here's where I think I'm getting a little lost. My suggestion is
> that I have a UWS service running above TAP that is simply a
> proxy for the TAP synchronous service. So by definition, they could
> not get disassociated. I'm getting the sense that for you, the UWS
> service needs to know about the parameters it's going to pass along
> to whatever it calls when it does a run. However, as far as I can
> see a UWS service can be entirely agnostic about parameters. It can
> simply take whatever parameters the user specifies and pass them
> along, leaving it to the underlying synchronous call to handle
> validity. In fact, for TAP that's pretty much the case since the
> names of the parameters used in TAP are not bounded.
>
>
>> I think that the current boundary between TAP and UWS is just where
>> we need it for the simplest implementations.
>
> I'm not so much concerned with boundaries as with the sense in which
> UWS is instantiated. Let me give a concrete example. I have a TAP
> service with a base URL of http://tap/, so http://tap/sync is the
> synchronous access point and http://tap/async is the async access
> point.
>
> What happens when someone references the latter URL? In my current
> implementation, a TAP servlet starts up, notes that I'm using an
> asynchronous request and calls the appropriate methods and classes
> that TAP has defined for this. If I had multiple asynchronous
> services these would likely be in a nice little UWS library. All is
> copacetic: UWS is a layer within TAP. It works fine but TAP and the
> UWS layer are pretty tightly coupled.
>
> What I think I'm going to do when I get back from the IVOA is a bit
> different. When I invoke http://tap/async I start a servlet whose
> only knowledge of TAP is that there is a synchronous service at
> http://tap/sync. It knows nothing of the internals of TAP and is
> completely independent of it. At some point the user does a
> http://tap/async/id/phase?phase=run and this UWS service takes the
> parameters that the user has
> specified for this job and invokes the http://tap/sync URL with
> those parameters. The results get saved somehow and whenever the
> user sends the appropriate URL the results are sent back. The only
> thing the UWS service ever knows about TAP is the base URL.
> Everything else is supplied by the user.
>
> Why do I like this better? Well it makes the TAP code simpler. It
> makes it easy for me to provide UWS functionality to all of my web
> services. E.g., I'd have a UWS interface to SkyView by simply
> changing the synchronous URL. And if UWS changes so that, e.g.,
> there's now a security resource, I can plug it in without any change
> whatsoever to my TAP servlet. For me it will be a big win.
>
> I'm not suggesting that this implementation be required. It would
> be fine to keep things coupled in one TAP implementation. However
> if the paradigm (and here I mean it in its literal sense of
> exemplar) is that a UWS service runs on top of a TAP service then the way
> to describe the relationship between TAP and UWS changes. In
> particular I think it then makes a lot more sense to simply say that
> a UWS service can be used to provide asynchronous access to a TAP
> service. The standard can require that if we decide async access
> is mandatory (as I think we have). So the TAP document becomes
> simpler -- and far less tightly coupled with the UWS document.
>
>> Cheers,
>> Guy
>> On 6 Nov 2009, at 15:33, Tom McGlynn wrote:
>>> I'm sure everyone will be happy to see the word 'Final' in the
>>> title...
>>>
>>> In the past couple of days I've gotten the UWS asynchronous
>>> implementation of TAP working (though doubtless still bug-ridden).
>>>
>>> When I read and implemented the TAP and UWS standards I had the
>>> sense of UWS as being a layer within TAP. In retrospect I think
>>> it would have been better (for my implementation at least), if I
>>> had distinguished them more clearly.
>>>
>>> Suppose we think of UWS not as an interface layer but as the
>>> definition of how to build asynchronous proxies. UWS becomes
>>> a service definition, not an access protocol. The proxy accepts
>>> and caches input parameters from the users, starts the
>>> underlying request when told to, caches the response and sends it
>>> back to the user when requested. [I haven't followed the
>>> discussions of UWS earlier in the Grid list, so my apologies if I'm
>>> just discovering what everyone already knows....]
>>>
>>> If I think of things this way, then I can implement UWS
>>> completely independently of the underlying application. Indeed
>>> the binding to the underlying application could be dynamic: I can
>>> provide a UWS layer over any number of distinct synchronous
>>> applications. I don't need to know anything about what
>>> parameters they use, just some root URL. The one piece of the
>>> specification that might cause problems is the desire to support
>>> multiple outputs as well as a single result. That's not at issue
>>> in TAP, but even this could easily be handled by returning a list
>>> of the outputs -- which is what UWS does now anyway.
>>>
>>> UWS is not described this way in its standards document: it is
>>> shown as a layer within some bigger application, not as a
>>> separable entity. Similarly TAP shows the asynchronous interface
>>> tightly coupled within the rest of the TAP.
>>>
>>> In this new view, the TAP document would say very little about
>>> the asynchronous interface. TAP itself would be synchronous, but
>>> if we want asynchronous access to be mandatory then the
>>> requirement is that a TAP implementation must specify a
>>> corresponding UWS service through which the TAP implementation
>>> can be invoked. We could still have a TAP service that is only
>>> available asynchronously: we allow that this TAP service is not
>>> directly callable: Only the associated UWS service can access
>>> it. I'm not trying to take sides here in the sync/async wars.
>>>
>>> Changes to the UWS document would be rather more subtle, noting
>>> that the interface can be implemented without reference to the
>>> underlying implementation, and perhaps explicitly supporting the
>>> kind of dynamic association with the underlying synchronous
>>> service mentioned above. Maybe provide a convenience resource to
>>> get the output in the single output case (rather than having to
>>> parse the output list).
>>>
>>> The advantage, had we taken this approach before, is that it
>>> largely decouples TAP and UWS. The TAP standard is shorter and
>>> simpler. The UWS standard is largely unchanged. We can change
>>> UWS in the future without worrying about any impact on TAP.
>>>
>>> This is probably a bridge too far in terms of the TAP standard.
>>> For UWS it's really a change in tone more than content -- hints
>>> to the user -- so perhaps it is doable were it to be thought a
>>> good idea. Regardless, I do anticipate revising my own
>>> implementation to use this approach after the Interop.
>>>
>>> Tom McGlynn