TAP Implementation Issues: Final Comment: TAP and UWS, sync and async
Paul Harrison
paul.harrison at manchester.ac.uk
Fri Nov 6 10:24:07 PST 2009
Hi,
I think that Tom's implementation idea is legal wrt UWS and not
necessarily "bad" as long as the http://tap/async/id/phase?phase=run
(btw should be a POST) step returns "immediately" - in fact I had been
thinking about offering a similar generic service for people who said
that async TAP was too tricky for them (for Guy - this is what the CEA
"HTTP" style server does). Granted there might be some (internal)
inefficiency with data being transferred from the sync to the async
storage area, and that internally the UWS part might have to resubmit
the sync step if it timed out, but the behaviour to the end client
should appear to be standard UWS still.
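A minimal sketch of the behaviour described above, in Python, with the HTTP layer left out. The class name AsyncProxy, the sync_call callable (standing in for a request to http://tap/sync) and the max_attempts setting are all assumptions made for illustration; none of them come from the UWS or TAP documents. It tries to show that the run step only starts a background worker and returns, while any resubmission after a timeout stays hidden from the client.

```python
import threading
import uuid


class AsyncProxy:
    """Hypothetical UWS-style proxy in front of a synchronous callable."""

    def __init__(self, sync_call, max_attempts=2):
        self.sync_call = sync_call        # assumed stand-in for the sync endpoint
        self.max_attempts = max_attempts  # assumed retry budget, not from any spec
        self.jobs = {}
        self.lock = threading.Lock()

    def create_job(self, params):
        """Cache the user's parameters without interpreting them."""
        job_id = uuid.uuid4().hex
        with self.lock:
            self.jobs[job_id] = {"params": dict(params), "phase": "PENDING",
                                 "result": None, "thread": None}
        return job_id

    def run(self, job_id):
        """Equivalent of the phase=run step: start the work, return at once."""
        worker = threading.Thread(target=self._execute, args=(job_id,), daemon=True)
        with self.lock:
            job = self.jobs[job_id]
            job["phase"] = "QUEUED"
            job["thread"] = worker
        worker.start()

    def phase(self, job_id):
        with self.lock:
            return self.jobs[job_id]["phase"]

    def _execute(self, job_id):
        with self.lock:
            job = self.jobs[job_id]
            job["phase"] = "EXECUTING"
            params = job["params"]
        for _ in range(self.max_attempts):
            try:
                result = self.sync_call(params)
            except TimeoutError:
                continue  # the sync step timed out: resubmit internally
            with self.lock:
                job["result"] = result
                job["phase"] = "COMPLETED"
            return
        with self.lock:
            job["phase"] = "ERROR"
```

A client of such a proxy would only ever see the job move through the usual phases; whether the sync call was made once or twice is not visible to it.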
I think that the "better" (i.e. more efficient) implementation is to
write a fundamentally async service and layer the sync service on this
as detailed in section 5 of the UWS document, but as long as the UWS
interface is adhered to then people are free to implement as they
want. However, I do not think that this requires a change to the UWS
document to say that UWS is a separate "service" as Tom suggests,
because that would favour the inefficient implementation over the
efficient one.
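For comparison, a rough sketch of the opposite layering, where the service is fundamentally async and the sync call is a thin wrapper that creates a job, runs it and waits. TinyAsyncService and sync_query are hypothetical names, the in-memory job store stands in for a real UWS job list, and the polling interval and timeout are arbitrary assumptions.

```python
import threading
import time


class TinyAsyncService:
    """Hypothetical in-memory stand-in for a fundamentally async service."""

    def __init__(self, work):
        self.work = work  # assumed function that performs the actual query
        self.jobs = {}

    def create(self, params):
        job_id = str(len(self.jobs) + 1)
        self.jobs[job_id] = {"params": params, "phase": "PENDING", "result": None}
        return job_id

    def run(self, job_id):
        job = self.jobs[job_id]
        job["phase"] = "EXECUTING"

        def target():
            try:
                job["result"] = self.work(job["params"])
                job["phase"] = "COMPLETED"
            except Exception as exc:
                job["result"] = exc
                job["phase"] = "ERROR"

        threading.Thread(target=target, daemon=True).start()

    def phase(self, job_id):
        return self.jobs[job_id]["phase"]

    def result(self, job_id):
        return self.jobs[job_id]["result"]


def sync_query(service, params, poll_interval=0.01, timeout=5.0):
    """Synchronous facade: create a job, run it, block until it finishes."""
    job_id = service.create(params)
    service.run(job_id)
    deadline = time.monotonic() + timeout
    while service.phase(job_id) not in ("COMPLETED", "ERROR"):
        if time.monotonic() > deadline:
            raise TimeoutError("job %s did not finish in time" % job_id)
        time.sleep(poll_interval)
    if service.phase(job_id) == "ERROR":
        raise RuntimeError("job %s failed" % job_id)
    return service.result(job_id)
```

In this arrangement a sync timeout costs nothing on the server side: the job is still in the job list and can be picked up again through the async interface.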
Paul.
On 2009-11-06, at 17:47, Guy Rixon wrote:
> Tom,
>
> your suggested implementation has massive, inherent problems: it
> loses most of the benefits of an asynchronous interface!
>
> By depending on a synchronous HTTP endpoint to run the query, your
> implementation breaks any time that synchronous thing times out, and
> it breaks if the network connection with the synchronous thing
> drops. In those cases, your UWS has no way to regain control of the
> query, even if the DB part is still running it. The UWS has to
> resubmit the query.
>
> The whole point of UWS is not to depend on a synchronous HTTP-
> connection for long-running jobs.
>
> You can build synchronous TAP as a wrapper around a UWS (Pat D. has
> done so) but it doesn't work the other way around.
>
> Guy
>
>
> On 6 Nov 2009, at 17:33, Tom McGlynn wrote:
>
>> Hi Guy,
>>
>> I'm not sure there is a big area of disagreement here. In terms of
>> the text that users write and get back in the TAP asynchronous
>> interface I'm not suggesting that a single byte needs to be
>> changed. It's all in what tasks are doing the processing. I've
>> put in some text that I hope clarifies what I was saying below in
>> context.
>>
>> Tom
>>
>> Guy Rixon wrote:
>>> Tom,
>>> whenever you use UWS in a service definition, you have to say
>>> what parameters it takes when setting up a job and the work done
>>> by that job. That's the "application of the UWS pattern" to use
>>> the terms from the UWS standard.
>>
>> I'm not sure I understand this. While there is talk of JDL and
>> such in the UWS standard, I don't see any requirements that show it
>> actually being used in any way. So while a given UWS
>> implementation might restrict parameters being used, I don't see
>> how that is done within the UWS protocol itself.
>>
>>> The UWS specification is supposed to be reusable between
>>> applications; hence the U in the title. Therefore, it can't
>>> specify the application-specific parameters.
>>>
>>
>> Right. I'm not suggesting that. What I'm saying is that it's easy
>> to write a UWS that can handle any parameters -- as indeed you
>> suggest you have done already below.
>>
>>> It's possible to specify a UWS-conforming service for more than
>>> one application. CEA does this. The modern interface of this kind
>>> is called UWS-PA ("UWS for parameterized applications") and its
>>> forerunner (which is SOAPy) is the Common Execution Connector.
>>> In these kinds of services, the applications are pluggable.
>> Sounds like the kind of thing I was looking at. I suggested in the
>> original message that this has likely come up in earlier discussions.
>>
>>> AstroGrid DSA/Catalogue has had a CEC interface for years. It uses
>>> a generic CEC implementation and passes the requests through to
>>> an ADQL-query application plugged inside it.
>>> The downside of generalizing a job-control service in this way is
>>> complication and divergence from the synchronous case. TAP/UWS is
>>> quite like synchronous TAP: you do an asynchronous query by
>>> POSTing the same parameters you could use for a synchronous
>>> query. If you try to use CEC or UWS-PA to start a TAP query then
>>> you have a different interface. Because that interface is more
>>> general, it's not as simple, either to implement in a service or
>>> to call from a client.
>> Here's where I think I'm getting a little lost. My suggestion is
>> that I have a UWS service running above TAP that is simply a
>> proxy for the TAP synchronous service. So by definition, they
>> could not get disassociated. I'm getting the sense that for you,
>> the UWS service needs to know about the parameters it's going to
>> pass along to whatever it calls when it does a run. However, as
>> far as I can see a UWS service can be entirely agnostic about
>> parameters. It can simply take whatever parameters the user
>> specifies and pass them along, leaving it to the underlying
>> synchronous call to handle validity. In fact, for TAP that's
>> pretty much the case since the names of the parameters used in TAP
>> are not bounded.
>>
>>
>>> I think that the current boundary between TAP and UWS is just
>>> where we need it for the simplest implementations.
>>
>> I'm not so much concerned with boundaries as with the sense in which
>> UWS is instantiated. Let me give a concrete example. I have a TAP
>> service with a base URL of http://tap/, so http://tap/sync is the
>> synchronous access point and http://tap/async is the async access
>> point.
>>
>> What happens when someone references the latter URL? In my current
>> implementation, a TAP servlet starts up, notes that I'm using an
>> asynchronous request and calls the appropriate methods and classes
>> that TAP has defined for this. If I had multiple asynchronous
>> services these would likely be in a nice little UWS library. All
>> is copacetic: UWS is a layer within TAP. It works fine but TAP and
>> the UWS layer are pretty tightly coupled.
>>
>> What I think I'm going to do when I get back from the IVOA is a bit
>> different. When I invoke http://tap/async I start a servlet whose
>> only knowledge of TAP is that there is a synchronous service at
>> http://tap/sync. It knows nothing of the internals of TAP and is
>> completely independent of it. At some point the user does a
>> http://tap/async/id/phase?phase=run
>> and this UWS service takes the parameters that the user has
>> specified for this job and invokes the http://tap/sync URL with
>> those parameters. The results get saved somehow and whenever the
>> user sends the appropriate URL the results are sent back. The
>> only thing the UWS service ever knows about TAP is the base URL.
>> Everything else is supplied by the user.
>>
>> Why do I like this better? Well it makes the TAP code simpler. It
>> makes it easy for me to provide UWS functionality to all of my web
>> services. E.g., I'd have a UWS interface to SkyView by simply
>> changing the synchronous URL. And if UWS changes so that, e.g.,
>> there's now a security resource, I can plug it in without any
>> change whatsoever to my TAP servlet. For me it will be a big win.
>>
>> I'm not suggesting that this implementation be required. It would
>> be fine to keep things coupled in one TAP implementation. However
>> if the paradigm (and here I mean it in its literal sense of
>> exemplar) is that a UWS service runs on top of a TAP service then the
>> way to describe the relationship between TAP and UWS changes. In
>> particular I think it then makes a lot more sense to simply say
>> that a UWS service can be used to provide asynchronous access to a
>> TAP service. The standard can require that if we decide async
>> access is mandatory (as I think we have). So the TAP document
>> becomes simpler -- and far less tightly coupled with the UWS
>> document.
>>
>>> Cheers,
>>> Guy
>>> On 6 Nov 2009, at 15:33, Tom McGlynn wrote:
>>>> I'm sure everyone will be happy to see the word 'Final' in the
>>>> title...
>>>>
>>>> In the past couple of days I've gotten the UWS asynchronous
>>>> implementation of TAP working (though doubtless still bug-ridden).
>>>>
>>>> When I read and implemented the TAP and UWS standards I had the
>>>> sense of UWS as being a layer within TAP. In retrospect I think
>>>> it would have been better (for my implementation at least), if I
>>>> had distinguished them more clearly.
>>>>
>>>> Suppose we think of UWS not as an interface layer but as the
>>>> definition of how to build asynchronous proxies. UWS becomes
>>>> a service definition, not an access protocol. The proxy accepts
>>>> and caches input parameters from the users, starts the
>>>> underlying request when told to, caches the response and sends
>>>> it back to the user when requested. [I haven't followed the
>>>> discussions of UWS earlier in the Grid list, so my apologies if
>>>> I am just discovering what everyone already knows....]
>>>>
>>>> If I think of things this way, then I can implement UWS
>>>> completely independently of the underlying application. Indeed
>>>> the binding to the underlying application could be dynamic: I
>>>> can provide a UWS layer over any number of distinct synchronous
>>>> applications. I don't need to know anything about what
>>>> parameters they use, just some root URL. The one piece of the
>>>> specification that might cause problems is the desire to support
>>>> multiple outputs as well as a single result. That's not at
>>>> issue in TAP, but even this could easily be handled by returning
>>>> a list of the outputs -- which is what UWS does now anyway.
>>>>
>>>> UWS is not described this way in its standards document: it is
>>>> shown as a layer within some bigger application, not as a
>>>> separable entity. Similarly TAP shows the asynchronous interface
>>>> tightly coupled within the rest of the TAP.
>>>>
>>>> In this new view, the TAP document would say very little about
>>>> the asynchronous interface. TAP itself would be synchronous,
>>>> but if we want asynchronous access to be mandatory then the
>>>> requirement is that a TAP implementation must specify a
>>>> corresponding UWS service through which the TAP implementation
>>>> can be invoked. We could still have a TAP service that is only
>>>> available asynchronously: we allow that this TAP service is not
>>>> directly callable: Only the associated UWS service can access
>>>> it. I'm not trying to take sides here in the sync/async wars.
>>>>
>>>> Changes to the UWS document would be rather more subtle, noting
>>>> that the interface can be implemented without reference to the
>>>> underlying implementation, and perhaps explicitly supporting the
>>>> kind of dynamic association with the underlying synchronous
>>>> service mentioned above. Maybe provide a convenience resource to
>>>> get the output in the single output case (rather than having to
>>>> parse the output list).
>>>>
>>>> The advantage, had we taken this approach before, is that it
>>>> largely decouples TAP and UWS. The TAP standard is shorter and
>>>> simpler. The UWS standard is largely unchanged. We can change
>>>> UWS in the future without worrying about any impact on TAP.
>>>>
>>>> This is probably a bridge too far in terms of the TAP standard.
>>>> For UWS it's really a change in tone more than content -- hints
>>>> to the user -- so perhaps it is doable were it to be thought a
>>>> good idea. Regardless, I do anticipate revising my own
>>>> implementation to use this approach after the Interop.
>>>>
>>>> Tom McGlynn
>
Dr. Paul Harrison
JBCA, Manchester University
http://www.manchester.ac.uk/jodrellbank