TAP RFC [VOSI]
Patrick Dowler
patrick.dowler at nrc-cnrc.gc.ca
Tue Sep 29 10:19:19 PDT 2009
On Tuesday 29 September 2009 08:00:49 Alberto Micol wrote:
> My point is that a client (TAP, SIA, SSA, etc) cannot know in advance
> if its request
> is too heavy for a given server. Even more so, if the same query is to
> be sent to many different servers.
You are forgetting that the service also cannot in general know that the query
is a heavy, time-consuming request that will exceed the HTTP timeouts of
the sync endpoint.
select * from someTable
where INTERSECTS(spatial_bounds, CIRCLE('ICRS', 10, 10, 0.1)) = 1
This is a typical spatial query (cone search) in ADQL. If the table is small,
it will probably be fast. If the table has a spatial indexing scheme on the
spatial_bounds column, it will probably be faster than if it does not. If the
content is spread out and the actual condition is very selective, it will be
faster than if all the content is inside the circle. Can anyone really
plausibly determine ahead of time that this will be fast? Probably fast?
Probably slow? Slow? Not plausibly, in my opinion. I can look at a query and
make a good guess about whether it will be heavy or not, but I cannot write
software to make that guess for me :-)
> To me, a query is always a SYNC query. If the service cannot answer
> right away, the
> service will politely inform the client that the request will take a
> bit longer,
> and will turn to ASYNC.
In reality, users will try to do a query using sync and if it fails they can
either change the query or use async instead. If the user thought the query
was simple and fast they will likely examine it more closely for bugs. If they
know it is complex, they may assume it is correct and try async, or
they may set MAXREC to something small and try sync again to test it. I don't
think the service can really make these decisions.
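
A rough sketch of that client-side workflow, for illustration only. The
names query_with_fallback, SyncTimeout, run_sync and run_async are
hypothetical and not part of TAP or any client library; the caller supplies
the functions that actually talk to the service, and the expected_fast flag
stands in for the user's own judgement about the query.

```python
class SyncTimeout(Exception):
    """Hypothetical error: the sync request exceeded an HTTP timeout."""


def query_with_fallback(adql, run_sync, run_async, expected_fast, test_maxrec=10):
    """Try sync first; on timeout, follow the user's judgement about the query.

    run_sync(adql, maxrec) and run_async(adql) are caller-supplied callables
    (assumptions for this sketch); maxrec=None means no MAXREC limit.
    Returns a (step, result) tuple describing what was done.
    """
    try:
        return ("sync", run_sync(adql, None))
    except SyncTimeout:
        pass

    if expected_fast:
        # A query believed to be simple failed: re-examine it for bugs first.
        return ("review", None)

    try:
        # Cheap sanity check: same query, small MAXREC, still via sync.
        run_sync(adql, test_maxrec)
    except SyncTimeout:
        # Even the limited query is slow; async is still the next step.
        pass
    return ("async", run_async(adql))
```

The decision stays on the client side because only the user knows whether
the query was expected to be cheap.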
More immediately, that is not the UWS model and in TAP we are meshing UWS and
DAL sync access in a single service. In future we could think about how a sync
request could decide to redirect the caller to an equivalent async job and
what the impact on clients would be... how about in the next version? :-)
> have to stage the result to disk
Yes, it is inherent in async that one has to provide server-side resources
(not necessarily files on disk) that are temporary but last longer than
a single HTTP request. The upside of this is that when the result is
transferred over the network, one should know the content-length and thus be
able to support resumable downloads in case of small network issues. With
streaming output directly from the db, an interrupted transfer means one has
to run the query again and start the transfer from scratch. With a poor
network connection, the user will never be able to succeed.
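
To illustrate why a known content-length helps, here is a minimal sketch of
how a client might resume a partial download of a staged result. It assumes
the server honours standard HTTP Range requests for that result, which is an
assumption of this example and not something TAP requires; resume_headers is
a hypothetical helper name.

```python
def resume_headers(bytes_received, content_length):
    """Build the request headers needed to fetch the rest of a staged result.

    Returns None when the download is already complete. Byte positions in an
    HTTP Range header are zero-based and inclusive.
    """
    if not 0 <= bytes_received <= content_length:
        raise ValueError("bytes_received must be between 0 and content_length")
    if bytes_received == content_length:
        return None
    return {"Range": f"bytes={bytes_received}-{content_length - 1}"}
```

With a streamed result there is no stable content-length to compute such a
range against, so the only option is to rerun the query.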
--
Patrick Dowler
Tel/Tél: (250) 363-0044
Canadian Astronomy Data Centre
National Research Council Canada
5071 West Saanich Road
Victoria, BC V9E 2M7
Centre canadien de donnees astronomiques
Conseil national de recherches Canada
5071, chemin West Saanich
Victoria (C.-B.) V9E 2M7