TAP RFC [VOSI]

Doug Tody dtody at nrao.edu
Tue Sep 29 12:57:23 PDT 2009


On Tue, 29 Sep 2009, Alberto Micol wrote:
> 
> My point is that a client (TAP, SIA, SSA, etc) cannot know in advance
> if its request is too heavy for a given server. Even more so, if the
> same query is to be sent to many different servers.
> 
> To me, a query is always a SYNC query. If the service cannot answer
> right away, the service will politely inform the client that the
> request will take a bit longer, and will turn to ASYNC.

In the case of TAP ADQL queries it can be especially hard to estimate
the runtime, as Pat illustrates in his posting.  For most other DAL
services there are many cases where the computation is well bounded
and sync can be expected to be ok.  Of course there are harder cases
as well, and for these we need async, but for many DAL use cases sync
is adequate, as we see in many current SCS, SSA, and SIA services.

When you speak of "asking a question", Alberto, I think that except
in the case of TAP this refers to the queryData, for which sync can
be expected to work fine in virtually all cases (assuming we have
MAXREC or whatever to deal with overflows).  There is no reason the
queryData response could not tell the client that a given data access
will require async execution.  So we take the car in to get it fixed
and ask "how long will it take?"  We get a response right away but
may be told to come back later to pick it up.

In most cases for things like image generation (unlike general ADQL)
it should be possible to estimate the time or cost to compute a
given dataset.  Ideally we want to estimate both the time to compute,
and the size of the dataset.
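
As a rough illustration of how such estimates might drive the choice
between sync and async, here is a minimal sketch.  The field names
(est_seconds, est_bytes) and the threshold values are hypothetical and
are not taken from any DAL specification; a real service would choose
its own limits.

```python
from dataclasses import dataclass


@dataclass
class AccessEstimate:
    est_seconds: float  # hypothetical: estimated time to compute the dataset
    est_bytes: int      # hypothetical: estimated size of the dataset


def access_mode(est: AccessEstimate,
                sync_limit_seconds: float = 30.0,
                sync_limit_bytes: int = 100_000_000) -> str:
    """Return "sync" when both estimates fit the limits, else "async"."""
    fits_time = est.est_seconds <= sync_limit_seconds
    fits_size = est.est_bytes <= sync_limit_bytes
    return "sync" if fits_time and fits_size else "async"
```

A queryData response could then carry the resulting mode for each
record, so the client knows before issuing the access request whether
to expect an immediate answer.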

In the case I mentioned where SIAV1 could respond to an acref-based
getData with a "come back later" (ideally including a specified
interval after which the response becomes invalid), the getData
actually serves as a simple async mechanism.  It would trigger
computation and staging of the specific dataset requested and the
client would poll after the specified "refresh" interval until
the getData URL returned the actual data.  Even once we have UWS
integration it might be useful to retain such a feature.
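
The polling behaviour described above might look roughly like the
following on the client side.  This is only a sketch under stated
assumptions: GetDataReply, its fields, and the fake service are
invented for illustration, and no network access is performed.  A real
client would wrap an HTTP request to the getData URL in fetch().

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class GetDataReply:
    """Hypothetical reply from an acref-based getData request."""
    data: Optional[bytes] = None             # the dataset, once staged
    refresh_seconds: Optional[float] = None  # "come back later" interval


def poll_get_data(fetch: Callable[[], GetDataReply],
                  max_attempts: int = 10,
                  sleep: Callable[[float], None] = time.sleep) -> bytes:
    """Call fetch() until it returns data, waiting the advertised interval."""
    for _ in range(max_attempts):
        reply = fetch()
        if reply.data is not None:
            return reply.data
        # The service asked us to come back later; assume 1 s if it
        # gave no interval.
        wait = reply.refresh_seconds if reply.refresh_seconds is not None else 1.0
        sleep(wait)
    raise TimeoutError("dataset was not staged within max_attempts polls")


def make_fake_service(polls_until_ready: int, payload: bytes):
    """Stand-in for a real service: data is ready after a fixed number of polls."""
    state = {"calls": 0}

    def fetch() -> GetDataReply:
        state["calls"] += 1
        if state["calls"] > polls_until_ready:
            return GetDataReply(data=payload)
        return GetDataReply(refresh_seconds=5.0)

    return fetch
```

The sleep function is passed in so that the loop can be exercised
without actually waiting, e.g. poll_get_data(make_fake_service(2,
b"FITS"), sleep=lambda s: None).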

 	- Doug


