Thoughts on standardizing TAP timeout/error handling
Theresa Dower
dower at stsci.edu
Thu Jul 24 21:00:31 CEST 2025
We at MAST have *just* run into the question of how to report timeout errors caused by various server- and client-side limits; I wouldn’t mind trying this out as we adjust the relevant behavior in our TAP service.
Our context is similar to Rubin’s: we’re just now having to set a longer default timeout for async queries than sync ones, and support user negotiation of async timeouts up to a higher limit using executionDuration. We’re still actively deciding on a good upper limit based on experience with TAP and CASJOBS, but given most queries we get work with shorter limits in sync mode, the preference is for folks to try the simpler sync mode workflow exercising fewer pieces of our infrastructure first. Being able to provide more information when that doesn’t work would be nice.
Cheers,
--Tessa
From: dal <dal-bounces at ivoa.net> on behalf of Tamara Civera via dal <dal at ivoa.net>
Reply-To: Tamara Civera <tcivera at cefca.es>
Date: Thursday, July 24, 2025 at 7:24 AM
To: "dal at ivoa.net" <dal at ivoa.net>
Subject: Re: Thoughts on standardizing TAP timeout/error handling
Hi everyone,
I find this topic very interesting and relevant, especially from an operational perspective. I also think the idea of using a controlled hierarchical vocabulary for "ERROR_TYPE" is a great approach. From my experience, I believe it is relatively simple to implement in TAP services, at least in those we manage at CEFCA/OAJ.
Furthermore, if you need any collaboration from the Operations Group to help define it or for any other matter, we would be happy to collaborate in any way we can.
Best regards,
Tamara
El 17/7/25 a las 11:29, Mark Taylor via dal escribió:
Stelios,
as I said elsewhere I think this looks reasonable, and I agree with
the idea of trying it out in some services and clients prior to
actual standardisation. I'd be happy to prototype some client code
in topcat if some service implementation out there starts emitting
error details like this.
Mark
On Wed, 16 Jul 2025, Stelios Voutsinas via dal wrote:
Hi everyone,
I wanted to bring up an issue we've been discussing over in the pyvo repo (astropy/pyvo#686) about TAP clients not being able to reliably distinguish query timeouts from other failures.
The basic problem is that different services handle timeouts in different ways: some return specific HTTP codes, others just drop connections, and it does not currently seem possible for clients to reliably recognise a timeout from the error message.
This makes it really hard for clients to know what actually went wrong and whether they should suggest something useful to the user (like "hey try async mode instead").
An initial thought was to investigate if there are any appropriate HTTP codes like 408, but as Mark Taylor pointed out, these aren't appropriate for processing timeouts, and other than that there doesn't seem to be a good standard HTTP code for "query took too long to process." So we've been thinking about other approaches.
An idea that was brought up is extending how we handle errors in DALI, but in a way that doesn't break existing clients. The idea is to keep QUERY_STATUS simple (still just ERROR, OK, OVERFLOW) but add some structured info alongside it.
Something like this:
<VOTABLE>
  <RESOURCE type="results">
    <INFO name="QUERY_STATUS" value="ERROR">Query exceeded processing time limit</INFO>
    <INFO name="ERROR_TYPE" value="resource-limit"/>
    <INFO name="ERROR_SUBTYPE" value="time-limit-exceeded"/>
  </RESOURCE>
</VOTABLE>
The nice thing about this is that old clients just see ERROR and handle it however they already do, but newer clients can check the ERROR_TYPE and do something smarter like suggesting async for timeouts, or asking the user to reduce their result size for storage limits.
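As a rough client-side sketch of that behaviour (not what pyvo or TOPCAT actually do), the following reads the INFO elements with Python's standard library and attaches a hint when it recognises the proposed values. The HINTS table and the function names are hypothetical, and the ERROR_TYPE/ERROR_SUBTYPE names are only the ones proposed in this message, not part of any standard.

```python
import xml.etree.ElementTree as ET

# Hypothetical hint table keyed on the proposed (not yet standard) values.
HINTS = {
    ("resource-limit", "time-limit-exceeded"): "Query timed out; try async mode.",
    ("resource-limit", "storage-limit-exceeded"): "Result too large; reduce the result size.",
}

def read_infos(votable_text):
    """Return {name: (value, text)} for every INFO element, ignoring XML namespaces."""
    infos = {}
    for elem in ET.fromstring(votable_text).iter():
        if elem.tag.rsplit("}", 1)[-1] == "INFO":
            infos[elem.get("name")] = (elem.get("value"), (elem.text or "").strip())
    return infos

def describe_error(votable_text):
    """Return the error message, plus a hint if the error type is recognised."""
    infos = read_infos(votable_text)
    status, message = infos.get("QUERY_STATUS", (None, ""))
    if status != "ERROR":
        return None  # not an error document
    etype = infos.get("ERROR_TYPE", (None, ""))[0]
    subtype = infos.get("ERROR_SUBTYPE", (None, ""))[0]
    hint = HINTS.get((etype, subtype))
    # Services that do not emit ERROR_TYPE simply yield the plain message.
    return f"{message} ({hint})" if hint else message

EXAMPLE = """<VOTABLE>
  <RESOURCE type="results">
    <INFO name="QUERY_STATUS" value="ERROR">Query exceeded processing time limit</INFO>
    <INFO name="ERROR_TYPE" value="resource-limit"/>
    <INFO name="ERROR_SUBTYPE" value="time-limit-exceeded"/>
  </RESOURCE>
</VOTABLE>"""

print(describe_error(EXAMPLE))
```

An error document that carries only QUERY_STATUS goes through the same code path and returns the bare message, which mirrors how an old client would treat it.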
I'm thinking we could have error types like:
- resource-limit (with subtypes like time-limit-exceeded, storage-limit-exceeded)
- authorization (authentication-expired, insufficient-permissions)
- query-syntax (invalid-adql, table-not-found, etc.)
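To make the two-level structure concrete, here is a minimal sketch of how a client might hold these types and subtypes and degrade gracefully when it meets a term it does not know. The dictionary only repeats the examples listed above; the `classify` helper and the "unknown" fallback are assumptions for illustration, not part of the proposal.

```python
# The example terms from the list above; a real vocabulary would be
# published and maintained separately.
ERROR_VOCABULARY = {
    "resource-limit": {"time-limit-exceeded", "storage-limit-exceeded"},
    "authorization": {"authentication-expired", "insufficient-permissions"},
    "query-syntax": {"invalid-adql", "table-not-found"},
}

def classify(error_type, error_subtype=None):
    """Return (type, subtype), keeping only the parts this client knows.

    An unknown subtype falls back to its parent type, and an unknown
    type falls back to ("unknown", None), so terms added later do not
    break this client.
    """
    if error_type not in ERROR_VOCABULARY:
        return ("unknown", None)
    if error_subtype in ERROR_VOCABULARY[error_type]:
        return (error_type, error_subtype)
    return (error_type, None)

print(classify("resource-limit", "time-limit-exceeded"))
print(classify("resource-limit", "some-future-subtype"))
print(classify("some-future-type"))
```

Falling back to the parent type is what would let new subtypes be added later without confusing clients that predate them.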
We could even include some additional context when it's helpful:
<INFO name="ERROR_TYPE" value="resource-limit"/>
<INFO name="ERROR_SUBTYPE" value="time-limit-exceeded"/>
<INFO name="SYNC_LIMIT_SECONDS" value="30"/>
<INFO name="ASYNC_LIMIT_SECONDS" value="3600"/>
Though I can see the argument that this descriptive metadata might be over-complicating things, and this depends on whether clients would actually use it.
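If clients did use it, one plausible use is turning the limits into a concrete suggestion. The sketch below assumes the INFO elements have already been read into a plain name-to-value dictionary; the INFO names are the ones from the fragment above, while `timeout_suggestion` is a hypothetical helper.

```python
def timeout_suggestion(infos):
    """Build a user-facing hint from a {INFO name: value} dictionary.

    Returns None when the error is not a time-limit error.
    """
    if infos.get("ERROR_SUBTYPE") != "time-limit-exceeded":
        return None
    sync_limit = infos.get("SYNC_LIMIT_SECONDS")
    async_limit = infos.get("ASYNC_LIMIT_SECONDS")
    # Only quote numbers when both limits are present and async is longer.
    if sync_limit and async_limit and int(async_limit) > int(sync_limit):
        return (f"The sync limit is {sync_limit} s; "
                f"async mode allows up to {async_limit} s.")
    return "The query hit a time limit; async mode may allow a longer run."

print(timeout_suggestion({
    "ERROR_TYPE": "resource-limit",
    "ERROR_SUBTYPE": "time-limit-exceeded",
    "SYNC_LIMIT_SECONDS": "30",
    "ASYNC_LIMIT_SECONDS": "3600",
}))
```

Without the two limit values the client can still give the generic advice, so the extra metadata stays optional.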
I think the main thing is agreeing on the error types and subtypes first.
This would give clients enough info to make better decisions about retrying or recommending a remedy to the user, without being too prescriptive about what they should do.
Markus suggested this might be a good candidate for a standard vocabulary, which makes sense to me. And the approach seems extensible enough that we could add new error types later without breaking anything.
I think this could be something we try out in a few implementations (DaCHS, Rubin, pyvo, maybe TOPCAT) and see how it works in practice before making it official in DALI.
What do you all think? Does this seem like a reasonable direction? Are there other error scenarios we should be thinking about? Any obvious problems with this approach?
The timeout issue is relevant for Rubin because we've decided that we have different timeouts for async vs sync queries and want to push users towards using async, so this would be one step in the direction of making this easier.
So it would be great to find a way forward on this if it seems reasonable. (The full discussion is in the GitHub issue, astropy/pyvo#686, if anyone wants more context.)
Thanks,
Stelios Voutsinas
--
Mark Taylor Astronomical Programmer Physics, Bristol University, UK
m.b.taylor at bristol.ac.uk https://www.star.bristol.ac.uk/mbt/
--
Tamara Civera Lorenzo
Scientific Database Engineer
Centro de Estudios de Física del Cosmos de Aragón
* Plaza San Juan nº 1, piso 2º
* 44001 Teruel
* Tfno: +34978221266 - Ext: 1116
* Fax: +34978602334
* Email: tcivera at cefca.es
* Url: http://www.cefca.es