TAP 1.0/1.1 inconsistency: error document format

Fri Oct 10 21:38:40 CEST 2025

I'm bringing a SPHEREx perspective today, and a perspective specific to building asynchronous UWS services that perform complex and long-running operations.  For SPHEREx, currently that means creating on-the-fly cutouts, and performing spectral extractions on the data via forced photometry.  The spectrophotometry service exists today (as a soft launch, currently only used internally by the IRSA web application for SPHEREx a/k/a Firefly, but intended for eventual public use), and we are in the middle of implementing the cutout/mosaic service.

I realize that the ostensible focus of this thread is TAP, but there are plenty of more generic mentions of DALI up-thread, and I do very much care about building DALI-flavored services wherever possible even when the actual service is mission/project-specific, with no available standard to cover its specific details.

In this context (async UWS) the means for reflecting errors back to the user are:
* The UWS job going to a PHASE = ERROR state
* Providing a <uws:errorSummary> element in the <job> object for the task, including:
  + an ErrorType ("transient" or "fatal")
  + a (presumably) short string description of the error via the <uws:message> element
  + a hasDetail indicator, with "true" indicating that there is additional information at the /error endpoint
* Providing details at the /error endpoint
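
As an illustration of the three mechanisms listed above, here is a minimal Python sketch of how a client might read them from a UWS job document. The sample job XML, the job ID, and the function name read_error_summary are made up for the example; only the element and attribute names come from UWS.

```python
import xml.etree.ElementTree as ET

UWS_NS = "http://www.ivoa.net/xml/UWS/v1.0"  # UWS 1.x XML namespace

# Hypothetical job document, invented for illustration only.
SAMPLE_JOB = """<uws:job xmlns:uws="http://www.ivoa.net/xml/UWS/v1.0">
  <uws:jobId>abc123</uws:jobId>
  <uws:phase>ERROR</uws:phase>
  <uws:errorSummary type="fatal" hasDetail="true">
    <uws:message>Cutout size exceeds service limit</uws:message>
  </uws:errorSummary>
</uws:job>"""


def read_error_summary(job_xml):
    """Return a dict describing the job error, or None if the phase is not ERROR."""
    ns = {"uws": UWS_NS}
    job = ET.fromstring(job_xml)
    phase = job.findtext("uws:phase", default="", namespaces=ns).strip()
    if phase != "ERROR":
        return None
    summary = job.find("uws:errorSummary", ns)
    if summary is None:
        # errorSummary is optional; only the phase is guaranteed.
        return {"type": None, "message": None, "has_detail": False}
    message = summary.findtext("uws:message", default="", namespaces=ns)
    return {
        "type": summary.get("type"),  # "transient" or "fatal"
        "message": message.strip(),
        # hasDetail="true" means more information is available at /error
        "has_detail": summary.get("hasDetail", "false").strip().lower() == "true",
    }


info = read_error_summary(SAMPLE_JOB)
```

A client would fetch the /error resource only when has_detail is true.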

Note that for one of our services the normal /result is a FITS file, and for the other the normal result is a VOTable (with VOParquet as a future option).

When there is an error in these services, it's typically either a parameter constraint violation, for which a short message is sufficient response, or something much more serious, and that often involves a stack trace or other verbose logging.

This is the sort of information we want to pass back as the /error object, then, so that the user has something that they can either interpret themselves, or go to a help desk with and ask for assistance.  Constructing a FITS or VOTable container for this information does not seem to make sense.
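
To make the distinction concrete, here is a minimal Python sketch of how a service might split a failure into a short summary (for <uws:message>) and a verbose plain-text detail (for /error). The names describe_failure and run_extraction are hypothetical, and this is not a description of how the SPHEREx services are actually implemented.

```python
import traceback


def describe_failure(exc, max_summary=200):
    """Split an exception into a short summary and a verbose plain-text detail."""
    # Short, single-line text suitable for the errorSummary message.
    summary = f"{type(exc).__name__}: {exc}"[:max_summary]
    # Full stack trace as plain text, suitable for the /error resource.
    detail = "".join(traceback.format_exception(type(exc), exc, exc.__traceback__))
    return summary, detail


def run_extraction():
    # Hypothetical stand-in for a long-running job step that fails.
    raise RuntimeError("photometry backend unavailable")


try:
    run_extraction()
except RuntimeError as exc:
    summary, detail = describe_failure(exc)
```

The detail string here is free-form text with no tabular structure, which is why wrapping it in a FITS or VOTable container adds little.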

UWS seems to overtly support this:

Quoting from UWS 2.1.7:
> The error object gives a human-readable error message for the underlying job. This object is intended to be a detailed error message, and consequently might be a large piece of text such as a stack trace. When there is an error running a job, a summary of the error should also be given using the optional errorSummary element of the Job element.

and from UWS 2.2.2.5:
> When an error occurs in a job the UWS must signal this at a minimum by setting the PHASE to error. In addition the <errorSummary> element, giving a brief summary of the error, should be included within the <job> element. The UWS may include a more detailed error message for example an execution log or stack trace by providing such a resource at the /{jobs}/{job-id}/error URI. It is the responsibility of the implementing service to specify the form that such an error message may take.

Does the IVOA want to define a standard for the inclusion of, for example, stack traces in VOTables?  Because without that, trying to comply with the proposed stringencies seems like it will do more harm than good.

(In the context of TAP I think we can live with what's been discussed, but I'm really concerned about pushing it upstream toward DALI.)

Gregory

________________________________________
From: dal <dal-bounces at ivoa.net> on behalf of Gregory MANTELET via dal <dal at ivoa.net>
Sent: Friday, October 10, 2025 06:11
To: Mark Taylor; Patrick Dowler
Cc: dal at ivoa.net
Subject: Re: TAP 1.0/1.1 inconsistency: error document format

Mark,

I prefer your version because of the change from "may" to "must".

However, I'd prefer something saying that VOTable is currently the only
standard format defined by the IVOA in order to convey errors. It must always
be supported. So, if no format is explicitly requested or if the asked one is not
supported, a VOTable format should be returned by default. On the contrary,
if the one asked by the user is supported by the service, the error may be
returned in this format. This way, a generic client knows that it can rely on
a unique format (whatever happens): VOTable, in this case.

That could maybe be written as such in DALI:

"An error document describing errors in use of the DAL
service protocol must be a VOTable document. If supported,
other formats may be used when explicitly asked by a client.
If a format is not supported, the service must fall back to
the VOTable format."
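
The rule proposed in the quoted text can be sketched in a few lines of Python. This is only an illustration of the proposal, not agreed DALI behaviour; the function name choose_error_format and the example set of supported formats are assumptions.

```python
VOTABLE = "application/x-votable+xml"  # standard VOTable media type


def choose_error_format(requested, supported):
    """Return the requested media type if the service supports it, else VOTable."""
    if requested is not None and requested in supported:
        return requested
    # No format requested, or the requested one is unsupported: fall back.
    return VOTABLE


# Hypothetical service that can write errors as VOTable or JSON.
supported_formats = {VOTABLE, "application/json"}
```

Under this rule a generic client that asks for nothing in particular always gets a VOTable error document.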

I do not really know how to write that in correct English, but I hope you
get my idea. Is it a good idea? Is it too strict? I don't know.

Anyway, what I do not like about saying "VOTable or plain text" is that it is too
restrictive. Besides, there is probably more than one way to write an error in
a plain text document (e.g. just the raw error message, a message + an error
code, a message + more details, ...). At least, if we mention the plain text
document, we should probably say that it is limited to just the raw error
message; no structure or other kind of information is expected.

Moreover, explicitly giving another error format will probably lead to
a future where one would ask to update this sentence in order to add
another format (e.g. JSON), and then another one. If the new formats are
standardized by the IVOA, it is fine, but otherwise, I think we should probably
avoid this.

Cheers,
Grégory

On 10/10/2025 09:56, Mark Taylor wrote:

Pat,

I generally agree with that.  So a fix in DALI could look like
changing text in the DALI 1.2 current WD section 5.2 from:

   "An error document describing errors in use of the DAL service
    protocol may be a VOTable document [ref] or a plain text document."

to

   "An error document describing errors in use of the DAL service
    protocol must be a VOTable document [ref] if the client is
    expecting a VOTable response; in other cases it should usually
    be a plain text document."

The phrase before the semicolon would supply what I've been
complaining about here.  The second part could be written more
prescriptively, but I've put "should usually" instead of "must"
to leave open the possibility of JSON or whatever that somebody
mentioned.  I don't know whether that's really a good idea or
whether it should be tightened up.

Although I agree that text/plain is generally easier for clients,
what's not easier is having to examine the content-type and adjust
the parsing on the basis of that - you want to know what to expect.
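
To illustrate the content-type dispatch described above, here is a minimal Python sketch of what a client would need. The VOTable branch relies on the DALI convention of an INFO element with name="QUERY_STATUS" and value="ERROR"; the function name, the sample document, and the set of accepted XML media types are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical VOTable error document, invented for illustration only.
SAMPLE_VOTABLE_ERROR = """<VOTABLE xmlns="http://www.ivoa.net/xml/VOTable/v1.3" version="1.3">
  <RESOURCE type="results">
    <INFO name="QUERY_STATUS" value="ERROR">Unknown column: foo</INFO>
  </RESOURCE>
</VOTABLE>"""


def local_name(tag):
    """Strip the '{namespace}' prefix so any VOTable version matches."""
    return tag.rsplit("}", 1)[-1]


def extract_error_message(content_type, body):
    """Return the error message from a VOTable or plain-text error document."""
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type in ("application/x-votable+xml", "text/xml", "application/xml"):
        root = ET.fromstring(body)
        for elem in root.iter():
            if (local_name(elem.tag) == "INFO"
                    and elem.get("name") == "QUERY_STATUS"
                    and elem.get("value") == "ERROR"):
                return (elem.text or "").strip()
        raise ValueError("VOTable has no QUERY_STATUS=ERROR INFO element")
    if media_type == "text/plain":
        return body.strip()
    raise ValueError(f"unexpected error document type: {media_type}")
```

Every additional permitted format adds another branch like these, which is the cost to clients noted above.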

Relevant: since I brought this up the Rubin service has now been
changed to work as expected (error document is VOTable not plain text),
which I believe is how all other services do it, so in practice this
isn't so much of a problem for me (I'm probably not going to make
code changes unless it comes up again).

Mark

On Thu, 9 Oct 2025, Patrick Dowler wrote:

In my view the TAP-1.1 text is about as good/correct as possible;
TAP-1.0 was too strict given that we allowed other output formats. The
main problem is that DALI-1.1 is too vague, but I'd also hate for this
to become too complicated.

essentially:
if request is VOTable: error response should be VOTable
if request is csv or tsv: error response in text/plain should be fine
if request is parquet (aka VOParquet):  error response is ???

We currently use text/plain in the last case (prototype) and I don't
see any other choice as viable... to be clear, I think even this is
too much to write in DALI because other services are working with say
xml and json serialisation formats.
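
The three cases listed above can be written as a small lookup table. This is a minimal Python sketch of that mapping only; the short format names used as keys and the function name error_media_type are assumptions, and the parquet entry simply mirrors what the prototype is described as doing.

```python
# Requested output format -> media type of the error document.
ERROR_FORMAT_BY_REQUEST = {
    "votable": "application/x-votable+xml",
    "csv": "text/plain",
    "tsv": "text/plain",
    "parquet": "text/plain",  # prototype behaviour; not settled
}


def error_media_type(requested_format):
    """Return the error media type for a requested format, or None if unlisted."""
    return ERROR_FORMAT_BY_REQUEST.get(requested_format.strip().lower())
```

Formats outside the table (for example XML or JSON serialisations used by other services) return None here, reflecting that the thread leaves them open.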

I don't want to go so far as to say it was a mistake to ever put error
messages into VOTable, but to be honest text/plain is almost always
easier and better for the client...

So the fix is nominally in DALI, TAP-1.2 could make a more careful
statement aligning with DALI... if we can agree on it we can get it
into 1.2

--
Patrick Dowler
Canadian Astronomy Data Centre
Victoria, BC, Canada

On Thu, 9 Oct 2025 at 07:47, Grégory Mantelet
<gregory.mantelet at astro.unistra.fr> wrote:

Hi Pat, Mark,

As a web client developer, I do appreciate having something other than an XML
document to parse when a query fails. I can already get some errors in
JSON, and apparently Rubin already accepts text errors.

On the other hand, I'd also understand if a service cannot supply the error in multiple formats.
It is not always easy or possible to do. Because of this, I agree with the approach stating
that the error documents should be provided, as much as possible, in the requested format.
Then, we should probably say that the fallback standard format must be VOTable. This way,
it is a safe format that a generic client can rely on. So, in a word, I agree.

Anyway, I am a bit lost. Where do you plan to apply a fix?
DALI, TAP, UWS?
If DALI, is it expected in the upcoming DALI-1.2?

Cheers,
Grégory

On 17/09/2025 18:12, Mark Taylor via dal wrote:

The impossible-to-provide-VOTable point doesn't apply in this context.
TAP 1.1 sec 3.3 does talk about errors in the use of the HTTP protocol,
which would lead to non-VOTable error responses, but the case I'm
talking about here is an error document retrieved from the UWS
error document endpoint /async/<jobid>/error:

  "Error documents should be in a format that matches the requested format
   where possible; see DALI for details. If the error document is being
   retrieved from the /async/<jobid>/error resource (specified by UWS)
   after an asynchronous query, the HTTP status code should be 200."

So where a VOTable result is implicitly or explicitly requested
it is possible to supply a VOTable response; the TAP 1.0->1.1
inconsistency is the "should" which used to be a "must".

On Wed, 17 Sep 2025, Patrick Dowler wrote:

At this point it has been written/advertised that way for years, so we
should consider whether it is something to fix or just something to
accept and adapt to... I know it is not always possible to adhere to
the strict "error must be VOTable" because some errors happen before
handling the request gets anywhere near the code that can feasibly
write a VOTable.
--
Patrick Dowler
Canadian Astronomy Data Centre
Victoria, BC, Canada

On Wed, 17 Sept 2025 at 02:51, Mark Taylor <m.b.taylor at bristol.ac.uk> wrote:

It might be the spirit, but between TAP 1.1 and DALI 1.1 the text does
seem to allow a plain text error document, which is what Rubin are
currently doing, though I haven't encountered it in other services.

So I think that means that clients and validators have to be prepared
for that (topcat and taplint currently are not; I don't know about
other TAP clients).

Unless we consider this Erratum fodder?

On Mon, 15 Sep 2025, Patrick Dowler wrote:

iirc, it was deliberate to change from "must be VOTable" to be able to
play nice with other formats, so it probably should have been as close
to "must be in the requested format" as possible. But of course, not
all tabular output formats can carry error messages, so it gets kind
of tricky in the details.

The spirit is still definitely that if the client expects a VOTable
response they get that for both success and error, but the DALI quote
shown is much too vague... I think that is where the
clarification/detail is needed.

--
Patrick Dowler
Canadian Astronomy Data Centre
Victoria, BC, Canada

On Mon, 15 Sept 2025 at 06:41, Mark Taylor via dal <dal at ivoa.net> wrote:

Hi DAL.

Following a taplint query by Stelios, I've just noticed the following
inconsistency between TAP 1.0 and TAP 1.1 as regards the format of
UWS error documents resulting from failed async TAP queries:

   TAP 1.0 Sec 2.7.3:
      "Error documents for TAP errors must be VOTable documents;
       any result-format specified in the request is ignored."

   TAP 1.1 Sec 3.3:
      "Error documents should be in a format that matches the requested
      format where possible; see DALI for details."

   DALI 1.1 Sec 4.2:
      "An error document describing errors in use of the DAL service
       protocol may be a VOTable document (...) or a plain text document."

For an expected VOTable result, this is effectively a downgrade of a
MUST to a SHOULD for VOTable as the format of the error document.

So a TAP 1.0 client will expect a VOTable error document from a failed
async TAP query, but may get tripped up by receiving instead a plain
text error document.  This change is not explicitly noted in
Appendix A of TAP 1.1 "Changes from TAP-1.0 to TAP-1.1",
though A.1 says, somewhat misleadingly, "Removed text that duplicates
material from DALI.".

This looks to me like an oversight since it is (from the client's
point of view) a backwardly incompatible behaviour change from TAP 1.0
to TAP 1.1.

Has anybody come across this before, or remember whether it was
deliberate?

Mark

--
Mark Taylor  Astronomical Programmer  Physics, Bristol University, UK
m.b.taylor at bristol.ac.uk          https://www.star.bristol.ac.uk/mbt/
