[TAP] draft 0.42

Douglas Tody dtody at nrao.edu
Wed Apr 29 16:44:53 PDT 2009


On Mon, 27 Apr 2009, Francois Ochsenbein wrote:

> ==> the usage of the <INFO> tag (sec. 2.9): should we (I mean the
>      VOTable group) give up with the addition of sub-elements
>      like <DESCRIPTION> in <INFO>, which are included in VOTable1.2 ?

Objections to this change have already been stated so I won't repeat
them here.  As I noted earlier, in principle it does not matter as
VOTable is a versioned serialization and the logical model is not
affected (in practice implementations are likely to be impacted, at
least over the next year or so).

As Pat notes the examples in the document can be changed to reflect
VOTable 1.2.  However, we should add a mention that the syntax shown
will differ for earlier versions of VOTable.  I do not think that TAP
should require VOTable 1.2.

> ==> the 'format' as a new attribute of <FIELD> (and therefore
>      of <PARAMETER>): I assume this word is replaced by 'encoding',
>      with a similar meaning of what is proposed in Apx A.5 of
>      VOTable1.2. Is that agreeable ?

Ok with me if we add this new feature.  As others have noted "encoding"
suggests something else.

> 2.3 2nd (last) paragraph:
>    it may be difficult to decide which is a spurious parameter -- in a
>    set of n parameters, it is likely possible to find several subsets
>    which would represent a valid tap request. Or are you talking about
>    empty parameters only (cf 2.3.11)? The exact situation should be
>    clarified, in my opinion.

This is operation specific.  It is one of the reasons that we need
to specify each service operation separately - what parameters does
it use/permit, and what special semantics does it define.

> 2.3.4 + 2.3.5 :
>    as you suggest, most of the section 2.3.5 would better be
>    in a PQL document.
>
>    Or alternatively the TAP document could have a section dedicated to
>    the queries by (celestial) position and maybe another section to
>    queries by time. Queries based on position (and time) are such an
>    important part of TAP that a few typical examples, in both ADQL and
>    PQL, would be helpful for both TAP servers and clients (consumers).

I think this level of detail probably needs to be defined separately
for ADQL and param queries.  As you say the details should be addressed
in the respective documents.

The details of param queries have already been moved to a separate
document.  What is missing is an equivalent, more detailed description
of how to do ADQL queries with TAP.  Not the ADQL syntax, which is
already documented in the ADQL standard, but ADQL usage with TAP.

> 2.3.6 : why is tab-separated-values not considered ?
>    tab-separated-values is a mime type which exists since the
>    beginning of the web, and is much more simple than CSV
>    (does not require optional escaping, but forbids documents
>    having tabs in their columns). See also 2.7.1

I tend to agree that TSV is probably better than CSV, but CSV is
the more widely defined external standard.  I was the one who originally
suggested CSV, but I admit I am conflicted.  Since CSV is optional we
could consider adding TSV as an optional format.  Both are pretty 
trivial (but very useful) in any case.

> 2.3.9 :
>    I really do not understand why MTIME is introduced in TAP, it
>    introduces a lot of complications and exceptions in the definitions
>    of the TAP service; I believe the TAP protocol should stay as generic
>    as possible. Isn't it possible to define alternatively a specific
>    schema (or data model?) for logs or 'mirrorable' tables ?

MTIME is an optional advanced capability in all the DAL2 services (it
is already in SSA), intended to be used to maintain replicas of large
data holdings or catalogs.  We have advanced use-cases which require
this and there is no simple alternative.  Since it is needed, and is
optional, and already a standard, I see no reason to not have this in
TAP.  It would be reasonable to restrict it to the param query however
as that is sufficient to support the foreseen use cases, and ADQL really
does not need this.

> 2.3.11 :
>    my feeling is that MAXREC= (parameter without value) differs
>    fundamentally from asking for a default or null value, and is
>    related to 2.3 (2nd paragraph): is it a spurious parameter or
>    a request for a default value ? My own opinion is that parameters
>    without contents would better be systematically ignored
>    (i.e. http://my.tap-server/tap/sync?REQUEST=doQuery&MAXREC=
>     and  http://my.tap-server/tap/sync?REQUEST=doQuery
>     represent the same TAP query -- that would be the logical
>     interpretation of what comes out from a HTTP form)

I agree.  This should also be addressed in the DAL2 standard service
profile, and implemented consistently in all the DAL2 services.
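
To make the rule concrete, here is a rough sketch in Python of how a
service might drop valueless parameters before interpreting a request.
The function name effective_params is made up for illustration and is
not taken from any DAL2 document; the two URLs are Francois's examples.

```python
from urllib.parse import urlsplit, parse_qsl

def effective_params(url):
    """Return the query parameters of url, ignoring any with an empty value."""
    query = urlsplit(url).query
    # keep_blank_values=True makes "MAXREC=" visible so it can be dropped explicitly
    pairs = parse_qsl(query, keep_blank_values=True)
    return {name: value for name, value in pairs if value != ""}

with_empty = effective_params("http://my.tap-server/tap/sync?REQUEST=doQuery&MAXREC=")
without = effective_params("http://my.tap-server/tap/sync?REQUEST=doQuery")
```

Under this reading both requests reduce to the same parameter set, so
the service would treat them as the same query.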

> 2.3.13 :
>    is there a well-defined way of dealing with multi-valued
>    parameters (your TBD) ? What would happen if you specify
>    first MAXREC=100, and somewhere else in the same query MAXREC=0 ?
>    A duplication of parameters like MAXREC or RUNID should,
>    in my opinion, generate an error.

SSA and DAL2 forbid having the same parameter appear multiple times
in a query.  This caused problems with the legacy services and the
feature is not needed so it is probably best to just forbid it.
Some would argue that we follow HTTP usage and have this specify
multiple values for a parameter.  However as has been noted elsewhere
the logical service protocol is not HTTP-specific.  If we permitted
this it would be part of the logical service protocol and every
level of software, all the way up to an application, would have to
support it - there is not sufficient justification for the feature.
In any case, list-structured parameters (comma-delimited list values)
already provide a more powerful mechanism for parameters which need
to take multiple values.  Since this is done as the string
value of the parameter, it is transparent at most levels of software
which manipulate parameter values.
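
A minimal sketch of both points in Python, assuming the request arrives
as an HTTP query string.  The names parse_params and list_value, and
the choice of ValueError as the error, are my own for illustration and
not part of SSA or DAL2.

```python
from urllib.parse import parse_qsl

def parse_params(query):
    """Parse a query string, rejecting any parameter that appears twice."""
    params = {}
    for name, value in parse_qsl(query, keep_blank_values=True):
        if name in params:
            raise ValueError(f"parameter {name!r} appears more than once")
        params[name] = value
    return params

def list_value(value):
    """Split a comma-delimited list value into its elements."""
    return value.split(",")

single = parse_params("REQUEST=doQuery&MAXREC=100")
```

With this, MAXREC=100&MAXREC=0 is rejected outright, while a
list-valued parameter travels as one ordinary string and is only split
by the code that actually needs the individual values.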

> 2.5.2 : table upload:
>    Is it restricted to VOTable ? And if yes, only to <TABLEDATA>
>   version of VOTable ?  Or shouldn't any of the output types
>   defined in 2.3.6 be acceptable ?

I think we should only define VOTable for uploads.  The other formats
do not provide sufficient information to fully define a DBMS table
from the upload table, plus we are still trying (hopefully not futilely)
to avoid over-complicating the service implementations.  It is not
hard to convert text or whatever to a VOTable on the client side
before uploading.
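
As an illustration of how little is involved, here is a rough Python
sketch that wraps CSV text in a TABLEDATA VOTable.  It is deliberately
simplified: the function name csv_to_votable is made up, the XML
namespace declaration is omitted, and every column is declared as a
variable-length string because the CSV carries no type information
(which is exactly the metadata gap mentioned above).  A real client
would supply proper datatypes, units and UCDs.

```python
import csv
import io
import xml.etree.ElementTree as ET

def csv_to_votable(csv_text):
    """Wrap CSV text (first row = column names) in a minimal VOTable document."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, body = rows[0], rows[1:]
    votable = ET.Element("VOTABLE", version="1.2")
    table = ET.SubElement(ET.SubElement(votable, "RESOURCE"), "TABLE")
    for name in header:
        # Assumption: all columns are strings, since CSV has no datatypes
        ET.SubElement(table, "FIELD", name=name, datatype="char", arraysize="*")
    tabledata = ET.SubElement(ET.SubElement(table, "DATA"), "TABLEDATA")
    for row in body:
        tr = ET.SubElement(tabledata, "TR")
        for cell in row:
            ET.SubElement(tr, "TD").text = cell
    return ET.tostring(votable, encoding="unicode")

doc = csv_to_votable("ra,dec\n10.5,-3.2\n")
```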

> 2.6 : TAP_SCHEMA
>    as already repeated (not only by me!) the current definition is
>    incomplete for what concerns the actual table implementation
>    (keys and indexes). If the TAP_SCHEMA is limited to semantics
>    (i.e. remove the 'indexed' in the columns schema), it's OK.

I think there is agreement on modest enhancements to the TAP_SCHEMA
(details TBD), however we need to finalize the first version of this
and should do so after one more iteration.  This is one of the remaining
key issues still open.

 	- Doug


