TAP1.0 Comments

Wed Jul 15 04:38:32 PDT 2009

Hi Francois -

I am on travel this week so may be slow responding to email.

On Tue, 14 Jul 2009, Francois Ochsenbein wrote:
> What is the Generic Dataset (GDS) query ? Where is it described ?
> I couldn't find any note or document describing this...

The GDS query is a way to query an index table describing "generic
datasets".  This is essentially the same thing as the Observation
data model, which is known as the generic dataset (GDS) in the DAL
interfaces (since it can also describe theory data, we did not want
to call it an "observation").  Each record describes a single primary
dataset (image, spectrum, table, whatever) using standard metadata.
Data linking can be used to get the dataset or to find a service to
use to access it.

This was discussed at the interop; see the last few slides of the
following talk I gave there (the talk itself has little to do with
SIAv2, I just put the few GDS slides at the end):

     http://www.ivoa.net/internal/IVOA/200905DALSessions/siapv2-may09.pdf

Francois Bonnarel is also involved in this and can fill you in locally.
The generic dataset stuff is described more generally in the DAL2
architecture document since the generic dataset is the basis for all
the DAL services.

Basically what this would allow you to do is build an index table for
all the 8000 or so Vizier tables.  It could be queried with a GDS query
(in TAP probably) to find the tables of interest.  These would then
be queried individually with TAP.  Aside from being simpler, I think
this is more convenient and flexible for the client than getting all
the query responses in one big file from which they would then have
to be extracted.  If we really wanted that it could be done with an
integrator service of some sort.
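
To make the two-step pattern concrete, here is a minimal Python sketch.
It is an illustration under stated assumptions, not anything defined by
the GDS or TAP drafts: the index fields (table_name, dataset_type,
has_position, bandpass), the table names, and the "ra"/"dec" column
names are all hypothetical.  Step 1 filters an in-memory stand-in for
the index table; step 2 builds one ADQL query per selected table.

```python
from dataclasses import dataclass


@dataclass
class IndexRecord:
    """One row of a hypothetical GDS-style index: one record per table."""
    table_name: str      # name used to query the table through TAP
    dataset_type: str    # e.g. "table", "image", "spectrum"
    has_position: bool   # True if the table carries sky coordinates
    bandpass: str        # coarse spectral coverage, e.g. "optical"


def find_tables(index, bandpass):
    """Step 1: query the index to find the tables of interest."""
    return [rec.table_name for rec in index
            if rec.has_position and rec.bandpass == bandpass]


def cone_queries(table_names, ra, dec, radius_deg):
    """Step 2: build one ADQL cone-search query per selected table.

    Assumes (hypothetically) that each table has "ra" and "dec" columns.
    """
    template = ("SELECT * FROM {table} WHERE 1=CONTAINS("
                "POINT('ICRS', ra, dec), "
                "CIRCLE('ICRS', {ra}, {dec}, {radius}))")
    return {name: template.format(table=name, ra=ra, dec=dec,
                                  radius=radius_deg)
            for name in table_names}


# Made-up index content, for illustration only.
index = [
    IndexRecord("cat_a", "table", True, "optical"),
    IndexRecord("cat_b", "table", False, "optical"),
    IndexRecord("cat_c", "table", True, "radio"),
]
selected = find_tables(index, "optical")
queries = cone_queries(selected, 180.0, -30.0, 5.0 / 3600.0)
```

Each entry in queries would then be submitted to the TAP service
separately, so the client gets one response per table rather than one
large combined file.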

> I can't see either how a footprint would solve the problem if you
> are looking in very small regions (e.g. a circle of 5arcsec around
> a position) -- the only footprint I can imagine which could work is
> a union of all the positions contained in the original catalogs;
> otherwise I don't see how your "solutions" differ from my point b. ...

I agree, a footprint service is not really what is needed here.

>>> 2.3.5: it looks strange for me that constraints can be ignored in PQL.
>>>       If a table is queried with just a constraint on TIME, and there
>>>       is no time in the table, the fact that this parameter is
>>>       ignored results in a dump of a (potentially very large) table.
>>>       Similarly for POS query (section 1.1.5) -- if the table
>>>       queried has no position, is it really a good solution to
>>>       return the whole table ? Hopefully this is not possible
>>>       with ADQL :-)
>>
>> Again, I think people misunderstand what was meant by this.  We should
>> just remove this from PQL as it is specific to the semantics of SIA/SSA
>> whereas PQL is a table query interface.  When querying an actual table
>> the semantics want to be precise.   This is different from global data
>> discovery in SIA or whatever where the same query is posed to many
>> services, each of which may provide a different subset of metadata.
>> Precise queries cannot easily be used in such a case, rather we need an
>> iterative query which is what the S*AP interfaces provide.

> ===> I was talking of the TAP document where this is written.
>     Should therefore this remark be dropped also from the TAP document ?

I see; I was referring to the PQL document.  I do not think we should
be describing this level of detail re PQL in the main TAP document;
we should leave the details to the PQL spec and only introduce PQL
in the main TAP spec (the most important thing is to describe how
PQL is invoked from within the TAP interface).

Yes, this topic should be removed from the TAP doc and probably from
section 2 of the PQL doc as well.  These semantics mainly belong to
global data discovery in the SIA/SSA etc. interfaces.  We may need
to consider them again for the GDS query, however (probably so, as the
missing-metadata problem will come up again there).
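
The difference between the two behaviours can be sketched in a few
lines of Python.  This is only an illustration: select_rows, its
"strict" flag, and the sample table are hypothetical and appear in
neither the TAP nor the PQL document.  With strict=False an unsupported
constraint is silently dropped (the discovery-style behaviour, which
here returns the whole table); with strict=True the query is rejected.

```python
def select_rows(rows, columns, constraints, strict):
    """Filter rows by equality constraints.

    strict=False: constraints on columns the table lacks are ignored.
    strict=True:  such constraints cause the query to be rejected.
    """
    unsupported = [name for name in constraints if name not in columns]
    if strict and unsupported:
        raise ValueError("unsupported constraint(s): " + ", ".join(unsupported))
    # Keep only the constraints the table can actually evaluate.
    active = {name: value for name, value in constraints.items()
              if name in columns}
    return [row for row in rows
            if all(row[name] == value for name, value in active.items())]


# A made-up table with no time column.
table = [{"id": 1, "band": "V"}, {"id": 2, "band": "B"}]
columns = {"id", "band"}

# Lenient mode: the TIME constraint is ignored and every row comes back.
lenient = select_rows(table, columns, {"TIME": "2009-07-15"}, strict=False)
```

Calling select_rows with the same arguments and strict=True raises
ValueError instead of dumping the table.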

>>> 2.3.8: MTIME -- I still have problems with this. A service may have
>>>      some tables which have such timestamp columns (typically
>>>      TAP_SCHEMA tables) while other tables have not this information.
>>>      I can't therefore see this feature as a service-wide feature,
>>>      and the MTIME capability would need to be specified in
>>>      the TAP_SCHEMA (section 2.6.2)
>>
>> MTIME is supposed to be a parameter query, hence it need not specify
>> how update/delete/add metadata is maintained internally.
>
> ===> ... but at least it would be important to know for which tables
>     (or none or everyone) this parameter can be meaningful ?

True.  MTIME should be an optional advanced capability of any TAP (or
DAL) service, so it will not even be an issue for many simpler services.
For services which provide MTIME we could describe which tables are
supported, or require that it be supported for all data tables.

Supporting MTIME probably does not require any (externally visible)
changes to the tables; the one or two extra columns MTIME requires on
output of an MTIME query can probably be added on the fly, and only
for MTIME queries.  At least, that was what we had in mind originally.
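
A minimal Python sketch of the "added on the fly" idea follows.  It is
hypothetical throughout: the change log kept outside the table, the
output column names "mtime" and "mstatus", and the function mtime_query
are assumptions made for illustration, not part of any TAP draft.  The
stored rows are left untouched; the extra columns exist only in the
response to an MTIME query.

```python
from datetime import datetime


def mtime_query(rows, change_log, since):
    """Return rows changed at or after `since`, with two extra output columns.

    change_log maps a row id to (modification time, kind of change) and is
    kept by the service separately from the table itself.
    """
    result = []
    for row in rows:
        entry = change_log.get(row["id"])
        if entry is None:
            continue  # no recorded change for this row
        when, kind = entry
        if when >= since:
            # Copy the row and append the MTIME columns to the copy only.
            result.append({**row, "mtime": when.isoformat(), "mstatus": kind})
    return result


# Made-up table and change log.
rows = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]
change_log = {
    1: (datetime(2009, 7, 1), "add"),
    2: (datetime(2009, 7, 14), "update"),
}
changed = mtime_query(rows, change_log, datetime(2009, 7, 10))
```

An ordinary (non-MTIME) query against the same table would see only
the original "id" and "name" columns.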

 	- Doug