TAP1.0 Comments

Tue Jul 14 06:39:22 PDT 2009

>
>On Mon, 13 Jul 2009, Francois Ochsenbein wrote:
>
>> First, the question of TAP result in a single *table* : Alberto's
>> question is quite right, and I'm afraid the reduction of the
>> result to a single table will generate problems for us (vizier)
>> and likely for other services. Yes the relational model implies
>> that the result of any query is a single table -- but sticking
>> to this means that queries like "give me all objects from any
>> table in this region of the sky" are not possible. Such questions
>> however are quite frequent... How to deal with those ? I see
>> only the following alternatives if TAP sticks to a single
>> output table:
>> a. the client asks for tables existing in the service;
>>   upon the answer (7896 tables), the client generates
>>   7896 queries. Not really realistic :-(
>> b. the server creates some kind of minimal common schema
>>   between all these tables -- in practice this can only be
>>   the position and the table name (i.e. a 3 column table).
>>   But then you have to get more details about each result,
>>   details concerning data as well as metadata.
>>   Therefore you still have to generate many 'children' queries.
>>
>> Or should services like vizier give up on TAP ?
>
>This is an important use case, but not really a conventional (relational)
>table access problem.  It is getting more into the domain of the other
>DAL services which have data models.  Some possible approaches:
>
>     o	For this specific case (find tables with data in some region) PQL
> 	could be used since it has a data model.  For example, query
> 	TAP_SCHEMA.tables with POS,SIZE or REGION specifying the region
> 	of interest.  Other simple constraints could be specified as well.
>
>     o	More generally we could use the Generic Dataset (GDS) query.
> 	The GDS (Observation) data model can describe any kind of
> 	dataset, including tables (also images, spectra, etc.).  So if
> 	Vizier provides a global index table based upon the GDS model
> 	it could be queried with either PQL or ADQL in TAP.
>
>     o	A footprint service could also be used, although this is much
> 	the same here as a GDS query using REGION.
>
>In both of these cases the response is a single table.	In the first
>case it contains TAP_SCHEMA.tables metadata.  In the second case it
>contains GDS metadata providing a richer description of the tables,
>with the possibility of data links pointing to either the table files
>(if small) or to services which can be used to access the data.

Doug,

What is the Generic Dataset (GDS) query ? Where is it described ?
I couldn't find any note or document describing this... 

I also can't see how a footprint would solve the problem when you
are looking at very small regions (e.g. a circle of 5 arcsec around
a position) -- the only footprint I can imagine working is the
union of all the positions contained in the original catalogs.
Otherwise I don't see how your "solutions" differ from my point b.
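
To make point b. concrete, here is a minimal sketch in Python. Everything in
it is an assumption made for illustration: the index rows, the table names
(cat_a, cat_b, cat_c), and the column names ra/dec used in the generated
ADQL. It only shows that a 3-column index (position plus table name) answers
the cone search, yet still leaves the client with one 'child' query per
matching table.

```python
import math

# Hypothetical "minimal common schema": (ra_deg, dec_deg, table_name).
INDEX = [
    (10.00000, 20.00000, "cat_a"),
    (10.00050, 20.00020, "cat_b"),
    (10.00010, 19.99990, "cat_a"),
    (55.00000, -30.00000, "cat_c"),
]


def ang_sep_deg(ra1, dec1, ra2, dec2):
    """Angular separation in degrees, using the haversine formula."""
    r1, d1, r2, d2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (math.sin((d2 - d1) / 2) ** 2
         + math.cos(d1) * math.cos(d2) * math.sin((r2 - r1) / 2) ** 2)
    return math.degrees(2 * math.asin(math.sqrt(h)))


def tables_in_cone(index, ra, dec, radius_arcsec):
    """Names of the tables having at least one row inside the cone."""
    radius_deg = radius_arcsec / 3600.0
    return sorted({name for (r, d, name) in index
                   if ang_sep_deg(ra, dec, r, d) <= radius_deg})


def child_queries(tables, ra, dec, radius_arcsec):
    """One ADQL query string per matching table.

    Assumes every table exposes its position in columns named ra and dec,
    which real catalogues do not guarantee.
    """
    radius_deg = radius_arcsec / 3600.0
    return [
        f"SELECT * FROM {t} WHERE 1=CONTAINS(POINT('ICRS', ra, dec), "
        f"CIRCLE('ICRS', {ra}, {dec}, {radius_deg}))"
        for t in tables
    ]


if __name__ == "__main__":
    hits = tables_in_cone(INDEX, 10.0, 20.0, 5.0)
    for query in child_queries(hits, 10.0, 20.0, 5.0):
        print(query)
```

The number of children queries grows with the number of tables matched, and
each child result still carries its own metadata, which is the difficulty
raised in point b.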

>
>> 2.3.5: it looks strange to me that constraints can be ignored in PQL.
>>       If a table is queried with just a constraint on TIME, and there
>>       is no time in the table, the fact that this parameter is
>>       ignored results in a dump of a (potentially very large) table.
>>       Similarly for POS query (section 1.1.5) -- if the table
>>       queried has no position, is it really a good solution to
>>       return the whole table ? Hopefully this is not possible
>>       with ADQL :-)
>
>Again, I think people misunderstand what was meant by this.  We should
>just remove this from PQL as it is specific to the semantics of SIA/SSA
>whereas PQL is a table query interface.  When querying an actual table
>the semantics want to be precise.   This is different from global data
>discovery in SIA or whatever where the same query is posed to many
>services, each of which may provide a different subset of metadata.
>Precise queries cannot easily be used in such a case, rather we need an
>iterative query which is what the S*AP interfaces provide.

===> I was talking about the TAP document, where this is written.
     Should this remark therefore also be dropped from the TAP document ?

>> 2.3.8: MTIME -- I still have problems with this. A service may have
>>      some tables which have such timestamp columns (typically
>>      TAP_SCHEMA tables) while other tables do not have this information.
>>      I can't therefore see this feature as a service-wide feature,
>>      and the MTIME capability would need to be specified in
>>      the TAP_SCHEMA (section 2.6.2)
>
>MTIME is supposed to be a parameter query, hence it need not specify
>how update/delete/add metadata is maintained internally.

===> ... but it would at least be important to know for which tables
     (none, some, or all) this parameter is meaningful.
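
As a rough illustration of what per-table MTIME information could look like,
here is a small Python sketch. The extra "timestamp column" attribute and
the table and column names are purely hypothetical; nothing like this is
defined in the TAP draft. It only shows that a client could then tell
whether MTIME is meaningful for none, some, or all of the tables.

```python
# Hypothetical per-table attribute, as it might appear in a listing similar
# to TAP_SCHEMA.tables: the name of the timestamp column usable for MTIME,
# or None when the table carries no such information.
TABLES = {
    "TAP_SCHEMA.tables": "last_modified",
    "TAP_SCHEMA.columns": "last_modified",
    "cat_a": None,
}


def mtime_support(tables):
    """Return 'none', 'some' or 'all' according to per-table MTIME support."""
    supported = [name for name, col in tables.items() if col is not None]
    if not supported:
        return "none"
    if len(supported) == len(tables):
        return "all"
    return "some"


def can_use_mtime(tables, table_name):
    """True only if the named table declares a timestamp column."""
    return tables.get(table_name) is not None


if __name__ == "__main__":
    print(mtime_support(TABLES))
    print(can_use_mtime(TABLES, "cat_a"))
```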

--Francois
=======================================================================
Francois Ochsenbein    ------   Observatoire Astronomique de Strasbourg
   11, rue de l'Universite 67000 STRASBOURG  Phone: +33-(0)390 24 24 29
Email: francois at astro.u-strasbg.fr (France)    Fax: +33-(0)390 24 24 17
=======================================================================