Progress

Francois Ochsenbein francois at vizier.u-strasbg.fr
Mon Feb 23 08:42:43 PST 2009



Hi,

I'm really happy to see that there is now some convergence on
how to finalize the TAP standard (I feel very uncomfortable
when the discussion is led by psychological reactions, so I
apologize for taking time to react).

I like Pat's new document, and hope the final version
is now not far off. I enclose comments about this new version,
more or less in order of importance (from my point of view :-)

1. About VOTable INFO (section 2.9.1): as mentioned before (end
   of last year), plain text will not be allowed directly within
   the INFO tag in VOTable 1.2, but only within a DESCRIPTION
   sub-element, e.g.
   <INFO name="QUERY_STATUS" value="OK">
    <DESCRIPTION>Successful query</DESCRIPTION>
   </INFO>
   (the textual comment is not essential in this example anyway :-)
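
   A minimal sketch of how a service might emit this form, using
   Python's standard xml.etree.ElementTree. The helper name
   make_status_info is hypothetical; only the INFO/DESCRIPTION layout
   comes from the example above.

```python
import xml.etree.ElementTree as ET


def make_status_info(status, message=None):
    """Build an INFO element whose free text lives in a DESCRIPTION child.

    Hypothetical helper: the INFO element itself carries no text,
    only the name/value attributes.
    """
    info = ET.Element("INFO", {"name": "QUERY_STATUS", "value": status})
    if message is not None:
        desc = ET.SubElement(info, "DESCRIPTION")
        desc.text = message
    return info


info = make_status_info("OK", "Successful query")
print(ET.tostring(info, encoding="unicode"))
```

   A client can then read the status from the "value" attribute and
   treat the DESCRIPTION text as optional.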

2. The TAP_SCHEMA: I still feel it's incomplete and could be improved:

   a) in TAP_SCHEMA.tables:
      * what the 'output' table_type means is not obvious (a table
        created on the fly, e.g. by running a model?)
      * some kind of standardized description seems to be missing:
        with 20,000 tables, selecting the interesting ones
        does not look easy on the basis of this schema. In the
        case of VizieR we have keywords, wavelength domain (exists
        in VOResource), usual abbreviations (exists in VOResource),
        relations between tables (tables belonging to the same
        'astronomical catalog')
      * some statistical properties are quite important
        for applications (number of tuples, at least approximate;
        sky coverage if relevant, to quote those existing in
        VOResource)

   b) in TAP_SCHEMA.columns:
      * "primary" and "std": I'm not sure I understand what they
        mean. I guess they have to do with the 'importance' of
        a column? In this case, these attributes are similar
        to the 'VERB' option of the ConeSearch, and the two
        could be replaced by a single one ('visibility'
        or 'importance').
        Unless 'std' stands for 'non-virtual' (i.e. actual columns
        in the database, vs columns computed before delivering the
        output)?
      * "indexed" does not always apply to a single column --
        indexes may be built on a set of columns (e.g. an
        index on (dec, ra))
      * domains are missing (a domain describes the possible
        values of a column, as a range or a list of values,
        and whether null/blank values occur). Quite
        useful for applications, too.

   -- important properties of the relational tables are missing
      from the TAP_SCHEMA, especially uniqueness properties
      (the keys of the relational schema). As in the case of indexes,
      keys are defined as a combination of columns in a table,
      and therefore can't just be an extra attribute of
      TAP_SCHEMA.columns
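
   To illustrate why keys and indexes need their own tables, here is
   a rough sketch using an in-memory SQLite database. The table names
   column_sets and column_set_members, and the catalog "cat", are
   invented for this example and are not part of the TAP draft.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- One row per key or index, however many columns it spans.
CREATE TABLE column_sets (
    set_id     TEXT PRIMARY KEY,
    table_name TEXT NOT NULL,
    kind       TEXT NOT NULL CHECK (kind IN ('key', 'index'))
);
-- One row per member column; position preserves the column order.
CREATE TABLE column_set_members (
    set_id      TEXT NOT NULL REFERENCES column_sets(set_id),
    column_name TEXT NOT NULL,
    position    INTEGER NOT NULL,
    PRIMARY KEY (set_id, position)
);
""")

# Hypothetical catalog "cat": a single-column key and an index on (dec, ra).
conn.executemany(
    "INSERT INTO column_sets VALUES (?, ?, ?)",
    [("cat_pk", "cat", "key"), ("cat_pos", "cat", "index")],
)
conn.executemany(
    "INSERT INTO column_set_members VALUES (?, ?, ?)",
    [("cat_pk", "id", 1), ("cat_pos", "dec", 1), ("cat_pos", "ra", 2)],
)


def columns_of(conn, set_id):
    """Return the ordered column names that make up one key or index."""
    rows = conn.execute(
        "SELECT column_name FROM column_set_members "
        "WHERE set_id = ? ORDER BY position",
        (set_id,),
    )
    return [row[0] for row in rows]


print(columns_of(conn, "cat_pos"))
```

   A single boolean attribute per column could not express that the
   index covers the pair (dec, ra) in that order.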

3. Following Bob's comments, I would also prefer a word other
   than 'LANG' to specify the way the query is coded. QUERYMETHOD
   looks pretty clear, even though it seems somewhat long...

4. For the output format (section 2.3.6) I'm wondering why
   tab-separated values is not included -- it is generally easier
   to deal with, as many columns do include commas in their
   contents, while control characters like tab (generally)
   do not occur in table columns. In the final document,
   it would be useful to write down the MIME type corresponding
   to each format.
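
   A small sketch of the difference, using Python's standard csv
   module. The sample row is made up. The registered MIME types for
   these two formats are text/csv and text/tab-separated-values.

```python
import csv
import io

# Made-up sample: the second field contains a comma but no tab.
rows = [["name", "remarks"], ["NGC 1068", "Seyfert 2, barred spiral"]]


def serialize(rows, delimiter):
    """Write rows with the given delimiter, quoting only where required."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter=delimiter, lineterminator="\n")
    writer.writerows(rows)
    return buf.getvalue()


csv_text = serialize(rows, ",")
tsv_text = serialize(rows, "\t")

# A naive split recovers the TSV fields, but breaks the quoted CSV field.
print(tsv_text.splitlines()[1].split("\t"))
print(csv_text.splitlines()[1].split(","))
```

   With tabs as separators the data line needs no quoting, so even a
   plain split on the separator returns the two original fields; the
   CSV line has to be quoted and needs a real CSV parser.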
   
5. Finally, I feel there should be some words about TAP and
   ConeSearch -- after all ConeSearch is doing a fraction of
   what is described in the TAP document.
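
   As a rough illustration of the overlap, a ConeSearch request
   (RA, DEC, SR) can be written as an ADQL query with the CONTAINS,
   POINT and CIRCLE functions. The function name cone_to_adql, the
   table "mycat" and the default column names are assumptions made
   for this sketch only.

```python
def cone_to_adql(table, ra, dec, sr, ra_col="ra", dec_col="dec"):
    """Express a cone search (center ra/dec, radius sr, in degrees) in ADQL.

    Illustrative only: table and column names are inserted as given,
    without any validation or escaping.
    """
    ra, dec, sr = float(ra), float(dec), float(sr)
    return (
        f"SELECT * FROM {table} "
        f"WHERE 1 = CONTAINS(POINT('ICRS', {ra_col}, {dec_col}), "
        f"CIRCLE('ICRS', {ra!r}, {dec!r}, {sr!r}))"
    )


query = cone_to_adql("mycat", 10.0, -20.5, 0.1)
print(query)
```

   The reverse is not true: a general ADQL query has no ConeSearch
   equivalent, which is why the relation between the two deserves a
   few words in the document.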

Thanks to all contributors!
Francois
=======================================================================
Francois Ochsenbein    ------   Observatoire Astronomique de Strasbourg
   11, rue de l'Universite 67000 STRASBOURG  Phone: +33-(0)390 24 24 29
Email: francois at astro.u-strasbg.fr (France)    Fax: +33-(0)390 24 24 17
=======================================================================


