Statistics metadata in TAP
Thomas Boch
thomas.boch at astro.unistra.fr
Fri Oct 21 09:15:05 CEST 2016
Hi Gregory,
I like the idea and I also like your histograms.
Did you think about how you could expose the histograms through TAP?
Cheers,
Thomas
Le 20/10/2016 à 23:07, Gregory Mantelet a écrit :
>
> Dear DAL and Apps members,
>
> Since I do not attend to this interop, I would like to highlight
> quickly one of my last development concerning TAP because I think it
> may be interested to either do the same in your own TAP service or
> merely use it. As suggested by the title of this email it is about
> adding metadata in TAP.
>
> (I send this email also to Apps because of the last point I make in
> this email: a compatibility with a new feature of TOPCAT)
>
>
> ** Columns metadata
>
> The idea is to add basic statistics like a count, min, max, ... for
> some numerical columns of tables published in a TAP service. For that,
> I have just added the following columns in TAP_SCHEMA.columns:
>
> - min_value
> - max_value
> - mean
> - std_dev
> - q1 (i.e. first quartile)
> - median (i.e. second quartile)
> - q3 (i.e. third quartile)
> - filling (number of rows having a NOT NULL value for this
> column)
>
> Except for "filling" which must be an integer (INTEGER or BIGINT in
> PostgreSQL), I have chosen to set all these columns as DOUBLE
> PRECISION since most of the columns to describe are, in the "worst"
> case, double values.
>
> When no statistics can be provided for a column, all these additional
> metadata would be NULL.
>
>
> ** Tables metadata
>
> In addition, I have also added another column in TAP_SCHEMA.tables:
>
> - row_count (of type INTEGER or BIGINT)
>
>
> ** VOSI description of tables
>
> Since in TAP all tables and columns metadata MUST be the same in
> TAP_SCHEMA and /tables, I have also updated our /tables resource.
>
> Besides, on a recommendation of Mark Taylor, I designed and linked a
> simple XSD schema in order to have a valid XML document. You can find
> this schema at the following address:
>
> http://gaia.ari.uni-heidelberg.de/tap-stats.xsd
>
>
> ** Visibility in TOPCAT
>
> Thanks to Mark Taylor, any custom metadata (non-standard TAP columns)
> can be displayed in the last version of TOPCAT. Thus, all the
> statistics described above can be seen there for our Gaia TAP service
> (n.b. you can find this TAP service easily in the registry with the
> keywords "Gaia" and "ARI", but in case you can not, here is the root
> access URL: http://gaia.ari.uni-heidelberg.de/tap).
>
>
> ** Last words...
>
> According to me all these basic statistics may be useful to discover
> the content of a table, especially when this one is as large as Gaia,
> PPMXL, 2MASS, ... It can indeed prevent some users to perform
> apparently simple and short queries such as "SELECT COUNT(*) FROM
> a_big_table" which on the contrary to what most people think is not
> often a quick query on large tables. Having already computed such
> information is then an economy of time and resources for the users and
> the server.
>
> Finally, I am not trying to convince anybody to have such metadata,
> but I just want to highlight a possible extension of TAP helping in
> simple data discovery. Besides, this use-case also demonstrates how
> easy it could be to add more simple metadata inside a TAP service. So
> maybe it could be interested, if possible, to write an appendix about
> that in the next version of TAP or just as an IVOA note. What do you
> think?
>
> If anybody has questions or wants further details about the TAP
> "extension" I presented here, do not hesitate to ask ; I am not at the
> interop, but I am fully available by email
>
> Regards,
> Grégory
>
>
> PS: For those who are interested, I also provide histograms and
> sky-maps (using Healpix) for most of the published columns on the page
> http://gaia.ari.uni-heidelberg.de/tap/tables. Both can be downloaded
> as images but also as tables that you can then display/process as you
> want (e.g. display the histogram in TOPCAT, display and navigate
> inside the Healpix map in Aladin, ...).
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.ivoa.net/pipermail/dal/attachments/20161021/40a6429f/attachment.html>
More information about the dal
mailing list