vot-cli: VOTable metadata edition and format conversion on the command-line
Francois-Xavier PINEAU
francois-xavier.pineau at astro.unistra.fr
Mon Apr 15 17:55:30 CEST 2024
Dear App,
This email is to bring recent changes in vot-cli to your attention.
vot-cli is a young IVOA command-line tool to convert VOTables to and from
both standard and non-standard formats, and to edit VOTable metadata.
In short, see:
https://github.com/cds-astro/cds-votable-rust/tree/main/crates/cli
# Context
A new VOTable parser, in Rust, was presented at the IVOA meeting in
Bologna a year ago.
Back then, a limited CLI demonstration tool existed to convert small
VOTables back and forth between standard XML and non-standard
JSON/TOML/YAML while preserving all information (except comments) and
the order of the VOTable elements.
A year later (now), the code has been partly restructured and cleaned up,
bugs and a few misconceptions (read the full XSD, do not trust the
simplified diagrams!) have been fixed, and new features have been added.
And (the object of this email) an enhanced version of vot-cli was
released a few days ago.
# Motivation
Firstly, both the library and the CLI tool meet internal needs (for
example, we use vot-cli to convert VOTable metadata into TOML that we
then edit both manually and by script).
In addition, vot-cli has been made to:
* document some possible usages of the Rust VOTable API
* evaluate performance with respect to existing tools
* be used in discussions about what we want/do not want if the IVOA
decides to define a JSON version of VOTable
* ...
# vot-cli features
First, vot-cli:
* *is not* a VOTable validator and is (so far) VOTable version agnostic
+ (for a strict validator, see
https://www.star.bris.ac.uk/~mbt/stilts/sun256/votlint.html)
* *does not* check the validity of UCDs, units, ...
* *does not* support fancy XML like declared entities
but it allows one to:
* *convert* VOTables that fit *in memory* from/to
standard *TABLEDATA/BINARY/BINARY2* + non-standard *TOML/JSON/YAML*
* *convert large single-table VOTables in streaming* from
TABLEDATA/BINARY/BINARY2 to TABLEDATA/BINARY/BINARY2/CSV
+ e.g. *5 seconds* to convert a *2.8 GB* TABLEDATA file to a 1.1 GB
CSV file (on a server with an SSD RAID and using *multi-threading*)
* retrieve a 'structure' view of a VOTable efficiently
* *edit metadata* information (add/update/rm almost any tag except the
DATA part) efficiently
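
To make "in streaming" in the list above concrete, here is a minimal Python
sketch that writes the first TABLEDATA table of a VOTable to CSV row by row,
without keeping the whole table in memory. It is only an illustration of the
idea, not vot-cli's implementation (vot-cli is written in Rust and also
handles BINARY/BINARY2 and multi-threading). It uses the Python standard
library only; the names `tabledata_to_csv` and `local_name` and the sample
document are made up for this example, and arrays, nulls and multi-table
files are not handled.

```python
import csv
import io
import xml.etree.ElementTree as ET


def local_name(tag):
    """Drop the XML namespace: '{ns}TR' -> 'TR'."""
    return tag.rsplit("}", 1)[-1]


def tabledata_to_csv(src, dst):
    """Stream the first TABLEDATA of `src` (binary file) to `dst` (text file)."""
    writer = csv.writer(dst, lineterminator="\n")
    field_names = []
    tabledata = None
    for event, elem in ET.iterparse(src, events=("start", "end")):
        name = local_name(elem.tag)
        if event == "start":
            if name == "TABLEDATA":
                # All FIELDs precede DATA, so the header is complete here.
                tabledata = elem
                writer.writerow(field_names)
        elif name == "FIELD":
            field_names.append(elem.get("name", ""))
        elif name == "TR" and tabledata is not None:
            writer.writerow([td.text or "" for td in elem])
            # Drop the rows parsed so far to keep memory usage bounded.
            tabledata.clear()
        elif name == "TABLEDATA":
            break  # single-table sketch: stop after the first table


# Hypothetical two-column sample document, for illustration only.
sample = b"""<?xml version="1.0"?>
<VOTABLE version="1.4" xmlns="http://www.ivoa.net/xml/VOTable/v1.3">
  <RESOURCE>
    <TABLE>
      <FIELD name="ra" datatype="double"/>
      <FIELD name="dec" datatype="double"/>
      <DATA><TABLEDATA>
        <TR><TD>1.5</TD><TD>-2.0</TD></TR>
        <TR><TD>3.25</TD><TD></TD></TR>
      </TABLEDATA></DATA>
    </TABLE>
  </RESOURCE>
</VOTABLE>"""

out = io.StringIO()
tabledata_to_csv(io.BytesIO(sample), out)
print(out.getvalue())
```

The point of the sketch is the access pattern: the metadata (FIELDs) is read
once, then each row is written and discarded as soon as it has been parsed.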
vot-cli will continue to evolve, depending on the needs.
More information (install instructions, help messages, examples, ...) here
==> https://github.com/cds-astro/cds-votable-rust/tree/main/crates/cli <==
# Main alternatives I am aware of
* astropy: one can do almost anything one wants on a VOTable using Python
+ main limitations:
- not a CLI: you have to code
- not (yet) possible to deal with very large tables / performance?
* STILTS: one can convert any file (even very large and multi-table)
efficiently, with a lot of options
+ main limitations:
- if data/metadata editing is required, the VOTable structure may
change (see votcopy vs tpipe)
- editing of VOTable-specific metadata is not always possible
(e.g. adding LINKs; see tpipe with colmeta and setparam for
more details)
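
For readers less familiar with VOTable-specific metadata, the following
Python sketch shows what "adding a LINK" to a FIELD amounts to at the XML
level while leaving the DATA part untouched. It is neither vot-cli nor
STILTS code: it relies on the Python standard library only, the function
name `add_link_to_field`, the sample document and the example URL are made
up, and the namespace URI is assumed to be the one used by VOTable 1.3+
documents. Like any generic XML round trip, it does not preserve comments
or the original formatting.

```python
import xml.etree.ElementTree as ET

# Assumption: the document uses the VOTable 1.3+ namespace.
VOT_NS = "http://www.ivoa.net/xml/VOTable/v1.3"


def add_link_to_field(votable_xml, field_name, href):
    """Return `votable_xml` with a LINK appended to the FIELD named `field_name`."""
    # Serialize VOTable elements without an 'ns0:' prefix.
    ET.register_namespace("", VOT_NS)
    root = ET.fromstring(votable_xml)
    for field in root.iter(f"{{{VOT_NS}}}FIELD"):
        if field.get("name") == field_name:
            # In the VOTable schema, LINK comes last among a FIELD's children.
            ET.SubElement(field, f"{{{VOT_NS}}}LINK", {"href": href})
    return ET.tostring(root, encoding="unicode")


# Hypothetical sample document, for illustration only.
doc = """<VOTABLE version="1.4" xmlns="http://www.ivoa.net/xml/VOTable/v1.3">
  <RESOURCE>
    <TABLE>
      <FIELD name="ra" datatype="double"/>
      <FIELD name="dec" datatype="double"/>
      <DATA><TABLEDATA>
        <TR><TD>1.5</TD><TD>-2.0</TD></TR>
      </TABLEDATA></DATA>
    </TABLE>
  </RESOURCE>
</VOTABLE>"""

print(add_link_to_field(doc, "ra", "http://example.org/ra-doc"))
```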
# vot-cli main limitations
* not as tested and robust as astropy and STILTS (especially on arrays,
rare datatypes, fancy XML usages)
* no streaming mode (yet) for multi-table files (is it a common use case?)
* no editing of the data part (cannot remove/add/... columns)
* does not support conversions that read/write external FITS files
* ...
# About performance
The single-threaded version of vot-cli shows performance similar to
stilts votcopy.
The multi-threaded version, `vot sconvert` with the `--parallel` option,
may increase performance by up to 10x depending on the hardware, the type
of conversion, the VOTable, ...
Hoping to have aroused the curiosity of a few people...
fx