Request for VOTable code review
Tom Donaldson
tdonaldson at stsci.edu
Mon Feb 3 15:24:24 CET 2020
Dear IVOA,
I'm still looking for one (or more) volunteer(s) to do a code review of this astropy.io.votable pull request:
https://github.com/astropy/astropy/pull/9505
I certainly thank Markus for the comments; indeed, this PR made it clear that we have some work to do on specifying how strings are serialized in VOTable. If he or anyone else can spare a few cycles to look through the code and enter a review on GitHub, I'd be most grateful.
The VOTable code does have a bit of a learning curve at first, but anyone with both Python and VOTable familiarity should be fine. Please feel free to ask any questions here or on the GitHub PR.
The set of people who know both Python and one or more VO standards is a limited resource, but one that will be very important moving forward to keep astropy and pyvo features robust and current with VO standards, and to bring in new features and reference implementations to pyvo. If you or your institution can spare some cycles for this effort, that would be terrific!
Best Regards,
Tom
On 1/8/20, 3:41 PM, "apps-bounces at ivoa.net on behalf of Markus Demleitner" <apps-bounces at ivoa.net on behalf of msdemlei at ari.uni-heidelberg.de> wrote:
Dear Apps,
On Wed, Dec 18, 2019 at 04:08:50PM +0000, Tom Donaldson wrote:
> Astropy and I would greatly appreciate if someone could have a look
> at this code, and enter a review here:
> https://github.com/astropy/astropy/pull/9505
I've put in an informal comment for now, in particular pointing to
our previous discussion on allowing utf-8 in votable char (which I
still think is a good idea):
http://mail.ivoa.net/pipermail/apps/2014-October/001010.html
In sum: I'm convinced exposing char[] as strings rather than bytes is
absolutely the right thing to do, and they'd even have my vote for
decoding from utf-8 rather than (VOTable-correct) ASCII.
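To make the two decoding choices concrete, here is a minimal sketch in plain Python. The helper name decode_char_field and its strict_ascii flag are hypothetical and are not part of astropy.io.votable; the sketch only assumes that a char[] cell arrives as raw bytes.

```python
def decode_char_field(raw: bytes, strict_ascii: bool = False) -> str:
    """Decode the raw bytes of a char[] cell into a Python str.

    Hypothetical helper for illustration only. strict_ascii=True mimics
    the VOTable-correct reading (ASCII only); the default mimics the
    more lenient UTF-8 reading discussed above.
    """
    encoding = "ascii" if strict_ascii else "utf-8"
    return raw.decode(encoding)


plain = b"NGC 1275"              # pure ASCII: both readings agree
accented = "Å".encode("utf-8")   # two bytes, neither of them ASCII

print(decode_char_field(plain))
print(decode_char_field(accented))
```

Since ASCII is a strict subset of UTF-8, the lenient reading changes nothing for conforming files; it only matters for cells that strict ASCII decoding would reject with a UnicodeDecodeError.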
> - Lack of direction on encoding
> - Inconsistency on sizing between TABLEDATA and BINARY serializations
...which, incidentally, is something we don't get around, and that we
already have with unicodeChar (no XML document I've ever seen uses
UCS-2, but it's what we require in BINARY2; I'll mention in passing
that UCS-2 these days isn't part of unicode any more and then pretend
I hadn't said that).
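A small sketch of where the sizing mismatch comes from, assuming the XML document is encoded in UTF-8 and using Python's utf-16-be codec as a stand-in for UCS-2 (the two agree for characters in the Basic Multilingual Plane; Python has no dedicated UCS-2 codec). The function name field_sizes is made up for this illustration.

```python
def field_sizes(value: str) -> dict:
    """Return the length of one string under three different measures."""
    return {
        # what a user would think of as the length of the string
        "characters": len(value),
        # bytes the text occupies in a UTF-8 encoded XML document
        "utf8_bytes": len(value.encode("utf-8")),
        # bytes at two per character, as a UCS-2 stand-in
        "utf16be_bytes": len(value.encode("utf-16-be")),
    }


print(field_sizes("M31"))
print(field_sizes("Ångström"))
```

For pure ASCII text the character count and the UTF-8 byte count coincide, so the mismatch only becomes visible once non-ASCII characters show up.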
There is, however, a more sinister question here (related, but it
shouldn't block the astropy PR): What do you serialise python3
strings *into*? Since you can't be sure that there's just ASCII in
these, it can't blindly be char[]. On the other hand,
unicodeChar[] as a VOTable type isn't pretty either, starting with
wasting one byte per char in >>99% of the strings in use in
astronomy.
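To illustrate the dilemma, here is one naive, data-dependent way to choose, sketched in plain Python. It is not what the PR does and not a proposal from this thread; pick_votable_datatype is a hypothetical name, and str.isascii needs Python 3.7 or later.

```python
def pick_votable_datatype(values) -> str:
    """Return 'char' if every string is pure ASCII, else 'unicodeChar'.

    Hypothetical helper for illustration only.
    """
    return "char" if all(v.isascii() for v in values) else "unicodeChar"


print(pick_votable_datatype(["M31", "NGC 1275"]))   # all ASCII
print(pick_votable_datatype(["M31", "Ångström"]))   # one non-ASCII value
```

The obvious drawback is that a column's declared type then depends on the data it happens to hold, so a single non-ASCII value switches the whole column to the wider type.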
Since I don't have an idea for how to solve this that I like: Does
anyone here know of an elegant solution to this (i.e., have nice,
compact chars by default but let users say "I want non-ASCII here,
really" where necessary) somewhere?
-- Markus