Request for VOTable code review

Tom Donaldson tdonaldson at stsci.edu
Mon Feb 3 15:24:24 CET 2020


Dear IVOA,

I'm still looking for one or more volunteers to do a code review of this astropy.io.votable pull request:
https://github.com/astropy/astropy/pull/9505

Many thanks to Markus for his comments. Indeed, this PR made it clear that we have some work to do on specifying how strings are serialized in VOTable.  If he or anyone else can spare a few cycles to look through the code and enter a review on GitHub, I'd be most grateful.

The VOTable code has a bit of a learning curve at first, but anyone familiar with both Python and VOTable should be fine.  Please feel free to ask any questions here or on the GitHub PR.

The set of people who know both Python and one or more VO standards is a limited resource, but one that will be very important moving forward to keep astropy and pyvo features robust and current with VO standards, and to bring in new features and reference implementations to pyvo.  If you or your institution can spare some cycles for this effort, that would be terrific!

Best Regards,
Tom


On 1/8/20, 3:41 PM, "apps-bounces at ivoa.net on behalf of Markus Demleitner" <apps-bounces at ivoa.net on behalf of msdemlei at ari.uni-heidelberg.de> wrote:

    Dear Apps,
    
    On Wed, Dec 18, 2019 at 04:08:50PM +0000, Tom Donaldson wrote:
    > Astropy and I would greatly appreciate if someone could have a look
    > at this code, and enter a review here:
    > https://github.com/astropy/astropy/pull/9505
    
    I've put in an informal comment for now, in particular pointing to
    our previous discussion on allowing utf-8 in votable char (which I
    still think is a good idea):
    http://mail.ivoa.net/pipermail/apps/2014-October/001010.html
    
    In sum: I'm convinced exposing char[] as strings rather than bytes is
    absolutely the right thing to do, and they'd even have my vote for
    decoding from utf-8 rather than (VOTable-correct) ASCII.
    
    > - Lack of direction on encoding
    > - Inconsistency on sizing between TABLEDATA and BINARY serializations
    
    ...which, incidentally, is something we don't get around, and that we
    already have with unicodeChar (no XML document I've ever seen uses
    UCS-2, but it's what we require in BINARY2; I'll mention in passing
    that UCS-2 these days isn't part of unicode any more and then pretend
    I hadn't said that).
    
    
    There is, however, a more sinister question here (related, but it
    shouldn't block the astropy PR): What do you serialise python3
    strings *into*?  Since you can't be sure that there's just ASCII in
    these, it can't blindly be char[].  On the other hand,
    unicodeChar[] as a VOTable type isn't pretty either, starting with
    wasting one byte per char in >>99% of the strings in use in
    astronomy.
    
    Since I don't have an idea for how to solve this that I like: Does
    anyone here know of an elegant solution to this (i.e., have nice,
    compact chars by default but let users say "I want non-ASCII here,
    really" where necessary) somewhere?
    
             -- Markus
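
To make the trade-off Markus describes concrete, here is a minimal sketch of one possible approach: default to char and fall back to unicodeChar only when a column actually contains non-ASCII text. The helper names pick_votable_datatype and binary_size are hypothetical, invented for illustration; they are not part of astropy.io.votable or any VO library, and the size estimate assumes one byte per character for char and two bytes per character (UCS-2 style) for unicodeChar.

```python
def pick_votable_datatype(values):
    """Hypothetical helper: choose a VOTable datatype for a column of str.

    Returns "char" when every value is pure ASCII, otherwise "unicodeChar".
    """
    return "char" if all(v.isascii() for v in values) else "unicodeChar"


def binary_size(value, datatype):
    """Hypothetical helper: bytes needed for one string in a binary serialization.

    Assumes 1 byte per character for char (ASCII) and 2 bytes per character
    for unicodeChar. UTF-16-BE stands in for UCS-2 here; the two agree only
    for characters in the Basic Multilingual Plane.
    """
    if datatype == "char":
        return len(value.encode("ascii"))
    return len(value.encode("utf-16-be"))


# Typical astronomy strings stay compact; one non-ASCII value flips the column.
ascii_column = ["M31", "NGC 1068"]
mixed_column = ["M31", "Ångström"]

print(pick_votable_datatype(ascii_column))
print(pick_votable_datatype(mixed_column))
print(binary_size("M31", "char"), binary_size("M31", "unicodeChar"))
```

A per-column check like this keeps the common case compact, but it makes the declared type depend on the data, which may or may not be acceptable for a given writer.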
    


