Convert unicode index to long, not int, in get_char_index #7768

AdamWill · 2017-01-09T01:25:27Z

There's an error in the PyFT2Font.get_char_index() method
added in 2d56ffeb. The type for the unicode index to be sent
to FT_Get_Char_Index is FT_ULong - an unsigned long - but
the PyArg_ParseTuple call that converts it from Python used
I in the format string, which converts a Python int to a C
unsigned int, not a C unsigned long. This doesn't seem to cause
a problem on little-endian arches, but it results in completely
incorrect conversion on big-endian arches, which in turn would
result in wrong glyphs, unfound glyphs, and even in an infinite
recursion in UnicodeFonts._get_glyph.

To get correct conversion we must use k, the specifier for a C
unsigned long, instead of I.

Ref: https://docs.python.org/3/c-api/arg.html#numbers

AdamWill · 2017-01-09T01:34:25Z

BTW, I figured this out by modifying mathtext.py to log the uniindex value it was sending to get_char_index, and modifying get_char_index to log the ccode value, which obviously should be the same. On a big-endian arch (ppc64), it was not the same at all. This change makes it work - the Python and C functions log the same values, and all the conversion errors I was seeing before go away.

AdamWill · 2017-01-09T01:59:44Z

@mdboom (author of 2d56ffe)

QuLogic

Makes sense; on LE arches, the first 4 bytes would be filled, and assuming a zeroing stack, it should result in the same value. On BE arches, the first 4 bytes are the top-most bytes, so everything would be multiplied by 2**32.

Since it's an unsigned long, probably this worked fine on 32-bit BE arches (if any exist and were used.)
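The byte-level reasoning above can be checked with a small simulation. This is only a sketch, not the matplotlib code: the helper name `parse_as_uint_into_ulong` is hypothetical, and it assumes the `unsigned long` slot starts out zeroed, as the comment above does. It writes a 4-byte unsigned int at the start of the slot and then reads the whole slot back in the same byte order.

```python
import struct

def parse_as_uint_into_ulong(value, byte_order, long_size=8):
    """Simulate storing a 4-byte unsigned int at the start of a zeroed
    unsigned long slot, then reading the whole slot back.

    Hypothetical helper for illustration only.
    """
    prefix = "<" if byte_order == "little" else ">"
    slot = bytearray(long_size)                   # assumed zero-initialised storage
    slot[0:4] = struct.pack(prefix + "I", value)  # only the first 4 bytes are written
    fmt = prefix + ("Q" if long_size == 8 else "I")
    return struct.unpack(fmt, bytes(slot))[0]

codepoint = 0x3B1  # GREEK SMALL LETTER ALPHA

le64 = parse_as_uint_into_ulong(codepoint, "little")            # 64-bit little-endian
be64 = parse_as_uint_into_ulong(codepoint, "big")               # 64-bit big-endian
be32 = parse_as_uint_into_ulong(codepoint, "big", long_size=4)  # 32-bit big-endian

print(hex(le64), hex(be64), hex(be32))
```

Under these assumptions the little-endian and 32-bit big-endian results equal the original code point, while the 64-bit big-endian result is the code point shifted left by 32 bits, i.e. multiplied by 2**32.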

AdamWill · 2017-01-09T02:21:48Z

Note, there still appear to be several hundred other errors in the test suite on a ppc64 build. It seems like it would be a good idea to set up the CI to run the tests on a 64-bit BE arch, if at all possible, so all those problems can be fixed and future ones won't be introduced...

@QuLogic yeah, that adds up. The incorrect values were indeed very large.

QuLogic · 2017-01-09T04:12:06Z

I'm not sure there are any GitHub integrations that support ppc64; maybe copr, but I never understood how their webhooks worked.

codecov-io · 2017-01-09T05:31:11Z

Current coverage is 62.12% (diff: 100%)

Merging #7768 into master will not change coverage

@@             master      #7768   diff @@
==========================================
  Files           174        174          
  Lines         56028      56028          
  Methods           0          0          
  Messages          0          0          
  Branches          0          0          
==========================================
  Hits          34805      34805          
  Misses        21223      21223          
  Partials          0          0

Powered by Codecov. Last update eba130f...c4c0b65

AdamWill · 2017-01-09T06:33:32Z

hah, well, copr is a Fedora thing, so I guess that'd be my department...I could maybe talk to some people and see if we can hook anything up, I'll try and remember to give it a shot.

dopplershift

Good catch!

QuLogic · 2017-01-09T21:37:23Z

Please backport when merging.

On Jan 9, 2017 4:06 PM, "Ryan May" wrote:

Merged #7768.

tacaswell · 2017-01-09T22:06:34Z

Is there a public log of the test results? iirc, from looking at the failures from Debian's build system, many of the failures are in the spectra tests in mlab, which are very sensitive to small differences in floating point math.

AdamWill · 2017-01-09T22:22:22Z

Well, I did my test ppc64 build with the test suite enabled in a sandbox that Fedora releng set up for me, but I can run it again and grab the log and put it up somewhere, I'll do that.

A more long-term solution might be, as @QuLogic suggested, to set up a COPR repository which has a simple spec (and uses the bundled freetype and so on, not like the official Fedora package build) and just tries to run a build every time there's a commit to this repo (or just does it every day or something).

AdamWill · 2017-01-09T23:18:06Z

Here's the build log of the ppc64 build with tests enabled:

build.log.txt

Looking at that, I'm seeing some more fun issues with integer types...digging into that right now. The freetype FT_Int32 and FT_UInt32 types (that are used for a couple of flags) are theoretically problematic as they can be int or long, but in practice they'll almost always be int, I think.

dopplershift · 2017-01-09T23:45:13Z

Backported to 2.x in bffe631

QuLogic · 2017-01-10T03:36:57Z

The freetype FT_Int32 and FT_UInt32 types (that are used for a couple of flags) are rather problematic as their type varies by arch, I think; I think they are ints on 64-bit arches and longs on 32-bit arches...

The preference seems to be (unsigned) int unless it's not 32-bit, but unless we're talking about embedded stuff, I'm not sure Fedora runs on anything that uses a 16-bit int.
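As a rough way to see which C type would satisfy a "prefer int, fall back to long" rule on a given machine, the sizes can be inspected with `ctypes`. This is a sketch of the preference as described in the comment above, not a copy of FreeType's configuration header; the function name `pick_32bit_unsigned` is hypothetical.

```python
import ctypes

def pick_32bit_unsigned():
    """Return the name of the first candidate C type that is exactly 4 bytes wide.

    Hypothetical helper: tries unsigned int first, then unsigned long.
    """
    candidates = [("unsigned int", ctypes.c_uint), ("unsigned long", ctypes.c_ulong)]
    for name, ctype in candidates:
        if ctypes.sizeof(ctype) == 4:
            return name
    return None  # neither type is 32 bits wide on this platform

print("unsigned int :", ctypes.sizeof(ctypes.c_uint), "bytes")
print("unsigned long:", ctypes.sizeof(ctypes.c_ulong), "bytes")
print("32-bit pick  :", pick_32bit_unsigned())
```

Running this on each target arch shows whether `unsigned int` and `unsigned long` differ in width there, which is the condition under which mixing the two in a format string becomes a problem.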

QuLogic · 2017-01-10T03:52:08Z

@AdamWill BTW, that build is with beta4; you should try with rc2. I believe there should be fixes for several of those size warnings.

AdamWill · 2017-01-10T07:16:20Z

@QuLogic yeah, I'm building beta4 for now just because the package spec has several patches applied that don't apply cleanly to rc2.

#7781 cuts the failure count down to 38, for me.

QuLogic approved these changes Jan 9, 2017

QuLogic added this to the 2.0 (style change major release) milestone Jan 9, 2017

dopplershift approved these changes Jan 9, 2017

dopplershift merged commit b0e4b67 into matplotlib:master Jan 9, 2017

dopplershift added a commit that referenced this pull request Jan 9, 2017

Merge pull request #7768 from AdamWill/charindex-type

bffe631

Convert unicode index to long, not int, in get_char_index

AdamWill mentioned this pull request Jan 10, 2017

Colorbars contain no colors when created on ppc64 (big-endian) #7788

Closed
