[MRG+1]: Allow unicode in code and outputs #106

larsoner · 2016-03-14T01:47:55Z

Previously, our code had unicode in the stdout which gave:

../examples/preprocessing/plot_maxwell_filter.py is not compiling:
Traceback (most recent call last):
  File "/home/larsoner/custombuilds/sphinx-gallery/sphinx_gallery/gen_rst.py", line 474, in execute_script
    my_stdout = my_buffer.getvalue().strip().expandtabs()
  File "/usr/lib/python2.7/StringIO.py", line 271, in getvalue
    self.buf += ''.join(self.buflist)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xce in position 36: ordinal not in range(128)
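
For readers who want to reproduce the failure in isolation, here is a minimal sketch. It rebuilds the log line quoted later in this thread by hand (the 8-space indent is an assumption) and shows that the first byte of μ in UTF-8 is 0xce, which the ASCII codec rejects. It is written for Python 3 so it can be run today; it is not the code path in gen_rst.py.

```python
# Rebuild the log message by hand; the 8-space indent is an assumption.
message = u" " * 8 + u"Adjusted coil positions by (μ ± σ)"
raw = message.encode("utf-8")

# Decoding the UTF-8 bytes as ASCII fails at the first non-ASCII byte.
try:
    raw.decode("ascii")
    error = None
except UnicodeDecodeError as exc:
    error = exc

# Decoding with the codec that was used for encoding round-trips cleanly.
decoded = raw.decode("utf-8")

if error is not None:
    print("ascii failed at byte offset", error.start)
print(decoded)
```

On Python 2 the same implicit ASCII decode is triggered when byte strings and unicode strings are joined in one buffer, which is what the traceback above shows.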

Modified a test that fails on master and passes on this PR.

Works on my system (™) on Py3k and Python 2.7.

Closes #18.
Closes #19.

larsoner · 2016-03-14T02:47:15Z

With the latest commit I get this:

        Adjusted coil positions by (μ ± σ): 1.2° ± 1.5° (max: 6.7°)
Traceback (most recent call last):
  File "/usr/lib/python2.7/logging/__init__.py", line 882, in emit
    stream.write(fs % msg.encode("UTF-8"))
UnicodeDecodeError: 'ascii' codec can't decode byte 0xce in position 36: ordinal not in range(128)
Logged from file maxwell.py, line 1698

And the example continues to run, but I still have some issues :(
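
A minimal sketch of logging a non-ASCII message into a text buffer, assuming Python 3 and a hypothetical logger name (`unicode_demo`). It only illustrates that a unicode-aware stream receives the message intact; it does not reproduce the Python 2 handler behaviour in the traceback above.

```python
import io
import logging

# A text (unicode) buffer standing in for the captured output stream.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(message)s"))

# "unicode_demo" is a hypothetical logger name used only for this sketch.
logger = logging.getLogger("unicode_demo")
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False  # keep the demo output out of the root logger

logger.info(u"Adjusted coil positions by (μ ± σ): 1.2° ± 1.5° (max: 6.7°)")
handler.flush()
logged = stream.getvalue()
```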

Titan-C · 2016-03-14T09:09:20Z

sphinx_gallery/gen_rst.py

@@ -124,7 +125,7 @@ def flush(self):
 """


-CODE_OUTPUT = """.. rst-class:: sphx-glr-script-out
+CODE_OUTPUT = u""".. rst-class:: sphx-glr-script-out


We started using this in our files, so we don't miss a string:

from __future__ import unicode_literals

I'm not quite sure what you mean. You mean in the modules you work on that use sphinx-gallery you've started putting in a bunch of those lines? Or you've started putting them in the sphinx-gallery code itself? So far we haven't needed to use them to have unicode work properly. Does it make them work better with sphinx-gallery, or is there some other advantage?

unicode_literals removes the need to prefix every string with u; it makes every string literal unicode. My first idea was to use that instead of going through each string and adding the u prefix. But it is not perfect, and I could not get sphinx-gallery to run my examples with unicode_literals in gen_rst.py.

Did you try putting it in while on this branch, or on a previous one? It might work okay with this one since the unicode reading is a bit more unified.

(I put in the __future__ line, removed all u' and u" and it seems to work fine)
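
As an illustration of why the template has to be a unicode string, here is a minimal sketch. Only the first line of the template appears in the diff above; the rest of the template and the `format_output` helper are assumptions made for this example.

```python
# Only the first line comes from the diff; the remaining lines of the
# template are an assumption made for illustration.
CODE_OUTPUT = u""".. rst-class:: sphx-glr-script-out

 Out::

{0}
"""


def format_output(stdout_text):
    """Indent captured stdout and place it in the template (hypothetical helper)."""
    indented = u"\n".join(u"    " + line for line in stdout_text.splitlines())
    return CODE_OUTPUT.format(indented)


rst = format_output(u"μ ± σ: 1.2° ± 1.5°")
print(rst)
```

With a unicode template, formatting non-ASCII output never has to pass through an implicit ASCII conversion, whether the u prefix or unicode_literals is used to get there.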

Titan-C · 2016-03-14T09:18:58Z

I still have an old PR on unicode, #19. What I would rescue from it now is that files are opened with the Python codecs module instead of being loaded as binary:

with codecs.open(filename, 'w', 'utf8') as file_content:
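
A minimal round-trip sketch of that pattern, assuming a temporary file and a made-up file name. It writes unicode text through `codecs.open`, reads it back, and checks the bytes on disk.

```python
import codecs
import os
import tempfile

text = u"Adjusted coil positions by (μ ± σ)\n"

# Temporary location and file name are made up for this sketch.
tmp_dir = tempfile.mkdtemp()
filename = os.path.join(tmp_dir, "example.rst")

# Write unicode text; codecs encodes it to UTF-8 bytes on the way out.
with codecs.open(filename, "w", "utf-8") as file_content:
    file_content.write(text)

# Read it back; codecs decodes the bytes to unicode on the way in.
with codecs.open(filename, "r", "utf-8") as file_content:
    round_trip = file_content.read()

# The bytes on disk are plain UTF-8.
with open(filename, "rb") as raw_file:
    raw_bytes = raw_file.read()
```

On Python 3 the built-in `open(filename, encoding="utf-8")` does the same job; `codecs.open` has the advantage of behaving the same way on Python 2.7.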

larsoner · 2016-03-14T13:31:50Z

Nice, I'll try swapping that in

Titan-C · 2016-03-14T13:38:41Z

I tested this on my personal examples in Python 2. It is actually a pain to get unicode to work every time, because it can fail when docstrings are extracted, while executing, or when writing the rst file. And I'm not entirely sure how to work around this.

I also noticed that it makes a huge difference for the test case whether one loads the file from disk or gets the input from within the test script, as it does now.

larsoner · 2016-03-14T13:47:25Z

I'm not entirely sure how to work around this.

I think we need to make sure unicode is used properly everywhere. It's a bit annoying to get right but it should be possible.

I also noticed that it makes a huge difference for the test case whether one loads the file from disk or gets the input from within the test script, as it does now.

So I should add another test, then, I take it?

larsoner · 2016-03-14T13:57:39Z

I also noticed that it makes a huge difference for the test case whether one loads the file from disk or gets the input from within the test script, as it does now.

I don't quite get this actually. The test script writes a file to disk, and tests it. Shouldn't that cover the use case you mention? If not, can you make a small test that fails, and I can work on fixing it?

larsoner · 2016-03-14T15:43:25Z

Okay @Titan-C I switched to using codecs where possible. Tests pass over here on 2.7 and 3.4, and things render properly for our repo. Ready to go from my end.

If you see degenerate cases, could you try to turn them into failing tests, and either post them here or open a PR into my branch? Without seeing those bits of code, it will be hard for me to get it right.

agramfort · 2016-03-14T16:50:43Z

one travis build is not happy

larsoner · 2016-03-14T16:51:32Z

Yeah I saw that, but it looks unrelated...?

larsoner · 2016-03-14T16:51:51Z

(Some error with Pygments)

larsoner · 2016-03-16T14:45:40Z

@Titan-C the example renders fine now, see what you think

larsoner · 2016-03-16T14:46:18Z

examples/plot_quantum.py

+print('pass')
+
+###############################################################################
+# And then:


@Titan-C if you know something more intelligent to put here let me know :)

something very smart from my notes

# -*- coding: utf-8 -*-
r"""
=================================================
Some Quantum Mechanics, filling an atomic orbital
=================================================

Considering an atomic single orbital and how to fill it by use of the
chemical potential. This system has a four element basis, :math:`B =
\{ \lvert \emptyset \rangle, \lvert \uparrow \rangle, \lvert \downarrow \rangle, \lvert \uparrow\downarrow \rangle \}`,
that is the empty orbital, one spin up electron, one spin down electron
and the filled orbital.

The environment of the orbital is set up by an energy cost for
occupying the orbital, that is :math:`\epsilon` and when both electrons
meet a contact interaction corresponding to the Coulomb repulsion
:math:`U`. Finally the chemical potential :math:`\mu` is what allows in
the Grand canonical picture, to fill up our atomic orbital from a
reservoir of electrons.

The simple Hamiltonian to model this system is given by:

.. math::

   \mathcal{H} = \sum_{\sigma=\uparrow,\downarrow} \epsilon c^\dagger_\sigma c_\sigma + Un_\uparrow n_\downarrow - \mu \hat{N}

Here :math:`c^\dagger,c` are creation and annihilation operators,
:math:`n=c^\dagger c`, and :math:`\hat{N}=n_\uparrow+n_\downarrow`.
This Hamiltonian is diagonal in the basis of particle number we have
chosen earlier, as the basis elements are also eigenvectors.

.. math::

   \mathcal{H} \lvert \emptyset \rangle &= 0 \\
   \mathcal{H} \lvert \uparrow \rangle &= (\epsilon - \mu) | \uparrow \rangle \\
   \mathcal{H} \lvert \downarrow \rangle &= (\epsilon - \mu) | \downarrow \rangle \\
   \mathcal{H} \lvert \uparrow\downarrow \rangle &= (2\epsilon - 2\mu +U) \lvert \uparrow\downarrow \rangle

It is easy to see that the system will prefer to be empty if
:math:`\mu \in [0,\epsilon)`, be single occupied if
:math:`\mu \in (\epsilon, \epsilon +U)` and doubly occupied if
:math:`\mu > \epsilon +U`.

For a more rigorous treatment, the partition function has to be
calculated and then the expected particle number can be found.
Introducing a new variable :math:`\xi = \epsilon - \mu`, and
:math:`\beta` corresponding to the inverse temperature of the system.

.. math::

   \mathcal{Z} &= Tr(e^{-\beta \mathcal{H}}) = 1 + 2e^{-\beta\xi} + e^{-\beta(2\xi + U)} \\
   \langle \hat{N} \rangle &= \frac{1}{\beta} \frac{\partial}{\partial \mu} \ln \mathcal{Z}
"""
import matplotlib.pylab as plt
import numpy as np

mu = np.linspace(0, 3, 800)
for b in [10, 20, 30]:
    n = 2 * (np.exp(b * (mu - 1)) + np.exp(b * (2 * mu - 3))) / \
        (1 + np.exp(b * (mu - 1)) * (2 + np.exp(b * (mu - 2))))
    plt.plot(mu, n, label=r"$\beta={}$".format(b))
plt.xlabel(r'$\mu$ ($\epsilon=1$, $U=1$)')
plt.ylabel(r'$\langle N \rangle=\langle n_\uparrow \rangle+\langle n_\downarrow\rangle$')
plt.legend(loc=0)
plt.show()
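
The closed-form expression in the plotting loop can be cross-checked against the docstring's definition of the particle number. The sketch below assumes ε = 1 and U = 1 (as in the axis label) and compares the closed form with a central finite difference of ln Z. The function names are made up for this check, and it uses only the standard library.

```python
import math


def log_partition(mu, beta, eps=1.0, U=1.0):
    """ln Z with Z = 1 + 2 exp(-beta*xi) + exp(-beta*(2*xi + U)), xi = eps - mu."""
    xi = eps - mu
    return math.log(1 + 2 * math.exp(-beta * xi) + math.exp(-beta * (2 * xi + U)))


def occupation_closed_form(mu, beta):
    """Expression used in the plotting loop (eps = 1, U = 1)."""
    num = 2 * (math.exp(beta * (mu - 1)) + math.exp(beta * (2 * mu - 3)))
    den = 1 + math.exp(beta * (mu - 1)) * (2 + math.exp(beta * (mu - 2)))
    return num / den


def occupation_numeric(mu, beta, h=1e-6):
    """<N> = (1/beta) d(ln Z)/d(mu), approximated by a central difference."""
    return (log_partition(mu + h, beta) - log_partition(mu - h, beta)) / (2 * h * beta)


for mu in (0.5, 1.0, 1.5, 2.0, 2.5):
    print(mu, occupation_closed_form(mu, 10), occupation_numeric(mu, 10))
```

The two columns should agree to within the finite-difference error, and the occupation should step from about 0 to 1 to 2 as μ crosses ε and ε + U.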

agramfort · 2016-03-16T14:54:22Z

thanks heaps @Eric89GXL !

Titan-C · 2016-03-16T17:08:18Z

sphinx_gallery/gen_rst.py

@@ -527,7 +531,7 @@ def execute_script(code_block, example_globals, image_path, fig_count,

        # Breaks build on first example error

-        if gallery_conf['abort_on_example_error']:
+        if gallery_conf.get('abort_on_example_error', True):


The default is false and is set up in gen_gallery.py

It's not always set -- in our tests, if I had something raise an error at an appropriate time, I got an error here for not having this property

It's not always set -- in our tests

You are right. It is not set in the default dictionary, but in the gallery build configuration. Certainly we want tests to fail immediately, but by default the gallery should continue the build even if some examples fail. I'll have to keep track of this in #97, where I'm writing a helper function for the tests to set defaults.

One thing I do prefer is to have the defaults all in one place and not scattered in the code. For now I think we can leave it like that.

Yeah I agree about keeping them in one place. I'll add a comment that it should be unified if possible.
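
One way to keep the defaults in a single place, sketched under assumptions: `DEFAULT_GALLERY_CONF`, `build_gallery_conf` and `should_abort` are hypothetical names for this example, not the functions in gen_gallery.py or the helper planned in #97.

```python
# Hypothetical single source of defaults; the value False follows the
# comment above that the default is false.
DEFAULT_GALLERY_CONF = {"abort_on_example_error": False}


def build_gallery_conf(user_conf=None):
    """Merge user settings over the defaults (hypothetical helper)."""
    conf = dict(DEFAULT_GALLERY_CONF)
    conf.update(user_conf or {})
    return conf


def should_abort(gallery_conf):
    # Plain indexing is safe once every conf goes through build_gallery_conf,
    # so no per-call-site default is needed.
    return gallery_conf["abort_on_example_error"]


print(should_abort(build_gallery_conf()))
print(should_abort(build_gallery_conf({"abort_on_example_error": True})))
```

Tests that want to fail fast would pass `{"abort_on_example_error": True}` explicitly instead of relying on a second default hidden in a `.get()` call.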

larsoner · 2016-03-16T23:19:19Z

@Titan-C merged your PR, any other comments or does it work correctly for you now?

Titan-C · 2016-03-17T08:42:24Z

It works for my known cases.
+1

Titan-C · 2016-03-17T08:43:53Z

Just came to my mind. Can you put a line about this on the CHANGES.rst file, please.

lesteve · 2016-03-17T10:28:41Z

sphinx_gallery/gen_rst.py

@@ -93,6 +97,18 @@ def flush(self):
        self.file2.flush()


+class MyBytesIO(BytesIO):


Better name for the class ? I don't have any great suggestion I am afraid ...

I couldn't think of a better one either, but if someone else has an idea I'm happy to change it. I used the My prefix because eventually it goes to a variable named my_buffer.

lesteve · 2016-03-17T10:37:42Z

doc/conf.py

@@ -291,7 +291,7 @@ def setup(app):
    # Do not pop up any mayavi windows while running the
    # examples. These are very annoying since they steal the focus.
    mlab.options.offscreen = True
-except ImportError:


Probably this should not be part of this PR. I seem to remember that elsewhere there was an except Exception, since importing some libraries can raise all sorts of exceptions (that's what the comment says).

I can change this directly in master.

I had to have it in order to test on my system, so yes please put it in master, I can open another PR, or just modify it here if you don't mind having the orthogonal change here

lesteve · 2016-03-17T10:49:08Z

My understanding is that your fix requires the example files to be encoded in utf-8. I am wondering how badly this assumption can backfire ...

My gut feeling is that this PR is a real improvement though.

larsoner · 2016-03-17T13:06:28Z

My understanding is that your fix requires the example files to be encoded in utf-8. I am wondering how badly this assumption can backfire ...

Well ASCII or UTF-8, yeah. Previously they had to be ASCII only, so it is an improvement even if it doesn't make it universal.
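
To make the assumption concrete, a minimal sketch: ASCII bytes are valid UTF-8, so old examples keep working, while a file saved in another encoding such as Latin-1 fails to decode. `read_source` is a hypothetical helper for this example, not part of sphinx-gallery.

```python
def read_source(raw_bytes):
    """Decode example source as UTF-8, with a clearer error (hypothetical helper)."""
    try:
        return raw_bytes.decode("utf-8")
    except UnicodeDecodeError as exc:
        raise ValueError(
            "example file is not valid UTF-8 (byte offset %d)" % exc.start)


ascii_source = read_source(b"print('pass')")            # pure ASCII still works
utf8_source = read_source(u"# μ ± σ".encode("utf-8"))   # UTF-8 works

try:
    read_source(u"# caf\xe9".encode("latin-1"))         # Latin-1 bytes do not
    latin1_error = None
except ValueError as exc:
    latin1_error = str(exc)
```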

larsoner · 2016-03-17T13:10:30Z

.travis.yml

@@ -39,7 +39,7 @@ install:
        if [ "$PYTHON_VERSION" == "2.7" ]; then
          conda install --yes --quiet mayavi;
          conda upgrade --yes --all;
-          conda upgrade --yes pyface;
+          pip install --upgrade pyface;


This is also an orthogonal change, but was necessary to make the CIs happy...

larsoner · 2016-03-17T13:14:55Z

Comments addressed

larsoner · 2016-03-17T13:16:55Z

Moved the orthogonal Exception change to #107

lesteve · 2016-03-17T14:45:51Z

sphinx_gallery/gen_rst.py

+        super(MyBytesIO, self).write(data)
+
+    def getvalue(self):
+        return super(MyBytesIO, self).getvalue().decode('utf-8')


This class is actually a bit weird since it derives from BytesIO but .getvalue returns a string.

It seems like this is working for me:

class NonUnicodeFriendlyStringIO(StringIO):
    def write(self, data):
        # Decode byte strings so the buffer only ever holds unicode (Python 2)
        if not isinstance(data, unicode):
            data = data.decode('utf-8')
        super(NonUnicodeFriendlyStringIO, self).write(data)

and then later:

my_buffer = NonUnicodeFriendlyStringIO()
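
For readers on Python 3, a minimal sketch of the same idea: a text buffer that also accepts UTF-8 encoded bytes. The class name `TolerantStringIO` is made up for this example and is not necessarily what was merged.

```python
import io


class TolerantStringIO(io.StringIO):
    """Text buffer that also accepts UTF-8 encoded bytes (hypothetical name)."""

    def write(self, data):
        # Decode bytes up front so the buffer only ever stores text.
        if isinstance(data, bytes):
            data = data.decode("utf-8")
        return super().write(data)


my_buffer = TolerantStringIO()
my_buffer.write(b"\xce\xbc")   # UTF-8 bytes for the Greek letter mu
my_buffer.write(u" ± σ")       # already text
my_stdout = my_buffer.getvalue().strip().expandtabs()
```

Because the buffer derives from a text class and returns text from `getvalue()`, it avoids the mismatch noted above for a BytesIO subclass whose `getvalue()` returns a string.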

The conda virtualenv for Mayavi forces a version of pyface that clashes with sphinx. The manual update of pyface within conda is no longer enough to reach a version that does not clash with sphinx, so the update is forced through pip. Mayavi is an experimentally supported use case of Sphinx-Gallery.

For unicode testing purposes, there was a need for an example that breaks it. Having LaTeX with raw strings and the \u from \uparrow was a good test.

larsoner · 2016-03-17T15:13:38Z

sphinx_gallery/gen_rst.py

@@ -505,7 +527,8 @@ def execute_script(code_block, example_globals, image_path, fig_count,
        fig_count += 1  # raise count to avoid overwriting image

        # Breaks build on first example error
-
+        # XXX This check can break during testing e.g. if you uncomment the
+        # `raise RuntimeError` by the `my_stdout` call, maybe use `.get()`?


@lesteve here you go

Thanks for this, we should probably fix it at one point.

larsoner · 2016-03-17T15:14:06Z

Changed the class and got rid of the .get() with a comment that should help the next dev take a look if they want

lesteve · 2016-03-17T15:30:53Z

OK LGTM, merging.

lesteve · 2016-03-17T15:31:06Z

Thanks a lot for the fix!

larsoner · 2016-03-17T15:39:20Z

Thanks for the quick reviews

agramfort · 2016-03-17T16:00:31Z

🍻 !

larsoner mentioned this pull request Mar 14, 2016

[MRG][FIX] Fixes to examples mne-tools/mne-python#2990

Merged

larsoner changed the title ~~FIX: Allow unicode in code and outputs~~ WIP: Allow unicode in code and outputs Mar 14, 2016

Titan-C reviewed Mar 14, 2016

larsoner force-pushed the unicode branch from f1a98fb to e05b200 Compare March 14, 2016 15:41

larsoner changed the title ~~WIP: Allow unicode in code and outputs~~ MRG: Allow unicode in code and outputs Mar 14, 2016

Titan-C mentioned this pull request Mar 15, 2016

Example that fail unicode larsoner/sphinx-gallery#1

Merged

larsoner reviewed Mar 16, 2016

Titan-C reviewed Mar 16, 2016

Titan-C changed the title ~~MRG: Allow unicode in code and outputs~~ [MRG+1]: Allow unicode in code and outputs Mar 17, 2016

lesteve reviewed Mar 17, 2016

larsoner reviewed Mar 17, 2016

larsoner mentioned this pull request Mar 17, 2016

MRG: Fix exception #107

Merged

lesteve reviewed Mar 17, 2016

larsoner and others added 11 commits March 17, 2016 11:08

FIX: Allow unicode in code and outputs

1a6ac2e

FIX: Fix docstring unicode

a910e55

FIX: Better logging handling

7952f21

FIX: Minor fixes

2b53be7

Quantum mechanics example that fail unicode

7aa20a8

FIX: Fix unicode

78793ca

FIX: Fix comment

4852d90

More content relevant quantum example

e8d9fed

For unicode testing purposes, there was a need for an example that breaks it. Having LaTeX with raw strings and the \u from \uparrow was a good test.

FIX: Address comments

33c1893

FIX: Minor fixes

81bd239

larsoner force-pushed the unicode branch from 2410e90 to 81bd239 Compare March 17, 2016 15:13

larsoner reviewed Mar 17, 2016

lesteve added a commit that referenced this pull request Mar 17, 2016

Merge pull request #106 from Eric89GXL/unicode

e2aaf4c

[MRG+1]: Allow unicode in code and outputs

lesteve merged commit e2aaf4c into sphinx-gallery:master Mar 17, 2016

larsoner deleted the unicode branch March 17, 2016 15:39
