Handle dvi font names as ASCII bytestrings #6977

Merged: 24 commits, Feb 26, 2017

Showing changes from 1 commit. Commits in this pull request:
705b021
Handle dvi font names as ASCII bytestrings
jkseppan Aug 25, 2016
dbc8b9e
Test that the KeyError is raised when the font is missing
jkseppan Aug 25, 2016
93fad55
Mention bytestrings in docstring
jkseppan Aug 25, 2016
4874e4e
Add a helpful note when raising KeyError from dviread.PsFonts
jkseppan Aug 25, 2016
a130ba7
Attempted fix for Python 3.4 compatibility
jkseppan Aug 25, 2016
0f0e41a
More python 3.4 compatibility
jkseppan Aug 26, 2016
a7b5772
Use numpydoc format for several dviread docstrings
jkseppan Dec 27, 2016
803a96e
Remove useless docstring
jkseppan Dec 27, 2016
ec5d80e
Raise a more useful exception
jkseppan Dec 27, 2016
fe52808
Remove misleading parentheses from assert
jkseppan Dec 27, 2016
aa8c4f6
Simplify parsing with regular expressions
jkseppan Dec 27, 2016
9de07aa
Perhaps simplify further with regular expressions
jkseppan Dec 27, 2016
c87b653
Remove useless assert
jkseppan Dec 29, 2016
2e19a61
Fix dvi font name handling in pdf backend
jkseppan Dec 31, 2016
119934a
Separate the handling of dvi fonts in the pdf backend
jkseppan Jan 1, 2017
8fa303f
Simplify enc file parsing
jkseppan Jan 2, 2017
94587b1
Small changes in response to code review
jkseppan Jan 3, 2017
254e3df
Simplify psfonts.map parsing further
jkseppan Jan 3, 2017
a8674b3
Try to fix the KeyError test
jkseppan Jan 29, 2017
25a8fed
ENH: make texFontMap a property
tacaswell Feb 11, 2017
92e2c52
Merge pull request #6 from tacaswell/dvi-ascii
jkseppan Feb 12, 2017
5ba21b0
Use file system encoding for the psfonts file name
jkseppan Feb 12, 2017
10135bf
Document minor API changes
jkseppan Feb 12, 2017
6de9813
Explain named group ordering
jkseppan Feb 12, 2017
Small changes in response to code review
Improve a docstring, remove unneeded parens from an assert,
open a file as binary instead of encoding each line read from it,
don't call six.b on variable strings, simplify string handling,
improve the formatting of a matplotlib.verbose.report call.
jkseppan committed Jan 29, 2017
commit 94587b1b8ea7c93f468675efba2c1c8e5d7709d1
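
To illustrate the "open a file as binary instead of encoding each line" change described in the commit message, here is a minimal sketch. It is not the code from this pull request: the helper name `read_noncomment_lines` and the sample map line are made up for the example. It only shows that iterating over a file opened with `'rb'` already yields `bytes`, so no per-line conversion is needed.

```python
import os
import tempfile


def read_noncomment_lines(path):
    """Yield stripped, non-empty, non-comment lines of `path` as bytes.

    Hypothetical helper for illustration only.
    """
    # Binary mode: every line is already a bytes object.
    with open(path, 'rb') as f:
        for line in f:
            line = line.strip()
            # Skip blank lines and TeX-style '%' comment lines.
            if line == b'' or line.startswith(b'%'):
                continue
            yield line


# Write a tiny made-up map file, read it back, then clean up.
with tempfile.NamedTemporaryFile('wb', suffix='.map', delete=False) as tmp:
    tmp.write(b"% a comment\n\ncmr10 CMR10 <cmr10.pfb\n")
    path = tmp.name
try:
    lines = list(read_noncomment_lines(path))
finally:
    os.remove(path)
```

Because the lines are bytes from the start, comparisons such as `line.startswith(b'%')` work the same way on Python 2 and Python 3.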
54 changes: 24 additions & 30 deletions lib/matplotlib/dviread.py
@@ -747,14 +747,10 @@ class Tfm(object):
         Used for verifying against the dvi file.
     design_size : int
         Design size of the font (unknown units)
-    width : dict
-        Width of each character, needs to be scaled by the factor
-        specified in the dvi file. This is a dict because indexing may
+    width, height, depth : dict
+        Dimensions of each character, need to be scaled by the factor
+        specified in the dvi file. These are dicts because indexing may
         not start from 0.
-    height : dict
-        Height of each character.
-    depth : dict
-        Depth of each character.
     """
     __slots__ = ('checksum', 'design_size', 'width', 'height', 'depth')

@@ -844,25 +840,25 @@ def __init__(self, filename):
         self._filename = filename
         if six.PY3 and isinstance(filename, bytes):
             self._filename = filename.decode('ascii', errors='replace')

Member:
Why ascii instead of utf-8 or the system encoding?

Member Author:
I suppose the system encoding is more correct, but conversions like that make me somewhat wary. It's not really enough to specify UTF-8, you have to know which representation to choose for characters where you have a choice. (For example, the Wikipedia page on HFS+: "File and folder names in HFS Plus are [...] normalized to a form very nearly the same as Unicode Normalization Form D (NFD)". At least at one time the Linux HFS+ implementation didn't follow this.)

Member:
The correct encoding depends on where the bytestring originates. If it's out of a TeX file, I wouldn't be surprised if ASCII were good enough considering the esoteric requirements like fitting in 8 characters.

If it's something the user supplies, then there's really no good default and they really should have done it themselves. If the "user" is us, then we really need to fix that end instead.

Member Author:
I think the only current users in our code are the PDF backend and the text2path code, both of which just pass in the location of "pdftex.map". I initially thought this might need to be made customizable but I've never seen that as a feature request.

-        with open(filename, 'rt') as file:
+        with open(filename, 'rb') as file:
             self._parse(file)

     def __getitem__(self, texname):
-        assert(isinstance(texname, bytes))
+        assert isinstance(texname, bytes)
         try:
             result = self._font[texname]
         except KeyError:
-            matplotlib.verbose.report(textwrap.fill
-                ('A PostScript file for the font whose TeX name is "%s" '
-                 'could not be found in the file "%s". The dviread module '
-                 'can only handle fonts that have an associated PostScript '
-                 'font file. '
-                 'This problem can often be solved by installing '
-                 'a suitable PostScript font package in your (TeX) '
-                 'package manager.' % (texname.decode('ascii'),
-                                       self._filename),
-                 break_on_hyphens=False, break_long_words=False),
-                'helpful')
+            fmt = ('A PostScript file for the font whose TeX name is "{0}" '
+                   'could not be found in the file "{1}". The dviread module '
+                   'can only handle fonts that have an associated PostScript '
+                   'font file. '
+                   'This problem can often be solved by installing '
+                   'a suitable PostScript font package in your (TeX) '
+                   'package manager.')
+            msg = fmt.format(texname.decode('ascii'), self._filename)
+            msg = textwrap.fill(msg, break_on_hyphens=False,
+                                break_long_words=False)
+            matplotlib.verbose.report(msg, 'helpful')
             raise
         fn, enc = result.filename, result.encoding
         if fn is not None and not fn.startswith(b'/'):
@@ -873,7 +869,6 @@ def __getitem__(self, texname):

     def _parse(self, file):
         for line in file:
-            line = six.b(line)
             line = line.strip()
             if line == b'' or line.startswith(b'%'):
                 continue
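
The reworked error message in `__getitem__` builds the text in three steps: format, wrap, report. Below is a standalone sketch of the first two steps using only the standard library. The function name `missing_font_message`, the shortened message, and the sample path are assumptions for illustration, and the matplotlib reporting call is left out.

```python
import textwrap


def missing_font_message(texname, map_filename):
    """Build a wrapped message for a font missing from a map file.

    Hypothetical helper; `texname` is an ASCII bytestring.
    """
    fmt = ('A PostScript file for the font whose TeX name is "{0}" '
           'could not be found in the file "{1}".')
    # Decode the bytes name only at the point of display.
    msg = fmt.format(texname.decode('ascii'), map_filename)
    # Keep hyphenated font names and long paths in one piece.
    return textwrap.fill(msg, break_on_hyphens=False,
                         break_long_words=False)


# Made-up font name and path, chosen to exercise the wrapping options.
sample_path = "/usr/share/texmf/fonts/map/pdftex/updmap/pdftex.map"
msg = missing_font_message(b"no-such-font", sample_path)
```

Separating the formatting from the reporting call keeps each step on its own line, which is the readability gain the commit message refers to.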
@@ -979,21 +974,20 @@ def __iter__(self):
     def _parse(self, file):
         result = []

-        lines = (line[:line.find(b'%')] if b'%' in line else line.strip()
-                 for line in file)
+        lines = (line.split(b'%', 1)[0].strip() for line in file)
         data = b''.join(lines)
-        match = re.search(six.b(r'\['), data)
-        if not match:
+        beginning = data.find(b'[')
+        if beginning < 0:
             raise ValueError("Cannot locate beginning of encoding in {}"
                              .format(file))
-        data = data[match.span()[1]:]
-        match = re.search(six.b(r'\]'), data)
-        if not match:
+        data = data[beginning:]
+        end = data.find(b']')

Member:
should this be an rfind?

Member:
nevermind 🐑

+        if end < 0:
             raise ValueError("Cannot locate end of encoding in {}"
                              .format(file))
-        data = data[:match.span()[0]]
+        data = data[:end]

-        return re.findall(six.b(r'/([^][{}<>\s]+)'), data)
+        return re.findall(br'/([^][{}<>\s]+)', data)


 def find_tex_file(filename, format=None):
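
The `find` versus `rfind` question from the review thread only matters when the data contains more than one `]`. The sketch below uses made-up input and a hypothetical function `extract_names` (with a `use_rfind` switch added purely for comparison) to show how the two choices differ. It borrows the bracket search and the name pattern from the diff but is not the `Encoding._parse` method itself.

```python
import re


def extract_names(data, use_rfind=False):
    """Return the /names found between '[' and a closing ']'.

    Hypothetical helper; `data` is a bytestring without comments.
    """
    beginning = data.find(b'[')
    if beginning < 0:
        raise ValueError("Cannot locate beginning of encoding")
    data = data[beginning:]
    # find() stops at the first ']', rfind() at the last one.
    end = data.rfind(b']') if use_rfind else data.find(b']')
    if end < 0:
        raise ValueError("Cannot locate end of encoding")
    data = data[:end]
    # Names in this sample are separated by whitespace.
    return re.findall(br'/([^][{}<>\s]+)', data)


# Made-up data with a second bracketed group after the vector.
sample = b"/SampleEncoding [ /.notdef /Gamma /Delta ] def [ /Extra ]"
first_close = extract_names(sample)
last_close = extract_names(sample, use_rfind=True)
```

With `find`, parsing stops at the end of the first bracketed vector; with `rfind`, anything up to the last `]` in the data is scanned for names as well.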
2 changes: 1 addition & 1 deletion lib/matplotlib/tests/test_dviread.py
@@ -61,7 +61,7 @@ def test_dviread():
     with open(os.path.join(dir, 'test.json')) as f:
         correct = json.load(f)
     for entry in correct:
-        entry['text'] = [[a, b, c, six.b(d), e]
+        entry['text'] = [[a, b, c, d.encode('ascii'), e]
                          for [a, b, c, d, e] in entry['text']]
     with dr.Dvi(os.path.join(dir, 'test.dvi'), None) as dvi:
         data = [{'text': [[t.x, t.y,
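
The test change swaps `six.b(d)` for `d.encode('ascii')`. According to the six documentation, `six.b` is meant for string literals and encodes with Latin-1 on Python 3, so it silently accepts characters that are not ASCII. The sketch below does not import six. It compares the two encodings directly, and the helper name `to_ascii_bytes` and the sample strings are made up.

```python
def to_ascii_bytes(text):
    """Encode `text` as ASCII bytes, failing loudly on anything else.

    Hypothetical helper for illustration only.
    """
    return text.encode('ascii')


# A plain font name converts cleanly.
font = to_ascii_bytes("cmr10")

# A non-ASCII character is rejected by the ASCII codec...
try:
    to_ascii_bytes("cmr10\u00e9")
except UnicodeEncodeError:
    rejected = True
else:
    rejected = False

# ...whereas Latin-1 encoding accepts it without complaint.
latin1 = "cmr10\u00e9".encode("latin-1")
```

An explicit `.encode('ascii')` therefore documents the ASCII assumption stated in the pull request title and turns a violation into an immediate error.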