Allow different encodings for reading JCAMP files#101
Conversation
| # JCAMP-DX functions | ||
|
|
||
| def read_jcamp(filename): | ||
| def read_jcamp(filename, encodings=[locale.getpreferredencoding()]): |
There was a problem hiding this comment.
I think it would be preferable to use a single encoding here. If users want to try multiple encoding they can create a loop over read_jcamp with a check for a UnicodeDecodeError.
| warn("Unable to correctly parse line:" + line) | ||
| for N, encoding in enumerate(encodings): | ||
| try: | ||
| with open(filename, 'r', encoding = encoding) as f: |
There was a problem hiding this comment.
In Python 2 encoding is not an argument in the buildin open command. io.open does support this argument.
|
@andreasdoll I made two suggestions on how I think this should be implemented. Let me know if you need any help with this. |
Allow to specify an encoding for reading JCAMP files. Keep the current behaviour by choosing the default encoding if no encoding is specified. Use io.open() to read a JCAMP file to guarantee backwards compatibility with Python 2.
|
Thanks @jjhelmus, I wasn't aware that encodings in open() is not backwards compatible. I've modified that commit according to your wishes and added the previously forgotten parameter docstring. |
|
LGTM, thanks @andreasdoll |
Allow to provide a list of possible encodings for reading JCAMP files,
and use the first encoding which doesn't raise an UnicodeDecodeError.
Suppress such errors, except when all provided encodings fail.
Keep the current behaviour by choosing the default encoding if no
encoding is specified.
I'm not sure if you want to pull this, but for my purposes this is easier and more futureproof than converting the JCAMP files on operating system level.