Commit 861706f

fix for bug reported by ToR (unknown charset 'utf-8, text/html')
1 parent c7c84c3 commit 861706f

2 files changed

Lines changed: 7 additions & 3 deletions

doc/THANKS

Lines changed: 3 additions & 0 deletions
@@ -422,6 +422,9 @@ Stuffe <[email protected]>
 
 for suggesting some features
 
+
+for reporting a minor bug
+
 == Organizations ==
 
 Black Hat team <[email protected]>

lib/request/basic.py

Lines changed: 4 additions & 3 deletions
@@ -81,8 +81,9 @@ def checkCharEncoding(encoding):
     #http://www.destructor.de/charsets/index.htm
     translate = { 'windows-874':'iso-8859-11' }
 
-    if ';' in encoding:
-        encoding = encoding[:encoding.find(';')]
+    for delimiter in (';', ','):
+        if delimiter in encoding:
+            encoding = encoding[:encoding.find(delimiter)]
 
     # http://philip.html5.org/data/charsets-2.html
     if encoding in translate:
@@ -97,9 +98,9 @@ def checkCharEncoding(encoding):
     except LookupError:
         warnMsg = "unknown charset '%s'. " % encoding
         warnMsg += "Please report by e-mail to [email protected]."
-
         logger.warn(warnMsg)
         encoding = conf.dataEncoding
+
     return encoding
 
 def decodePage(page, contentEncoding, contentType):
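
The effect of the delimiter loop can be tried in isolation. The following is a minimal sketch, not sqlmap code: the helper names `trim_charset` and `is_known_charset` are hypothetical, and the sample value is the one quoted in the commit message. It assumes only the standard `codecs` module, which `checkCharEncoding` already relies on for the `LookupError` check.

```python
import codecs


def trim_charset(encoding, delimiters=(';', ',')):
    """Cut the value at each delimiter in turn, as the new loop does.

    Hypothetical stand-alone helper; it mirrors the loop added in this
    commit but is not part of sqlmap.
    """
    for delimiter in delimiters:
        if delimiter in encoding:
            encoding = encoding[:encoding.find(delimiter)]
    return encoding


def is_known_charset(encoding):
    """Return True if Python's codec registry recognises the name."""
    try:
        codecs.lookup(encoding)
        return True
    except LookupError:
        return False


if __name__ == "__main__":
    reported = "utf-8, text/html"  # value from the bug report
    print(is_known_charset(reported))                # lookup on the raw value
    print(trim_charset(reported))                    # value after trimming
    print(is_known_charset(trim_charset(reported)))  # lookup after trimming
```

Like the patched code, the sketch does not strip whitespace, so a value such as `utf-8 ; x` would keep its trailing space after trimming.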
