please make get_body_text more robust #1601

@josch

Hi,

especially when encountering malformed spam email, alot keeps quitting on me with tracebacks like this:

  File "/usr/share/alot/alot/widgets/search.py", line 187, in <genexpr>
    lastcontent = ' '.join(m.get_body_text() for m in msgs)
  File "/usr/share/alot/alot/db/message.py", line 287, in get_body_text
    return extract_body_part(self.get_mime_part())
  File "/usr/share/alot/alot/db/utils.py", line 497, in extract_body_part
    rendered_payload = render_part(
  File "/usr/share/alot/alot/db/utils.py", line 345, in render_part
    raw_payload = remove_cte(part)
  File "/usr/share/alot/alot/db/utils.py", line 440, in remove_cte
    bp = base64.b64decode(payload)
  File "/usr/lib/python3.9/base64.py", line 87, in b64decode
    return binascii.a2b_base64(s)
binascii.Error: Incorrect padding

or

  File "/usr/share/alot/alot/widgets/search.py", line 187, in <genexpr>
    lastcontent = ' '.join(m.get_body_text() for m in msgs)
  File "/usr/share/alot/alot/db/message.py", line 287, in get_body_text
    return extract_body_part(self.get_mime_part())
  File "/usr/share/alot/alot/db/utils.py", line 497, in extract_body_part
    rendered_payload = render_part(
  File "/usr/share/alot/alot/db/utils.py", line 345, in render_part
    raw_payload = remove_cte(part)
  File "/usr/share/alot/alot/db/utils.py", line 436, in remove_cte
    bp = quopri.decodestring(payload.encode('ascii'))
UnicodeEncodeError: 'ascii' codec can't encode characters in position 8114-8123: ordinal not in range(128)
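
Both failures can be reproduced with the standard library alone, which suggests they could be handled where the content transfer encoding is removed. The following is only a minimal sketch of tolerant decoding, not alot's actual code: the helper names `tolerant_b64decode` and `tolerant_qp_decode` are hypothetical, and it assumes that silently repairing or replacing bad input is acceptable for display purposes.

```python
import base64
import binascii
import quopri


def tolerant_b64decode(payload: str) -> bytes:
    """Hypothetical helper: decode base64 even if the padding is wrong."""
    # Drop characters that cannot be part of base64 at all, then whitespace.
    data = payload.encode("ascii", errors="ignore")
    data = b"".join(data.split())
    # Remove whatever padding is present and re-pad to a multiple of four.
    data = data.rstrip(b"=")
    data += b"=" * (-len(data) % 4)
    try:
        return base64.b64decode(data)
    except binascii.Error:
        # Still undecodable (e.g. an impossible length): give up quietly.
        return b""


def tolerant_qp_decode(payload: str) -> bytes:
    """Hypothetical helper: decode quoted-printable with stray non-ASCII."""
    # Non-ASCII characters are invalid in quoted-printable; replace each
    # with "?" instead of raising UnicodeEncodeError.
    return quopri.decodestring(payload.encode("ascii", errors="replace"))
```

With these helpers, `tolerant_b64decode("YWJjZA")` decodes the input that `base64.b64decode` rejects with "Incorrect padding", and `tolerant_qp_decode` accepts a payload containing characters outside the ASCII range.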

I'm currently running alot with the following patch:

--- a/alot/db/message.py	2022-04-21 14:03:34.085067550 +0200
+++ b/alot/db/message.py	2022-04-21 12:17:26.415798127 +0200
@@ -284,7 +284,10 @@
 
     def get_body_text(self):
         """ returns bodystring extracted from this mail """
-        return extract_body_part(self.get_mime_part())
+        try:
+            return extract_body_part(self.get_mime_part())
+        except:
+            return "ERROR"
 
     def matches(self, querystring):
         """tests if this messages is in the resultset for `querystring`"""

This replaces the message body with ERROR, which is fine because those messages are spam anyway, and at least alot doesn't quit. If a message makes alot quit, it is quite time-consuming to find the one spam message that tripped it up. With this patch, such messages can be quickly identified and marked as spam. Certainly something more descriptive than ERROR should be returned, maybe even a traceback that helps identify the problem?
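
One possible shape for a more descriptive placeholder is sketched below. It is an illustration under assumptions, not a proposed final patch: `describe_failure` and `safe_body_text` are hypothetical names, the placeholder wording is made up, and the wrapper catches `Exception` rather than using a bare `except:` so that `KeyboardInterrupt` and `SystemExit` still propagate.

```python
import traceback


def describe_failure(exc: BaseException) -> str:
    """Hypothetical helper: one-line summary of the exception for display."""
    # format_exception_only yields e.g. "binascii.Error: Incorrect padding".
    summary = "".join(traceback.format_exception_only(type(exc), exc)).strip()
    return f"[could not render message body: {summary}]"


def safe_body_text(extract, *args):
    """Call extract(*args); on failure return a placeholder, don't crash."""
    try:
        return extract(*args)
    except Exception as exc:
        return describe_failure(exc)
```

In `get_body_text`, this would correspond to something like `safe_body_text(extract_body_part, self.get_mime_part())`. If the full traceback is wanted instead of the one-line summary, `traceback.format_exc()` could be called inside the `except` block.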
