Title Wrong header encoding handling in
Priority urgent Status resolved
Superseder Nosy List ezio.melotti
Assigned To ezio.melotti Topics

Created on 2018-10-10.00:38:44 by ezio.melotti, last changed 2018-10-10.01:08:25 by ezio.melotti.

msg3549 (view) Author: ezio.melotti Date: 2018-10-10.00:38:43
def _decode_header_to_utf8(self, hdr):
    l = []
    for part, encoding in decode_header(hdr):
        if encoding:
            part = part.decode(encoding)
        l.append(part)
    return ''.join([s.encode('utf-8') for s in l])

If the encoding is specified, l becomes a list of unicode strings that can be encoded in the listcomp. If the encoding is not specified, l becomes a list of byte strings, and calling encode('utf-8') on a byte string makes Python 2 first decode it implicitly as ASCII, which fails with a UnicodeDecodeError if the string contains non-ASCII characters.
The latter causes a lot of decoding errors that get reported to the admins, because of all the spam messages (apparently with no encoding specified) that get sent to b.p.o.

I'm going to fix this by attempting the decoding of the part using utf-8 and falling back to iso-8859-1 in case of error.  This will ensure that l is a list of unicode strings that can be encoded.  This will also stop the decoding errors in the listcomp, and let the spam messages through, hopefully to be blocked shortly after when Roundup figures out the user is not registered.
msg3550 (view) Author: ezio.melotti Date: 2018-10-10.00:55:02
Fixed in
msg3551 (view) Author: ezio.melotti Date: 2018-10-10.01:08:25
Reported upstream at
Date                 User          Action  Args
2018-10-10 01:08:25  ezio.melotti  set     messages: + msg3551
2018-10-10 00:55:02  ezio.melotti  set     status: in-progress -> resolved; messages: + msg3550
2018-10-10 00:38:44  ezio.melotti  create