I would like to know whether an "official" function or library exists in Python for IMAP4 UTF-7 folder path encoding.
From imapInstance.list() I get the following IMAP UTF-7 encoded path:
'(\\HasNoChildren) "." "[Mails].Test&AOk-"',
If I encode the name like this:
(u"[Mails].Testé").encode('utf-7')
I get:
'[Mails].Test+AOk-'
This is standard UTF-7, not IMAP UTF-7: I get Test+AOk- instead of Test&AOk-.
I need an official function or library that produces the IMAP UTF-7 encoded version.
I wrote a very simple IMAP UTF-7 implementation for Python 3 that follows the specification, and it seems to work ("foo\rbar\n\n\n\r\r" and many other round trips, '&BdAF6QXkBdQ-', 'Test&Co', "[Mails].Test&AOk-" and '~peter/mail/&ZeVnLIqe-/&U,BTFw-' behave as expected).
# Works with Python 3
import base64

def b64padanddecode(b):
    """Decode unpadded modified base64 data (UTF-16-BE payload) into a str."""
    b += (-len(b) % 4) * '='  # restore base64 padding ('===' is never valid and is rejected)
    return base64.b64decode(b, altchars=b'+,', validate=True).decode('utf-16-be')

def imaputf7decode(s):
    """Decode a string encoded according to RFC 2060, aka IMAP UTF-7.

    Minimal validation of input; only use it with trusted data."""
    lst = s.split('&')
    out = lst[0]
    for e in lst[1:]:
        u, a = e.split('-', 1)  # u: base64 between & and the first -, a: ASCII chars following it
        if u == '':
            out += '&'  # "&-" is the escape for a literal "&"
        else:
            out += b64padanddecode(u)
        out += a
    return out

def b64shift(unipart):
    """Encode a run of non-ASCII characters as &<modified base64>-."""
    # Modified base64 uses ',' instead of '/' and drops the '=' padding.
    b = base64.b64encode(unipart.encode('utf-16-be'), altchars=b'+,')
    return '&' + b.decode('ascii').rstrip('=') + '-'

def imaputf7encode(s):
    """Encode a string into RFC 2060, aka IMAP UTF-7."""
    s = s.replace('&', '&-')
    unipart = out = ''
    for c in s:
        if 0x20 <= ord(c) <= 0x7e:  # printable ASCII is copied as-is (0x7f, DEL, is not printable)
            if unipart != '':
                out += b64shift(unipart)
                unipart = ''
            out += c
        else:
            unipart += c
    if unipart != '':
        out += b64shift(unipart)
    return out
Given the simplicity of this code, I am placing it in the public domain, so feel free to use it however you like.
The IMAPClient package has functionality for encoding and decoding IMAP's modified UTF-7. Have a look at the imapclient.imap_utf7 module. The module can be used standalone, or you can just use IMAPClient, which transparently handles the encoding and decoding of folder names.
The project's home page is: https://github.com/mjs/imapclient
Example code:
from imapclient import imap_utf7
decoded = imap_utf7.decode(b'&BdAF6QXkBdQ-')
Disclaimer: I'm the original author of the IMAPClient package.
The imapclient implementation is kind of broken though:
x = "foo\rbar\n\n\n\r\r"
imap_utf7.decode(imap_utf7.encode(x))
Result:
>> 'foo&bar\n\n\r-'
Edit:
After some research I found an implementation in MailPile which does not fail at roundtrip encoding on this test. I also ported it to Python3 if you're interested: https://github.com/MarechJ/py3_imap_utf7
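To compare implementations on round trips like the one that fails above, a small harness can help. This is only a minimal sketch: `failing_roundtrips`, `SAMPLES`, `standin_encode` and `standin_decode` are names made up for illustration, and Python's built-in standard `utf-7` codec (not the IMAP variant) is used purely as a stand-in so the snippet runs without third-party packages.

```python
def failing_roundtrips(encode, decode, samples):
    """Return (sample, result) pairs where decode(encode(sample)) != sample.

    If either call raises, the exception object is recorded as the result.
    """
    failures = []
    for sample in samples:
        try:
            result = decode(encode(sample))
        except Exception as exc:  # a crash counts as a failed round trip
            result = exc
        if result != sample:
            failures.append((sample, result))
    return failures


# Test strings taken from this thread.
SAMPLES = ["foo\rbar\n\n\n\r\r", "[Mails].Testé", "Test&Co", "привет"]


# Stand-in codec (an assumption for this sketch): standard UTF-7, NOT IMAP UTF-7.
# Replace these two functions with the encode/decode pair you want to check.
def standin_encode(text):
    return text.encode("utf-7")


def standin_decode(data):
    return data.decode("utf-7")


# Print every sample that does not survive the round trip.
for sample, result in failing_roundtrips(standin_encode, standin_decode, SAMPLES):
    print("round trip failed:", repr(sample), "->", repr(result))
```

Passing the `encode` and `decode` functions of the library under test instead of the stand-ins should list exactly the inputs that do not survive, if any.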
You may use the imap_tools package: https://pypi.org/project/imap-tools/
from imap_tools.imap_utf7 import encode, decode
print(encode('привет'))
# b'&BD8EQAQ4BDIENQRC-'
print(decode(b'&BD8EQAQ4BDIENQRC-'))
# привет
print(repr(decode(encode("foo\rbar\n\n\n\r\r"))))
# 'foo\rbar\n\n\n\r\r'
Disclaimer: I am the author of this library.