Reading emails with imaplib - "Got more than 10000 bytes" error

I'm trying to connect to my gmail account with imaplib:

import imaplib
mail = imaplib.IMAP4_SSL('imap.gmail.com')
mail.login('[email protected]', 'mypassword')
mail.select("inbox")
# returns ('OK', [b'12009'])

This all seems to work nicely, however:

mail.search(None, "ALL")
# returns error: command: SEARCH => got more than 10000 bytes
mail.logout()
# returns ('NO',
# ["<class 'imaplib.IMAP4.error'>: command: LOGOUT => got more than 10000 bytes"])

The account I'm trying to access has about 9,000 emails in the inbox. I tried the above with another account which has fewer than 1,000 and the code works fine.

Is the issue with the first email account related to the number of mails in it? Is there some default setting that implements some size limit?

How can I get around the error and read my emails?

pandita asked Aug 23 '14 00:08



1 Answer

Is the issue with the first email account related to the number of mails in it?

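Not directly, but yeah, pretty much. The issue is that you're trying to download the whole list of 9000 messages at once: the server sends every matching message number back on a single response line, and that line is longer than imaplib is willing to read.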

Sending ridiculously long lines has been a useful DoS attack and, for programs implemented in C rather than Python, a useful buffer overflow attack against many network clients and servers. It can also be very slow, and choke the network. But notice that the relevant RFC (2683) dates from 1999, and imaplib was written in 1997, so the limits of "ridiculous" may have changed since then.

The right way to solve this, according to RFC 2683, is to not try to do that. (See especially section 3.2.1.5.)


Is there some default setting that implements some size limit?

Yes. It's not listed in the docs, but since the RFC recommends a limit of 8000 bytes, and imaplib allows 10000, I guess that's reasonable.


How can I get around the error and read my emails?

Again, what you should do is break this up into smaller reads.
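One way to do that, as a minimal sketch: since select() already tells you how many messages are in the mailbox, you can search one range of message sequence numbers at a time, so each response line stays short. The helper names sequence_ranges and search_in_chunks and the chunk size of 1000 are made up for illustration, and this assumes the server accepts a sequence set such as 1:1000 as a SEARCH key (RFC 3501 allows it).

```python
def sequence_ranges(total, chunk_size=1000):
    """Yield IMAP sequence-set strings: '1:1000', '1001:2000', ..."""
    for start in range(1, total + 1, chunk_size):
        end = min(start + chunk_size - 1, total)
        yield f"{start}:{end}"


def search_in_chunks(mail, total, chunk_size=1000):
    """Run one SEARCH per range and collect the message numbers.

    `mail` is expected to be a logged-in imaplib connection with a
    mailbox already selected (hypothetical helper, not part of imaplib).
    """
    ids = []
    for seq_range in sequence_ranges(total, chunk_size):
        typ, data = mail.search(None, seq_range)
        if typ != "OK":
            raise RuntimeError(f"SEARCH {seq_range} failed: {data!r}")
        # data[0] is a space-separated bytes string, or empty/None if no hits
        ids.extend((data[0] or b"").split())
    return ids
```

With the connection from the question, usage would look something like typ, data = mail.select("inbox"), then total = int(data[0]), then ids = search_in_chunks(mail, total). At 1000 five-digit numbers per chunk, each response line is around 6000 bytes, which fits under the 10000-byte limit.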

But as long as Gmail has no problem with a search this big, and you're happy to require a computer and network connection a little better than late-90s-standard, you can probably get away with getting around the problem instead.

Fortunately, like many of the modules in the stdlib, imaplib is written as much to be useful sample code as to be used as a module. You can always tell this is the case because the documentation links to the source right at the top.

So, if you take a look, you'll see, not far from the top:

# reading arbitrary length lines. RFC 3501 and 2060 (IMAP 4rev1)
# don't specify a line length. RFC 2683 however suggests limiting client
# command lines to 1000 octets and server command lines to 8000 octets.
# We have selected 10000 for some extra margin and since that is supposedly
# also what UW and Panda IMAP does.
_MAXLINE = 10000

So, if you want to override this, you could fork the module (save imaplib.py as myimaplib.py and use that instead), or you could just monkeypatch it at runtime:

import imaplib
imaplib._MAXLINE = 40000

Of course you'll have to pick a number that you think better reflects the edge of ridiculousness in 2014.
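If you'd rather estimate the number than guess, here is a rough sketch. It assumes an untagged response of the form "* SEARCH 1 2 3 ..." that lists every sequence number from 1 to the mailbox size on one line; estimate_search_all_bytes is a hypothetical helper, and the factor of 2 is just an arbitrary safety margin.

```python
import imaplib


def estimate_search_all_bytes(total):
    """Approximate length of '* SEARCH 1 2 ... total\\r\\n' in bytes."""
    digits = sum(len(str(n)) for n in range(1, total + 1))
    # prefix + one space before each number + the digits + CRLF
    return len("* SEARCH") + total + digits + len("\r\n")


# 12009 is the message count that select() reported in the question.
needed = estimate_search_all_bytes(12009)

# Only ever raise the limit; a newer Python may already ship a larger default.
imaplib._MAXLINE = max(imaplib._MAXLINE, 2 * needed)
```

Under those assumptions the estimate for 12009 messages comes out to roughly 61,000 bytes, so whatever value you pick needs to be comfortably above that for this mailbox.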

abarnert answered Oct 09 '22 09:10
