
Parse Gmail with Python and mark all older than date as "read"

Long story short, I created a new Gmail account and linked several other accounts to it (each with thousands of messages), which I am importing. All imported messages arrive as unread, but I need them to appear as read.

I have a little experience with Python, but I've only used the mail and imaplib modules for sending mail, not for processing accounts.

Is there a way to bulk-process all items in an inbox and simply mark messages older than a specified date as read?

Eric asked Aug 18 '09 20:08


3 Answers

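# M is an imaplib.IMAP4 connection that is logged in and has a mailbox selected
typ, data = M.search(None, '(BEFORE 01-Jan-2009)')
for num in data[0].split():
    # Adding the \Seen flag marks the message as read
    M.store(num, '+FLAGS', '\\Seen')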

This is a slight modification of the code on the imaplib doc page for the store method. I found the search criteria in RFC 3501. This should get you started.
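With thousands of messages per account, issuing one STORE per message can be slow. The sketch below is one possible way to batch the same search-and-store idea, since imaplib's store accepts a comma-separated message set. The function names, the `batch_size` value, and the extra `UNSEEN` search key (to skip mail that is already read) are assumptions for illustration and are not part of the answer above. `imap.gmail.com` is Gmail's IMAP host; the credentials are placeholders you supply.

```python
import imaplib


def mark_before_as_read(conn, date_text, batch_size=500):
    """Mark unread messages dated before date_text (e.g. '01-Jan-2009') as read.

    conn must be a logged-in IMAP connection with a mailbox selected.
    Returns the number of messages that were flagged.
    """
    # UNSEEN is an assumption added here so already-read mail is skipped.
    typ, data = conn.search(None, '(UNSEEN BEFORE %s)' % date_text)
    if typ != 'OK':
        raise RuntimeError('SEARCH failed: %r' % (data,))
    # data[0] is a space-separated list of message numbers (bytes), or empty.
    nums = (data[0] or b'').split()
    for start in range(0, len(nums), batch_size):
        batch = nums[start:start + batch_size]
        # STORE accepts a comma-separated message set, so flag a batch at once.
        message_set = b','.join(batch).decode('ascii')
        conn.store(message_set, '+FLAGS', '\\Seen')
    return len(nums)


def run_on_gmail(user, password, date_text):
    """Hypothetical wrapper: connect to Gmail over SSL and process INBOX."""
    conn = imaplib.IMAP4_SSL('imap.gmail.com')
    conn.login(user, password)
    try:
        conn.select('INBOX')
        return mark_before_as_read(conn, date_text)
    finally:
        conn.logout()
```

Calling `run_on_gmail(user, password, '01-Jan-2009')` would then process the inbox in batches of 500; a smaller batch size may be safer if the server rejects long commands.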

Philip Tinney answered Nov 15 '22 05:11


Based on Philip T.'s answer above, RFC 3501 and RFC 2822, I wrote a few lines of code that mark mails older than 10 days as read. A static list is used for the abbreviated month names. This is not particularly elegant, but Python's %b format string is locale-dependent, which could give unpleasant surprises. All IMAP commands are UID-based.

import imaplib, datetime

myAccount = imaplib.IMAP4(<imapserver>)
myAccount.login(<imapuser>, <password>)
myAccount.select(<mailbox>)

monthListRfc2822 = ['0', 'Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
                    'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']
beforeDate = datetime.datetime.today() - datetime.timedelta(days = 10)
beforeDateString = ("(BEFORE %s-%s-%s)"
                    % (beforeDate.strftime('%d'),
                       monthListRfc2822[beforeDate.month],
                       beforeDate.strftime('%Y')))
typ, data = myAccount.uid('SEARCH', beforeDateString)
for uid in data[0].split():
    myAccount.uid('STORE', uid, '+FLAGS', r'(\Seen)')

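By the way, "-" had to be used as the date delimiter in the search string in my case (dovecot IMAP server); dates with plain whitespace as the delimiter only returned IMAP errors. At first that seemed to contradict RFC 2822, but RFC 2822 only describes the dates in message headers. The SEARCH date syntax comes from RFC 3501, which defines it as day-month-year joined with hyphens, for example 01-Jan-2009.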
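The date handling above can also be wrapped in a small helper that can be checked without a mail server. This is only a sketch: the names `IMAP_MONTHS` and `imap_before_criterion`, and the optional `today` parameter (added so the result is reproducible), are assumptions and not part of the code above.

```python
import datetime

# Fixed English month abbreviations, so the result does not depend on the
# current locale the way strftime('%b') does.
IMAP_MONTHS = ('Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
               'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec')


def imap_before_criterion(days, today=None):
    """Build a '(BEFORE dd-Mon-yyyy)' search criterion for a cutoff `days` ago."""
    if today is None:
        today = datetime.date.today()
    cutoff = today - datetime.timedelta(days=days)
    # Day, month and year are joined with hyphens in the SEARCH date syntax.
    return '(BEFORE %02d-%s-%d)' % (cutoff.day,
                                    IMAP_MONTHS[cutoff.month - 1],
                                    cutoff.year)


print(imap_before_criterion(10, today=datetime.date(2009, 8, 18)))
# prints (BEFORE 08-Aug-2009)
```

The returned string can be passed directly as the search argument, for example `myAccount.uid('SEARCH', imap_before_criterion(10))`.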

pygrac answered Nov 15 '22 03:11


Rather than trying to parse our HTML, why not just use the IMAP interface? Hook it up to a standard mail client, then sort by date and mark whichever messages you want as read.

Marplesoft answered Nov 15 '22 04:11
