
New Python Gmail API - Only Retrieve Messages from Yesterday

I've been updating some scripts to the new Python Gmail API. However, I am unsure how to update the following code so that it only retrieves messages from yesterday. Can anyone show me how to do this?

The only way I can currently see is to loop through all messages and parse only those whose epoch timestamps fall in the correct time range. However, that seems horribly inefficient if I have thousands of messages. There must be a more efficient way to do this.

from apiclient import discovery
import oauth2client
from oauth2client import client
from oauth2client import tools
import os
import httplib2
import email
from apiclient.http import BatchHttpRequest
import base64
from bs4 import BeautifulSoup
import re
import datetime

try:
    import argparse
    flags = argparse.ArgumentParser(parents=[tools.argparser]).parse_args()
except ImportError:
    flags = None

SCOPES = 'https://www.googleapis.com/auth/gmail.readonly'
CLIENT_SECRET_FILE = '/Users/sokser/Downloads/client_secret.json'
APPLICATION_NAME = 'Gmail API Python Quickstart'


def get_credentials():
    """Gets valid user credentials from storage.

    If nothing has been stored, or if the stored credentials are invalid,
    the OAuth2 flow is completed to obtain the new credentials.

    Returns:
        Credentials, the obtained credential.
    """
    home_dir = os.path.expanduser('~')
    credential_dir = os.path.join(home_dir, '.credentials')
    if not os.path.exists(credential_dir):
        os.makedirs(credential_dir)
    credential_path = os.path.join(credential_dir,
                                   'gmail-python-quickstart.json')

    store = oauth2client.file.Storage(credential_path)
    credentials = store.get()
    if not credentials or credentials.invalid:
        flow = client.flow_from_clientsecrets(CLIENT_SECRET_FILE, SCOPES)
        flow.user_agent = APPLICATION_NAME
        if flags:
            credentials = tools.run_flow(flow, store, flags)
        else: # Needed only for compatibility with Python 2.6
            credentials = tools.run(flow, store)
        print('Storing credentials to ' + credential_path)
    return credentials

def visible(element):
    if element.parent.name in ['style', 'script', '[document]', 'head', 'title']:
        return False
    elif re.match('<!--.*-->', str(element)):
        return False
    return True

def main():
    """Shows basic usage of the Gmail API.

    Creates a Gmail API service object and outputs a list of label names
    of the user's Gmail account.
    """
    credentials = get_credentials()
    http = credentials.authorize(httplib2.Http())
    service = discovery.build('gmail', 'v1', http=http)
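    # Get yesterday's date and its epoch time (local midnight).
    # strftime("%s") is a platform-specific extension, so use timestamp() instead.
    yesterday = datetime.date.today() - datetime.timedelta(1)
    unix_time = int(datetime.datetime.combine(yesterday, datetime.time.min).timestamp())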

    messages = []

    message = service.users().messages().list(userId='me').execute()
    for m in message['messages']:
        #service.users().messages().get(userId='me',id=m['id'],format='full')
        message = service.users().messages().get(userId='me',id=m['id'],format='raw').execute()
        epoch = int(message['internalDate'])/1000

        msg_str = str(base64.urlsafe_b64decode(message['raw'].encode('ASCII')),'utf-8')
        mime_msg = email.message_from_string(msg_str)
        #print(message['payload']['parts'][0]['parts'])
        #print()
        mytext = None
        for part in mime_msg.walk():
            mime_msg.get_payload()
            #print(part)
            #print()
            if part.get_content_type() == 'text/plain':
                soup = BeautifulSoup(part.get_payload(decode=True))
                texts = soup.findAll(text=True)
                visible_texts = filter(visible,texts)
                mytext = ". ".join(visible_texts)
            if part.get_content_type() == 'text/html' and not mytext:
                mytext = part.get_payload(decode=True)
        print(mytext)
        print()

if __name__ == '__main__':
    main()

asked Dec 29 '15 16:12 by user2694306




1 Answer

You can pass a query to the messages.list method to search for messages within a date range. Any query supported by Gmail's advanced search works.

Your current call lists messages without any filter:

message = service.users().messages().list(userId='me').execute()

To search for messages sent yesterday instead, pass the q keyword argument with a query that uses the before: and after: operators.

from datetime import date, timedelta

today = date.today()
yesterday = today - timedelta(1)

# do your setup...

user_id = 'me'  # or the user's email address

# Dates have to be formatted as YYYY/MM/DD for Gmail,
# with no space between the operator and the date
query = "before:{0} after:{1}".format(today.strftime('%Y/%m/%d'),
                                      yesterday.strftime('%Y/%m/%d'))

response = service.users().messages().list(userId=user_id,
                                           q=query).execute()
# Process the response for messages...
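
For the processing step, keep in mind that messages.list returns results one page at a time. Here is a minimal sketch of collecting every matching message ID by following nextPageToken. The helper name list_all_message_ids is made up for illustration, and the use of .get('messages', []) assumes that a search with no hits omits the messages key from the response.

```python
def list_all_message_ids(service, user_id, query):
    """Return the IDs of all messages matching `query`, across all pages.

    `service` is assumed to be the object returned by discovery.build().
    """
    ids = []
    page_token = None
    while True:
        kwargs = {'userId': user_id, 'q': query}
        if page_token:
            # Only send pageToken once the API has handed us one.
            kwargs['pageToken'] = page_token
        response = service.users().messages().list(**kwargs).execute()
        # An empty result is assumed to have no 'messages' key at all.
        ids.extend(m['id'] for m in response.get('messages', []))
        page_token = response.get('nextPageToken')
        if not page_token:
            return ids
```

Each returned ID can then be passed to messages.get, as in the loop from the question.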

You can also try the query out on the Gmail messages.list reference page.
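
One caveat: as far as I know, date-only values such as 2015/12/28 are interpreted by Gmail in a fixed timezone rather than your local one, and the before: and after: operators also accept Unix timestamps in seconds. If the exact day boundary matters, a sketch along these lines may help. The function name yesterday_query is my own, and whether the boundaries are inclusive is something to verify against your mailbox.

```python
from datetime import date, datetime, time, timedelta


def yesterday_query(today=None):
    """Build a Gmail search query spanning yesterday in local time.

    Uses epoch seconds for both bounds (assumed to be accepted by
    the before: and after: operators).
    """
    today = today or date.today()
    # Local midnight at the start of yesterday and at the start of today.
    start = datetime.combine(today - timedelta(days=1), time.min)
    end = datetime.combine(today, time.min)
    return "after:{0} before:{1}".format(int(start.timestamp()),
                                         int(end.timestamp()))
```

The result can be passed as the q argument in place of the date-based query, for example q=yesterday_query().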

answered Oct 08 '22 11:10 by 逆さま