
Why does search in the Gmail API return different results than search on the Gmail website?

I'm using the Gmail API to search users' emails. I've created the following search query:

ticket after:2015/11/04 AND -from:me AND -in:trash

When I run this query in the browser interface of Gmail I get 11 messages (as expected). When I run the same query in the API however, I get only 10 messages. The code I use to query the gmail API is written in Python and looks like this:

searchQuery = 'ticket after:2015/11/04 AND -from:me AND -in:trash'
messagesObj = google.get('/gmail/v1/users/me/messages', data={'q': searchQuery}, token=token).data
print messagesObj.resultSizeEstimate  # 10

I forwarded the missing message to another Gmail address and tested from that account. To my surprise, it does show up in an API search there, so the trouble is not the email itself.

After endlessly emailing around through various test Gmail accounts, I think (but am not 100% sure) that the browser-interface search uses a different definition of "me". It seems that the API search excludes emails sent from other addresses that carry the same display name, while the browser search does include them. For example: if "Pete Kramer" sends an email from [email protected] to [email protected] (both accounts have their name set to "Pete Kramer"), it will show up in the browser search but will NOT show up in the API search.

Can anybody confirm that this is the problem? And if so, is there a way to work around it and get the same results as the browser search returns? Or does anybody else know why the results from the Gmail browser search differ from those of the Gmail API search? All tips are welcome!

kramer65 asked Nov 05 '15 19:11



3 Answers

I suspect it is the after: query parameter that is giving you trouble. 2015/11/04 is not a valid ES5 ISO 8601 date (that would be written 2015-11-04). You could try the alternative form, after:<time_in_seconds_since_epoch>:

# 2015-11-04 00:00:00 UTC <=> 1446595200

searchQuery = 'ticket AND after:1446595200 AND -from:me AND -in:trash'
messagesObj = google.get('/gmail/v1/users/me/messages', data={'q': searchQuery}, token=token).data
print messagesObj.resultSizeEstimate  # 11 hopefully!
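
For reference, the epoch value in the comment above can be derived rather than hard-coded. The following is a minimal Python 3 sketch; the helper name gmail_after_clause is hypothetical, and it assumes the cutoff should be midnight UTC, which may not match the timezone Gmail applies to a plain after:2015/11/04 date.

```python
import calendar
from datetime import date


def gmail_after_clause(day):
    """Return an 'after:<epoch seconds>' clause for midnight UTC on `day`.

    Hypothetical helper for illustration; midnight UTC is an assumption.
    """
    # calendar.timegm treats the time tuple as UTC, unlike time.mktime,
    # which would apply the local timezone of the machine running the code.
    return 'after:%d' % calendar.timegm(day.timetuple())


search_query = 'ticket AND %s AND -from:me AND -in:trash' % gmail_after_clause(
    date(2015, 11, 4))
print(search_query)  # ticket AND after:1446595200 AND -from:me AND -in:trash
```

If the expected 11th message arrived close to midnight, shifting the cutoff by the account's UTC offset would be one way to check whether the difference is a timezone effect.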
Tholle answered Nov 18 '22 21:11



The q parameter of /messages/list works the same as the web UI for me (I tried it at https://developers.google.com/gmail/api/v1/reference/users/messages/list#try-it ).

I think the problem is that you are calling /messages rather than /messages/list.

rds answered Nov 18 '22 19:11



The first time your application connects to Gmail, or if partial synchronization is not available, you must perform a full sync. In a full sync operation, your application should retrieve and store as many of the most recent messages or threads as are necessary for your purpose. For example, if your application displays a list of recent messages, you may wish to retrieve and cache enough messages to allow for a responsive interface if the user scrolls beyond the first several messages displayed. The general procedure for performing a full sync operation is as follows:

  1. Call messages.list to retrieve the first page of message IDs.

  2. Create a batch request of messages.get requests for each of the messages returned by the list request. If your application displays message contents, you should use format=FULL or format=RAW the first time your application retrieves a message and cache the results to avoid additional retrieval operations. If you are retrieving a previously cached message, you should use format=MINIMAL to reduce the size of the response as only the labelIds may change.

  3. Merge the updates into your cached results. Your application should store the historyId of the most recent message (the first message in the list response) for future partial synchronization.

Note: You can also perform synchronization using the equivalent Threads resource methods. This may be advantageous if your application primarily works with threads or only requires message metadata.
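
The three steps above could be sketched roughly as follows in Python 3. This is not taken from the guide: the function name full_sync is hypothetical, and service is assumed to be a Gmail client object such as the one returned by googleapiclient.discovery.build('gmail', 'v1', ...). For brevity the sketch issues one messages.get call per message instead of the batch request the guide recommends, and it only reads the first page of results.

```python
def full_sync(service, query=None, user_id='me'):
    """Fetch the first page of message IDs, then each message.

    Hypothetical helper. Returns (messages_by_id, newest_history_id).
    """
    messages = service.users().messages()

    # Step 1: retrieve the first page of message IDs (newest first).
    list_kwargs = {'userId': user_id}
    if query:
        list_kwargs['q'] = query
    listing = messages.list(**list_kwargs).execute()
    refs = listing.get('messages', [])

    # Step 2: fetch every listed message in full and cache it.
    # Simplification: individual gets rather than one batch request.
    cache = {}
    for ref in refs:
        cache[ref['id']] = messages.get(
            userId=user_id, id=ref['id'], format='full').execute()

    # Step 3: keep the historyId of the most recent message, which is
    # the first entry of the list response, for later partial syncs.
    newest_history_id = cache[refs[0]['id']]['historyId'] if refs else None
    return cache, newest_history_id
```

The returned history ID is the value to pass as startHistoryId the next time the application synchronizes.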

Partial synchronization

If your application has synchronized recently, you can perform a partial sync using the history.list method to return all history records newer than the startHistoryId you specify in your request. History records provide message IDs and type of change for each message, such as message added, deleted, or labels modified since the time of the startHistoryId. You can obtain and store the historyId of the most recent message from a full or partial sync to provide as a startHistoryId for future partial synchronization operations.

Limitations

History records are typically available for at least one week and often longer. However, the time period for which records are available may be significantly less and records may sometimes be unavailable in rare cases. If the startHistoryId supplied by your client is outside the available range of history records, the API returns an HTTP 404 error response. In this case, your client must perform a full sync as described in the previous section.
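
A partial sync with the 404 fallback described above might look like the Python 3 sketch below. The function name partial_sync is hypothetical, service is again assumed to be a googleapiclient-style Gmail client, and the error check assumes the raised exception exposes resp.status the way googleapiclient.errors.HttpError does. A generic except clause is used only to keep the sketch free of third-party imports.

```python
def partial_sync(service, start_history_id, user_id='me'):
    """Collect history records newer than start_history_id.

    Hypothetical helper. Returns (records, latest_history_id), or None
    when the API answers 404 and a full sync is required instead.
    """
    history = service.users().history()
    records = []
    latest = start_history_id
    page_token = None
    while True:
        kwargs = {'userId': user_id, 'startHistoryId': start_history_id}
        if page_token:
            kwargs['pageToken'] = page_token
        try:
            page = history.list(**kwargs).execute()
        except Exception as err:
            # Assumption: the error carries resp.status, as HttpError does.
            status = getattr(getattr(err, 'resp', None), 'status', None)
            if status == 404:
                return None  # startHistoryId is out of range
            raise
        records.extend(page.get('history', []))
        latest = page.get('historyId', latest)
        page_token = page.get('nextPageToken')
        if not page_token:
            return records, latest
```

A None result is the signal to run the full sync again and store a fresh historyId.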

From the Gmail API documentation: https://developers.google.com/gmail/api/guides/sync

anubhav answered Nov 18 '22 21:11
