
Access Google Trends Data without a wrapper, or with the API: Python

I am trying to write a Python program to gather data from Google Trends (GT). Specifically, I want to open URLs automatically and read the values that are displayed in the line graphs:

I would be happy either downloading the CSV files or web-scraping the values (based on my reading of Inspect Element, cleaning the data would only require a simple split or two). I have many searches to run, one for each of many different keywords.

I am generating many URLs to gather data from Google Trends, based on the actual URL from a test search. Example of a URL: https://trends.google.com/trends/explore?q=sports%20cars&geo=US. Opening this URL manually in a browser shows the relevant GT page. The problem comes when I try to access it from a program.

Most responses I have seen suggest using public modules from pip (e.g. PyTrends and the "Unofficial Google Trends API"). My project manager has insisted that I do not use modules that are not created directly by the site (i.e. APIs are acceptable, but only an official Google API). Only BeautifulSoup has been sanctioned as a third-party package (don't ask why).

Below is an example of the code I have tried. I know it is basic, but on the very first request I got:

HTTPError: HTTP Error 429: unknown": too many requests.

Some responses to other questions mention Google Trends API - is this real? I could not find any documentation on an official API.

Here is another post which outlined a solution that I have tried that did not work for me:

https://codereview.stackexchange.com/questions/208277/web-scraping-google-trends-in-python

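from urllib.request import urlopen
from bs4 import BeautifulSoup as bs

def get_divs(url):
    # Fetch the page and return every <div> element found in the HTML.
    html = urlopen(url).read()
    soup = bs(html, 'html.parser')
    return soup.find_all('div')

divs = get_divs('https://trends.google.com/trends/explore?q=sports%20cars&geo=US')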
asked May 28 '19 11:05 by harry



1 Answer

The page gets its data from an API that you can find in the browser's network tab:

import requests
import json

r = requests.get('https://trends.google.com/trends/api/widgetdata/multiline?hl=en-GB&tz=-60&req=%7B%22time%22:%222018-05-29+2019-05-29%22,%22resolution%22:%22WEEK%22,%22locale%22:%22en-GB%22,%22comparisonItem%22:%5B%7B%22geo%22:%7B%22country%22:%22US%22%7D,%22complexKeywordsRestriction%22:%7B%22keyword%22:%5B%7B%22type%22:%22BROAD%22,%22value%22:%22sports+cars%22%7D%5D%7D%7D%5D,%22requestOptions%22:%7B%22property%22:%22%22,%22backend%22:%22IZG%22,%22category%22:0%7D%7D&token=APP6_UEAAAAAXO-yaYekqJ7Tf2nuoLBAigMSW7axoLTL&tz=-60')
data = json.loads(r.text.lstrip(")]}\',\n"))

for item in data['default']['timelineData']:
    print(item['formattedAxisTime'], item['value'])
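
The lstrip call above removes the guard characters that precede the JSON body. As a minimal sketch of the same step done explicitly, the helpers below strip the prefix and collect the timeline rows. The function names and the sample payload are made up for illustration; only the key names (default, timelineData, formattedAxisTime, value) come from the loop above, and the real response may contain more fields.

```python
import json

# Guard prefix seen at the start of the response in the snippet above.
XSSI_PREFIX = ")]}'"

def parse_trends_body(text):
    """Drop the guard prefix, if present, and decode the remaining JSON."""
    body = text.lstrip()
    if body.startswith(XSSI_PREFIX):
        body = body[len(XSSI_PREFIX):]
    # The prefix may be followed by a comma and a newline, as in the answer.
    return json.loads(body.lstrip(",\n "))

def timeline_rows(data):
    """Return (formattedAxisTime, value) pairs from a parsed response."""
    return [(item["formattedAxisTime"], item["value"])
            for item in data["default"]["timelineData"]]

# Hypothetical sample payload, shaped like the data the loop above reads.
sample = ")]}',\n" + json.dumps({"default": {"timelineData": [
    {"formattedAxisTime": "2 Jun 2018", "value": [71]},
    {"formattedAxisTime": "9 Jun 2018", "value": [68]},
]}})

rows = timeline_rows(parse_trends_body(sample))
for axis_time, value in rows:
    print(axis_time, value)
```

With a live response you would pass r.text to parse_trends_body instead of the sample string.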

We can unquote the URL to get a better idea of what is going on:

import urllib.parse

url = 'https://trends.google.com/trends/api/widgetdata/multiline?hl=en-GB&tz=-60&req=%7B%22time%22:%222018-05-29+2019-05-29%22,%22resolution%22:%22WEEK%22,%22locale%22:%22en-GB%22,%22comparisonItem%22:%5B%7B%22geo%22:%7B%22country%22:%22US%22%7D,%22complexKeywordsRestriction%22:%7B%22keyword%22:%5B%7B%22type%22:%22BROAD%22,%22value%22:%22sports+cars%22%7D%5D%7D%7D%5D,%22requestOptions%22:%7B%22property%22:%22%22,%22backend%22:%22IZG%22,%22category%22:0%7D%7D&token=APP6_UEAAAAAXO-yaYekqJ7Tf2nuoLBAigMSW7axoLTL&tz=-60'
print(urllib.parse.unquote(url))

This yields:

'https://trends.google.com/trends/api/widgetdata/multiline?hl=en-GB&tz=-60&req={"time":"2018-05-29+2019-05-29","resolution":"WEEK","locale":"en-GB","comparisonItem":[{"geo":{"country":"US"},"complexKeywordsRestriction":{"keyword":[{"type":"BROAD","value":"sports+cars"}]}}],"requestOptions":{"property":"","backend":"IZG","category":0}}&token=APP6_UEAAAAAXO-yaYekqJ7Tf2nuoLBAigMSW7axoLTL&tz=-60'
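
If you want to inspect the pieces rather than read the whole string, one option is to split the query and load the req parameter as JSON. This is only a sketch using the standard library; decode_widget_url is a name I made up. Note that parse_qs treats "+" as an encoded space, so "sports+cars" comes back as "sports cars".

```python
import json
from urllib.parse import urlsplit, parse_qs

def decode_widget_url(url):
    """Split a Trends widget URL into its path, decoded req payload and raw query."""
    parts = urlsplit(url)
    query = parse_qs(parts.query)       # every value is a list of strings
    req = json.loads(query["req"][0])   # req carries a JSON document
    return parts.path, req, query

# The URL from the snippet above, split over several lines for readability.
url = (
    "https://trends.google.com/trends/api/widgetdata/multiline?hl=en-GB&tz=-60"
    "&req=%7B%22time%22:%222018-05-29+2019-05-29%22,%22resolution%22:%22WEEK%22,"
    "%22locale%22:%22en-GB%22,%22comparisonItem%22:%5B%7B%22geo%22:%7B%22country%22:%22US%22%7D,"
    "%22complexKeywordsRestriction%22:%7B%22keyword%22:%5B%7B%22type%22:%22BROAD%22,"
    "%22value%22:%22sports+cars%22%7D%5D%7D%7D%5D,"
    "%22requestOptions%22:%7B%22property%22:%22%22,%22backend%22:%22IZG%22,%22category%22:0%7D%7D"
    "&token=APP6_UEAAAAAXO-yaYekqJ7Tf2nuoLBAigMSW7axoLTL&tz=-60"
)

path, req, query = decode_widget_url(url)
print(path)
print(json.dumps(req, indent=2))
print(sorted(query))  # names of the query parameters
```

Seeing req as a nested structure makes it easier to spot which fields (time range, resolution, geo, keyword) you might want to vary.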

You'll need to explore how transferable the elements of this request are to other searches.

For example, I looked at the search term banana and this was the resulting request URL:

unquoted:

'https://trends.google.com/trends/api/explore?hl=en-GB&tz=-60&req={"comparisonItem":[{"keyword":"banana","geo":"US","time":"today+12-m"}],"category":0,"property":""}&tz=-60'

quoted:

'https://trends.google.com/trends/api/explore?hl=en-GB&tz=-60&req=%7B%22comparisonItem%22:%5B%7B%22keyword%22:%22banana%22,%22geo%22:%22US%22,%22time%22:%22today+12-m%22%7D%5D,%22category%22:0,%22property%22:%22%22%7D&tz=-60'
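
To experiment with other keywords, you could build the same explore URL from a dictionary instead of editing the string by hand. The sketch below only constructs the URL and sends nothing. build_explore_url and its defaults are my own names, I am assuming the "+" in "today+12-m" is an encoded space, and I have not checked whether the server accepts requests built this way.

```python
import json
from urllib.parse import urlencode, urlsplit, parse_qs

EXPLORE_URL = "https://trends.google.com/trends/api/explore"

def build_explore_url(keyword, geo="US", time="today 12-m", hl="en-GB", tz=-60):
    """Build an explore URL shaped like the banana example (hypothetical helper)."""
    req = {
        "comparisonItem": [{"keyword": keyword, "geo": geo, "time": time}],
        "category": 0,
        "property": "",
    }
    # Compact separators keep the JSON free of extra spaces before encoding.
    params = {"hl": hl, "tz": tz, "req": json.dumps(req, separators=(",", ":"))}
    return EXPLORE_URL + "?" + urlencode(params)

url = build_explore_url("banana")
print(url)

# Round trip: decoding the query gives back the same payload.
decoded = json.loads(parse_qs(urlsplit(url).query)["req"][0])
print(decoded)
```

urlencode also percent-encodes the colons and commas, so the string looks different from the quoted URL above, but it decodes to the same req payload.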
answered Sep 26 '22 00:09 by QHarr