I've been looking for an API to automatically retrieve Google Insights data as part of another algorithm, but have been unable to find anything. The first result on Google leads to a site with a Python plugin that is now out of date.
Does such an API exist, or has anyone written a plugin, perhaps for Python?
I just started searching for this and found a good way to retrieve the data with Python, shown in the following script. Basically, it passes a special quote symbol (GOOGLEINDEX_US: followed by the index name) to Google's historical finance database.
    from urllib.request import urlopen  # on Python 2: from urllib2 import urlopen
    from pandas import DatetimeIndex, read_csv

    def get_index(gindex, startdate=20040101):
        """
        API wrapper for Google Domestic Trends data.
        https://www.google.com/finance/domestic_trends
        Available Indices:
        'ADVERT', 'AIRTVL', 'AUTOBY', 'AUTOFI', 'AUTO', 'BIZIND', 'BNKRPT',
        'COMLND', 'COMPUT', 'CONSTR', 'CRCARD', 'DURBLE', 'EDUCAT', 'INVEST',
        'FINPLN', 'FURNTR', 'INSUR', 'JOBS', 'LUXURY', 'MOBILE', 'MTGE',
        'RLEST', 'RENTAL', 'SHOP', 'TRAVEL', 'UNEMPL'
        """
        base_url = 'http://www.google.com/finance/historical?q=GOOGLEINDEX_US:'
        full_url = '%s%s&output=csv&startdate=%s' % (base_url, gindex, startdate)
        # Read the CSV response, using the first column (the date) as the index.
        dframe = read_csv(urlopen(full_url), index_col=0)
        dframe.index = DatetimeIndex(dframe.index)
        dframe = dframe.sort_index(axis=0)
        # Drop columns that hold a single repeated value (they carry no information).
        # Iterate over a copy of the column list because pop() modifies the frame.
        for col in list(dframe.columns):
            if len(dframe[col].unique()) == 1:
                dframe.pop(col)
        # If only 'Close' is left, rename it after the requested index.
        if len(dframe.columns) == 1 and dframe.columns[0] == 'Close':
            dframe.columns = [gindex]
        return dframe[gindex]
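To see what the cleanup in get_index does without touching the network, here is a minimal sketch that applies the same idea using only the standard library: build the query URL, then drop every column whose value never changes. The sample CSV rows are invented for illustration, and build_url and drop_constant_columns are hypothetical helper names, not part of pandas or of any Google API.

```python
import csv
import io
from urllib.parse import urlencode

# Endpoint taken from the script above.
BASE_URL = 'http://www.google.com/finance/historical'


def build_url(gindex, startdate=20040101):
    """Build the CSV query URL for one Domestic Trends index (hypothetical helper)."""
    params = [('q', 'GOOGLEINDEX_US:' + gindex),
              ('output', 'csv'),
              ('startdate', startdate)]
    # safe=':' keeps the colon in the symbol instead of escaping it as %3A.
    return BASE_URL + '?' + urlencode(params, safe=':')


def drop_constant_columns(rows):
    """Remove every column whose value is identical in all rows (hypothetical helper)."""
    if not rows:
        return rows
    keep = [key for key in rows[0]
            if len({row[key] for row in rows}) > 1]
    return [{key: row[key] for key in keep} for row in rows]


# Invented sample data: only Date and Close vary between rows.
SAMPLE_CSV = """Date,Open,High,Low,Close,Volume
2-Jan-04,-,-,-,1.02,0
1-Jan-04,-,-,-,1.00,0
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
cleaned = drop_constant_columns(rows)
print(build_url('TRAVEL'))
print(cleaned)
```

In this sketch the date stays as an ordinary column, whereas the pandas version uses it as the index, so only Date and Close survive the cleanup.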
As far as I can tell, there is no API available as of yet, and neither is there a working implementation of a method for extracting data from Google Insights. However, I have found a solution to my (slightly more specific) problem, which could really just be solved by knowing how many times certain terms are searched for.
This can be done by interfacing with the Google Suggest protocol used by web browser search bars. When you give it a word, it returns a list of suggested phrases as well as the number of times each phrase has been searched (I'm not sure about the time unit, presumably in the last year).
Here is some Python code for doing this, slightly adapted from code by odewahn1 at O'Reilly Answers and working on Python 2.6 and lower:
    from sgmllib import SGMLParser
    import urllib2
    import urllib

    # Define the class that will parse the suggestion XML
    class PullSuggestions(SGMLParser):

        def reset(self):
            SGMLParser.reset(self)
            self.suggestions = []
            self.queries = []

        # Called for each <suggestion> tag; the phrase is in its "data" attribute
        def start_suggestion(self, attrs):
            for a in attrs:
                if a[0] == 'data':
                    self.suggestions.append(a[1])

        # Called for each <num_queries> tag; the count is in its "int" attribute
        def start_num_queries(self, attrs):
            for a in attrs:
                if a[0] == 'int':
                    self.queries.append(a[1])

    # ENTER THE BASE QUERY HERE
    base_query = ""  # This is the base query
    base_query += "%s"
    alphabet = "abcdefghijklmnopqrstuvwxyz"

    # Append each letter to the base query and request suggestions for it
    for letter in alphabet:
        q = base_query % letter
        query = urllib.urlencode({'q': q})
        url = "http://google.com/complete/search?output=toolbar&%s" % query
        res = urllib2.urlopen(url)
        parser = PullSuggestions()
        parser.feed(res.read())
        parser.close()
        # zip() stops at the shorter list, so a missing count cannot raise an IndexError
        for suggestion, count in zip(parser.suggestions, parser.queries):
            print "%s\t%s" % (suggestion, count)
This at least solves the problem in part, but unfortunately it is still difficult to reliably obtain the number of searches for any specific word or phrase and impossible to obtain the search history of different phrases.
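Since sgmllib is not available in Python 3, the parsing step of the script above can be sketched with xml.etree.ElementTree instead. This is only an illustration under stated assumptions: the sample XML is made up, its toplevel and CompleteSuggestion wrapper elements are assumed rather than taken from the script, and suggest_url and parse_suggestions are hypothetical helper names. Only the suggestion/data and num_queries/int pairs come from the parser shown earlier.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Endpoint taken from the script above.
SUGGEST_URL = 'http://google.com/complete/search'


def suggest_url(query):
    """Build the toolbar-format suggestion URL for a query (hypothetical helper)."""
    return SUGGEST_URL + '?' + urlencode({'output': 'toolbar', 'q': query})


def parse_suggestions(xml_text):
    """Return (phrase, count) pairs; count is None when no query count is present."""
    pairs = []
    root = ET.fromstring(xml_text)
    # Assumption: each suggestion sits in its own <CompleteSuggestion> element.
    for item in root.iter('CompleteSuggestion'):
        suggestion = item.find('suggestion')
        if suggestion is None:
            continue
        num = item.find('num_queries')
        raw = num.get('int') if num is not None else None
        count = int(raw) if raw is not None else None
        pairs.append((suggestion.get('data'), count))
    return pairs


# Made-up response for illustration; the second entry has no count on purpose.
SAMPLE_XML = (
    '<toplevel>'
    '<CompleteSuggestion><suggestion data="python api"/>'
    '<num_queries int="1200"/></CompleteSuggestion>'
    '<CompleteSuggestion><suggestion data="python argparse"/></CompleteSuggestion>'
    '</toplevel>'
)

pairs = parse_suggestions(SAMPLE_XML)
print(suggest_url('python a'))
for phrase, count in pairs:
    print("%s\t%s" % (phrase, count))
```

Keeping each phrase and its count together in one pair avoids relying on two parallel lists staying the same length, which matters when a suggestion comes back without a count.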