python search with image google images

Tags:

python

search

image

I'm having a very tough time searching Google Image Search with Python. I need to do it using only standard Python libraries (so urllib, urllib2, json, ...).

Can somebody please help? Assume the image is jpeg.jpg and is in the same folder I'm running Python from.

I've tried a hundred different code versions, using headers, user-agent, base64 encoding, and different URLs (images.google.com, http://images.google.com/searchbyimage?hl=en&biw=1060&bih=766&gbv=2&site=search&image_url={{URL To your image}}&sa=X&ei=H6RaTtb5JcTeiALlmPi2CQ&ved=0CDsQ9Q8, etc.).

Nothing works, it's always an error: 404, 401 or broken pipe :(

Please show me a Python script that will actually search Google Images with my own image as the search data ('jpeg.jpg' stored on my computer/device).

Thank you to whoever can solve this,

Dave:)

348

asked Jun 28 '12 10:06

user1488252

2 Answers

I use the following code in Python to search for Google images and download the images to my computer:

# Python 2 code: urllib2, FancyURLopener and the print statement are not available in Python 3
import json
import time
import urllib
import urllib2
from urllib import FancyURLopener

# Define the search term and percent-encode it (' ' becomes '%20') so it is safe to put in the URL
searchTerm = urllib.quote("hello world")


# FancyURLopener subclass that sends a browser-like User-Agent string
class MyOpener(FancyURLopener):
    version = 'Mozilla/5.0 (Windows; U; Windows NT 5.1; it; rv:1.8.1.11) Gecko/20071127 Firefox/2.0.0.11'
myopener = MyOpener()

# Number of images downloaded so far, used to name the files
count = 0

for i in range(0, 10):
    # 'start' advances by 4 on each iteration in order to request a new set of images for each loop.
    # Replace MyIP with your own IP address.
    url = ('https://ajax.googleapis.com/ajax/services/search/images?' + 'v=1.0&q=' + searchTerm + '&start=' + str(i * 4) + '&userip=MyIP')
    print url
    request = urllib2.Request(url, None, {'Referer': 'testing'})
    response = urllib2.urlopen(request)

    # Parse the JSON response
    results = json.load(response)
    data = results['responseData']
    if data is None:
        # Stop if the response carries no data
        break
    dataInfo = data['results']

    # Iterate over each result and get the unescaped URL
    for myUrl in dataInfo:
        count = count + 1
        print myUrl['unescapedUrl']

        # Download the image and save it as 1.jpg, 2.jpg, ...
        myopener.retrieve(myUrl['unescapedUrl'], str(count) + '.jpg')

    # Sleep for one second to prevent IP blocking from Google
    time.sleep(1)

You can also find very useful information here.
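Since the question asks for standard libraries only, here is a minimal Python 3 sketch of the two pure steps in the script above: building the paged request URL and pulling the image URLs out of the decoded JSON. It makes no network call. The helper names `build_search_url` and `extract_image_urls` are made up for illustration, the endpoint is simply the one used above (it may no longer respond), and the sample payload is hand-written in the shape the script expects, not real API output.

```python
import json
from urllib.parse import urlencode

# Endpoint copied from the script above; assumed, and possibly no longer available.
API_ENDPOINT = "https://ajax.googleapis.com/ajax/services/search/images"


def build_search_url(term, page, page_size=4):
    """Return the request URL for one page of results (hypothetical helper)."""
    # urlencode percent-encodes the values and joins them with '&'
    params = {"v": "1.0", "q": term, "start": page * page_size}
    return API_ENDPOINT + "?" + urlencode(params)


def extract_image_urls(payload):
    """Collect the 'unescapedUrl' values from a decoded JSON response."""
    # Treat a missing or null 'responseData' as an empty result set
    data = payload.get("responseData") or {}
    return [item["unescapedUrl"] for item in data.get("results", [])]


# Hand-written sample in the shape the script above reads, not real API output
sample = json.loads(
    '{"responseData": {"results": [{"unescapedUrl": "http://example.com/a.jpg"}]}}'
)
first_url = build_search_url("hello world", 0)
found = extract_image_urls(sample)
```

In Python 3 the actual request would go through `urllib.request.urlopen`, which replaces `urllib2.urlopen`.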

183

answered Oct 12 '22 11:10

Jaime Ivan Cervantes

The Google Image Search API is deprecated, so this script requests the regular Google search results page instead and uses a regular expression and Beautiful Soup to find and download the images:

# Python 2 code: urllib2 and the print statement are not available in Python 3
from bs4 import BeautifulSoup
import re
import urllib2
import os


def get_soup(url, header):
  # Fetch the page with the given headers and parse it with the built-in HTML parser
  return BeautifulSoup(urllib2.urlopen(urllib2.Request(url, headers=header)), "html.parser")

image_type = "Action"
# you can change the query for the image here
query = "Terminator 3 Movie"
query = '+'.join(query.split())
url = "https://www.google.co.in/search?es_sm=122&source=lnms&tbm=isch&sa=X&ei=4r_cVID3NYayoQTb4ICQBA&ved=0CAgQ_AUoAQ&biw=1242&bih=619&q=" + query

print url
header = {'User-Agent': 'Mozilla/5.0'}
soup = get_soup(url, header)

# Keep only the <img> tags whose src points at gstatic.com
images = [a['src'] for a in soup.find_all("img", {"src": re.compile(r"gstatic\.com")})]
#print images

# add the directory for your images here
DIR = "C:\\Users\\hp\\Pictures\\valentines\\"
for img in images:
  raw_img = urllib2.urlopen(img).read()
  # Number each file after the ones of this type already saved in DIR
  cntr = len([i for i in os.listdir(DIR) if image_type in i]) + 1
  print cntr
  with open(DIR + image_type + "_" + str(cntr) + ".jpg", 'wb') as f:
    f.write(raw_img)
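If Beautiful Soup is not an option (the question asks for standard libraries only), the `img` filtering step can be approximated with `html.parser` from the standard library. This is only a sketch in Python 3: the class `ImageSrcParser` and the function `find_image_sources` are names invented here, and the sample HTML is made up for illustration rather than taken from a real results page.

```python
import re
from html.parser import HTMLParser


class ImageSrcParser(HTMLParser):
    """Collect the src of every <img> tag whose src matches a pattern."""

    def __init__(self, pattern):
        super().__init__()
        self.pattern = re.compile(pattern)
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; self-closing <img/> tags
        # also end up here through the default handle_startendtag
        if tag != "img":
            return
        src = dict(attrs).get("src")
        if src and self.pattern.search(src):
            self.sources.append(src)


def find_image_sources(html, pattern=r"gstatic\.com"):
    """Return the matching image URLs found in an HTML string (hypothetical helper)."""
    parser = ImageSrcParser(pattern)
    parser.feed(html)
    return parser.sources


# Made-up page fragment, only to exercise the parser
sample_html = (
    '<html><body>'
    '<img src="https://encrypted-tbn0.gstatic.com/images?q=abc">'
    '<img src="https://example.com/logo.png">'
    '<img alt="no source">'
    '<img src="https://t1.gstatic.com/images?q=def"/>'
    '</body></html>'
)
matches = find_image_sources(sample_html)
```

The returned list plays the same role as `images` in the script above, so the download loop can stay as it is apart from swapping `urllib2.urlopen` for `urllib.request.urlopen`.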

answered Oct 12 '22 10:10

rishabhr0y
