I’m trying to download a CSV file from this site:
http://www.nasdaq.com/screening/companies-by-name.aspx
If I enter this URL in my Chrome browser, the CSV file download starts immediately, and I get a file with data on a few thousand companies. However, if I use the code below I get an access denied error. There is no login on this page, so what is the Python code doing differently?
from urllib import urlopen
response = urlopen('http://www.nasdaq.com/screening/companies-by-name.aspx?&render=download')
csv = response.read()
# Save the string to a file
csvstr = str(csv).strip("b'")
lines = csvstr.split("\\n")
f = open(r"C:\Users\Ankit\historical.csv", "w")  # raw string so the backslashes aren't treated as escapes
for line in lines:
    f.write(line + "\n")
f.close()
The easy way to resolve the error is to pass a valid user-agent as a header parameter, as shown below. You can also set a timeout in case the website doesn't respond; Python raises a socket exception if no response arrives within the given timeout period.
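For the timeout part, a minimal sketch (urlopen accepts a timeout keyword in both urllib2 and Python 3's urllib.request; the 10-second value here is just an illustration):

import socket
import urllib2

try:
    # Give the server at most 10 seconds to respond
    response = urllib2.urlopen('http://www.nasdaq.com/screening/companies-by-name.aspx', timeout=10)
except socket.timeout:
    print 'No response within 10 seconds'
except urllib2.URLError as e:
    # Connection-level timeouts can also surface as URLError with a socket.timeout reason
    print 'Failed to reach the server:', e.reason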
URLError – raised for malformed URLs, or when the URL cannot be fetched because of connectivity problems; it has a 'reason' attribute that tells you why the request failed. HTTPError – raised for HTTP-specific errors, such as authentication failures; it is a subclass of URLError.
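Here is a sketch of how those two exceptions are typically told apart (the same names live under urllib.error in Python 3):

import urllib2

try:
    response = urllib2.urlopen('http://www.nasdaq.com/screening/companies-by-name.aspx?&render=download')
except urllib2.HTTPError as e:
    # HTTP-level failure: the server answered, but with an error status
    print 'The server returned HTTP error', e.code
except urllib2.URLError as e:
    # Everything else: DNS failure, refused connection, and so on
    print 'Failed to reach the server:', e.reason
else:
    print response.read()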
True, if you want to avoid adding any dependencies, urllib is available. But note that even the official Python documentation recommends the requests library: "The Requests package is recommended for a higher-level HTTP client interface."
urllib2 (urllib.request in Python 3) is a Python module for fetching URLs (Uniform Resource Locators). It offers a very simple interface in the form of the urlopen function, which is capable of fetching URLs using a variety of different protocols.
The default User-Agent header sent by urllib2 (and likewise urllib) is "Python-urllib/2.7" (replace 2.7 with your version of Python).
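If you want to confirm what your interpreter sends by default, one way (a sketch, just inspecting the opener's default header list) is:

import urllib2

opener = urllib2.build_opener()
# The default headers include the Python user agent, e.g. [('User-agent', 'Python-urllib/2.7')]
print opener.addheaders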
You're getting a 403 error because the NASDAQ server doesn't seem to want to send content to this user agent. You can “spoof” the user agent header, and then it downloads successfully. Here’s a minimal example:
import urllib2

DOWNLOAD_URL = 'http://www.nasdaq.com/screening/companies-by-name.aspx?&render=download'

# Pretend to be a regular browser so the server doesn't reject the request
hdr = {'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.11 (KHTML, like Gecko) Chrome/23.0.1271.64 Safari/537.11'}

req = urllib2.Request(DOWNLOAD_URL, headers=hdr)
try:
    page = urllib2.urlopen(req)
except urllib2.HTTPError as e:
    # Print the error body the server sent along with the status code
    print e.fp.read()
else:
    content = page.read()
    print content
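If you're on Python 3, urllib2 was merged into urllib.request, so a rough equivalent of the snippet above would be:

from urllib.request import Request, urlopen
from urllib.error import HTTPError

DOWNLOAD_URL = 'http://www.nasdaq.com/screening/companies-by-name.aspx?&render=download'
hdr = {'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.11 (KHTML, like Gecko) Chrome/23.0.1271.64 Safari/537.11'}

req = Request(DOWNLOAD_URL, headers=hdr)
try:
    page = urlopen(req)
except HTTPError as e:
    print(e.read())
else:
    print(page.read())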
Or you can use python-requests:
import requests
url = 'http://www.nasdaq.com/screening/companies-by-name.aspx'
params = {'render': 'download'}  # requests builds the ?render=download query string from this dict
resp = requests.get(url, params=params)
print resp.text
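To actually end up with a CSV file on disk (what the question is after), you can write the response body out directly. This is just a sketch: it reuses the browser User-Agent header in case the default python-requests agent is blocked as well, and the Windows path is simply the one from the question.

import requests

url = 'http://www.nasdaq.com/screening/companies-by-name.aspx'
params = {'render': 'download'}
hdr = {'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.11 (KHTML, like Gecko) Chrome/23.0.1271.64 Safari/537.11'}

resp = requests.get(url, params=params, headers=hdr)
resp.raise_for_status()  # fail loudly on a 403 instead of silently saving an error page

# Write raw bytes so newlines and encoding are preserved exactly as sent
with open(r'C:\Users\Ankit\historical.csv', 'wb') as f:
    f.write(resp.content)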