I'm making a Python script for personal use, but it's not working for Wikipedia...
This works:
import urllib2, sys
from bs4 import BeautifulSoup
site = "http://youtube.com"
page = urllib2.urlopen(site)
soup = BeautifulSoup(page)
print soup
This doesn't work:
import urllib2, sys
from bs4 import BeautifulSoup
site= "http://en.wikipedia.org/wiki/StackOverflow"
page = urllib2.urlopen(site)
soup = BeautifulSoup(page)
print soup
This is the error:
Traceback (most recent call last):
  File "C:\Python27\wiki.py", line 5, in <module>
    page = urllib2.urlopen(site)
  File "C:\Python27\lib\urllib2.py", line 126, in urlopen
    return _opener.open(url, data, timeout)
  File "C:\Python27\lib\urllib2.py", line 406, in open
    response = meth(req, response)
  File "C:\Python27\lib\urllib2.py", line 519, in http_response
    'http', request, response, code, msg, hdrs)
  File "C:\Python27\lib\urllib2.py", line 444, in error
    return self._call_chain(*args)
  File "C:\Python27\lib\urllib2.py", line 378, in _call_chain
    result = func(*args)
  File "C:\Python27\lib\urllib2.py", line 527, in http_error_default
    raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
HTTPError: HTTP Error 403: Forbidden
In a browser you can try to resolve a 403 error by refreshing the page, rechecking the URL, clearing cookies, or checking your credentials. The HTTP 403 Forbidden status code indicates that the server understood the request but refuses to authorize it. It is similar to 401 Unauthorized, except that with 403 Forbidden re-authenticating makes no difference.
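You can see the same refusal from code by catching urllib2.HTTPError, which carries the status code and headers of the rejected response. A minimal sketch against the Wikipedia URL from the question:
import urllib2
site = "http://en.wikipedia.org/wiki/StackOverflow"
try:
    page = urllib2.urlopen(site)
except urllib2.HTTPError as e:
    # e.code is the HTTP status (403 here), e.msg the reason phrase,
    # and e.hdrs the response headers the server sent back
    print e.code, e.msg
    print e.hdrs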
To fix it, add a User-Agent header to the current code:
import urllib2, sys
from bs4 import BeautifulSoup
site = "http://en.wikipedia.org/wiki/StackOverflow"
hdr = {'User-Agent': 'Mozilla/5.0'}  # pretend to be a regular browser
req = urllib2.Request(site, headers=hdr)
page = urllib2.urlopen(req)
soup = BeautifulSoup(page)
print soup
For Python 3, the equivalent is:
from bs4 import BeautifulSoup
from urllib.request import Request, urlopen
site = "http://en.wikipedia.org/wiki/StackOverflow"
hdr = {'User-Agent': 'Mozilla/5.0'}
req = Request(site, headers=hdr)
page = urlopen(req)
soup = BeautifulSoup(page)
print(soup)
Alternatively, fetch the page through a real browser with Selenium (here using the PhantomJS headless driver):
from selenium import webdriver
browser = webdriver.PhantomJS()
browser.get("http://en.wikipedia.org/wiki/StackOverflow")
assert "Stack Overflow - Wikipedia" in browser.title
The reason the modified version works is that Wikipedia checks whether the User-Agent belongs to a popular browser and returns 403 Forbidden to the default User-Agent sent by urllib2.
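The same behaviour is easy to demonstrate with the third-party requests library (an assumption here: requests is installed; it is not part of the standard library). The only difference between the two calls below is the User-Agent header:
import requests
site = "http://en.wikipedia.org/wiki/StackOverflow"
default = requests.get(site)  # sent with requests' default User-Agent
spoofed = requests.get(site, headers={'User-Agent': 'Mozilla/5.0'})
print(default.status_code)  # 403 if Wikipedia rejects the stock User-Agent
print(spoofed.status_code)  # 200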