Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

In Python 3.2, I can open and read an HTTPS web page with http.client, but urllib.request is failing to open the same page

I want to open and read https://yande.re/ with urllib.request, but I'm getting an SSL error. I can open and read the page just fine using http.client with this code:

import http.client

conn = http.client.HTTPSConnection('www.yande.re')
conn.request('GET', 'https://yande.re/')
resp = conn.getresponse()
data = resp.read()

However, the following code using urllib.request fails:

import urllib.request

opener = urllib.request.build_opener()
resp = opener.open('https://yande.re/')
data = resp.read()

It gives me the following error: ssl.SSLError: [Errno 1] _ssl.c:392: error:1411809D:SSL routines:SSL_CHECK_SERVERHELLO_TLSEXT:tls invalid ecpointformat list. Why can I open the page with HTTPSConnection but not opener.open?

Edit: Here's my OpenSSL version and the traceback from trying to open https://yande.re/

>>> import ssl; ssl.OPENSSL_VERSION
'OpenSSL 1.0.0a 1 Jun 2010'
>>> import urllib.request
>>> urllib.request.urlopen('https://yande.re/')
Traceback (most recent call last):
  File "<pyshell#3>", line 1, in <module>
    urllib.request.urlopen('https://yande.re/')
  File "C:\Python32\lib\urllib\request.py", line 138, in urlopen
    return opener.open(url, data, timeout)
  File "C:\Python32\lib\urllib\request.py", line 369, in open
    response = self._open(req, data)
  File "C:\Python32\lib\urllib\request.py", line 387, in _open
    '_open', req)
  File "C:\Python32\lib\urllib\request.py", line 347, in _call_chain
    result = func(*args)
  File "C:\Python32\lib\urllib\request.py", line 1171, in https_open
    context=self._context, check_hostname=self._check_hostname)
  File "C:\Python32\lib\urllib\request.py", line 1138, in do_open
    raise URLError(err)
urllib.error.URLError: <urlopen error [Errno 1] _ssl.c:392: error:1411809D:SSL routines:SSL_CHECK_SERVERHELLO_TLSEXT:tls invalid ecpointformat list>
>>> 
like image 655
user1406902 Avatar asked May 21 '12 01:05

user1406902


People also ask

Does Python support https?

HTTPS support is only available if Python was compiled with SSL support (through the ssl module).

Why do we use Urllib in Python?

Urllib package is the URL handling module for python. It is used to fetch URLs (Uniform Resource Locators). It uses the urlopen function and is able to fetch URLs using a variety of different protocols.


1 Answers

What a coincidence! I'm having the same problem as you are, with an added complication: I'm behind a proxy. I found this bug report regarding https-not-working-with-urllib. Luckily, they posted a workaround.

import urllib.request
import ssl

##uncomment this code if you're behind a proxy
##https port is 443 but it doesn't work for me, used port 80 instead

##proxy_auth = '{0}://{1}:{2}@{3}'.format('https', 'username', 'password', 
##             'proxy:80')
##proxies = { 'https' : proxy_auth }
##proxy = urllib.request.ProxyHandler(proxies)
##proxy_auth_handler = urllib.request.HTTPBasicAuthHandler()
##opener = urllib.request.build_opener(proxy, proxy_auth_handler, 
##                                     https_sslv3_handler)

https_sslv3_handler = 
         urllib.request.HTTPSHandler(context=ssl.SSLContext(ssl.PROTOCOL_SSLv3))
opener = urllib.request.build_opener(https_sslv3_handler)
urllib.request.install_opener(opener)
resp = opener.open('https://yande.re/')
data = resp.read().decode('utf-8')
print(data)

Btw, thanks for showing how to use http.client. I didn't know that there's another library that can be used to connect to the internet. ;)

like image 176
Annie Lagang Avatar answered Sep 18 '22 15:09

Annie Lagang