I need to write a script that connects to a bunch of sites on our corporate intranet over HTTPS and verifies that their SSL certificates are valid; that they are not expired, that they are issued for the correct address, etc. We use our own internal corporate Certificate Authority for these sites, so we have the public key of the CA to verify the certificates against.
Python by default just accepts and uses SSL certificates when using HTTPS, so even if a certificate is invalid, Python libraries such as urllib2 and Twisted will just happily use the certificate.
Is there a good library somewhere that will let me connect to a site over HTTPS and verify its certificate in this way?
How do I verify a certificate in Python?
Use pyOpenSSL. You can also access additional components, e.g. organisation ( subject. O / issuer. O ), organisational unit ( subject.
I have added a distribution to the Python Package Index which makes the match_hostname()
function from the Python 3.2 ssl
package available on previous versions of Python.
http://pypi.python.org/pypi/backports.ssl_match_hostname/
You can install it with:
pip install backports.ssl_match_hostname
Or you can make it a dependency listed in your project's setup.py
. Either way, it can be used like this:
from backports.ssl_match_hostname import match_hostname, CertificateError ... sslsock = ssl.wrap_socket(sock, ssl_version=ssl.PROTOCOL_SSLv3, cert_reqs=ssl.CERT_REQUIRED, ca_certs=...) try: match_hostname(sslsock.getpeercert(), hostname) except CertificateError, ce: ...
You can use Twisted to verify certificates. The main API is CertificateOptions, which can be provided as the contextFactory
argument to various functions such as listenSSL and startTLS.
Unfortunately, neither Python nor Twisted comes with a the pile of CA certificates required to actually do HTTPS validation, nor the HTTPS validation logic. Due to a limitation in PyOpenSSL, you can't do it completely correctly just yet, but thanks to the fact that almost all certificates include a subject commonName, you can get close enough.
Here is a naive sample implementation of a verifying Twisted HTTPS client which ignores wildcards and subjectAltName extensions, and uses the certificate-authority certificates present in the 'ca-certificates' package in most Ubuntu distributions. Try it with your favorite valid and invalid certificate sites :).
import os import glob from OpenSSL.SSL import Context, TLSv1_METHOD, VERIFY_PEER, VERIFY_FAIL_IF_NO_PEER_CERT, OP_NO_SSLv2 from OpenSSL.crypto import load_certificate, FILETYPE_PEM from twisted.python.urlpath import URLPath from twisted.internet.ssl import ContextFactory from twisted.internet import reactor from twisted.web.client import getPage certificateAuthorityMap = {} for certFileName in glob.glob("/etc/ssl/certs/*.pem"): # There might be some dead symlinks in there, so let's make sure it's real. if os.path.exists(certFileName): data = open(certFileName).read() x509 = load_certificate(FILETYPE_PEM, data) digest = x509.digest('sha1') # Now, de-duplicate in case the same cert has multiple names. certificateAuthorityMap[digest] = x509 class HTTPSVerifyingContextFactory(ContextFactory): def __init__(self, hostname): self.hostname = hostname isClient = True def getContext(self): ctx = Context(TLSv1_METHOD) store = ctx.get_cert_store() for value in certificateAuthorityMap.values(): store.add_cert(value) ctx.set_verify(VERIFY_PEER | VERIFY_FAIL_IF_NO_PEER_CERT, self.verifyHostname) ctx.set_options(OP_NO_SSLv2) return ctx def verifyHostname(self, connection, x509, errno, depth, preverifyOK): if preverifyOK: if self.hostname != x509.get_subject().commonName: return False return preverifyOK def secureGet(url): return getPage(url, HTTPSVerifyingContextFactory(URLPath.fromString(url).netloc)) def done(result): print 'Done!', len(result) secureGet("https://google.com/").addCallback(done) reactor.run()
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With