I do not know the proper name for this kind of proxy server, so feel free to fix my question title.
When I search for "proxy server" on Google, I find many implementations, such as maproxy or a-python-proxy-in-less-than-100-lines-of-code. Those proxy servers seem to simply ask the remote server for a given URL.
I want to build a proxy server that holds a proxy pool (a list of HTTP/HTTPS proxies) and exposes only one IP address and one port for incoming requests. When a request arrives, it should choose a proxy from the pool, send the request through it, and return the result.
For example, I have a VPS whose IP is '192.168.1.66'. I start the proxy server on this VPS with IP '127.0.0.1' and port '8080'.
I can then use this proxy as shown below.
import requests
url = 'http://www.google.com'
headers = {
    ...
}
proxies = {
    'http': 'http://192.168.1.66:8080'
}
r = requests.get(url, headers=headers, proxies=proxies)
I have seen some implementations like this:
from twisted.web import proxy, http
from twisted.internet import reactor
from twisted.python import log
import sys
log.startLogging(sys.stdout)
class ProxyFactory(http.HTTPFactory):
    protocol = proxy.Proxy
reactor.listenTCP(8080, ProxyFactory())
reactor.run()
It works, but it is so simple that I do not understand how it works or how to extend it to use a proxy pool.
The following diagram is from hidu/proxy-manager, which is written in Go.
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+  client (want visit http://www.baidu.com/)             +
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
                        |
                        |  via proxy 127.0.0.1:8090
                        |
                        V
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+                       +         proxy pool             +
+ proxy manager listen  ++++++++++++++++++++++++++++++++++
+ on (127.0.0.1:8090)   +  http_proxy1,http_proxy2,      +
+                       +  socks5_proxy1,socks5_proxy2   +
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
                        |
                        |  choose one proxy visit
                        |  www.baidu.com
                        |
                        V
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+         site:www.baidu.com                             +
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
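The flow in the diagram could be sketched with only the Python standard library, roughly as follows. This is a minimal sketch, not how hidu/proxy-manager works internally: the pool addresses and the names `PROXY_POOL`, `choose_proxy`, `fetch_via`, `PoolProxyHandler` and `serve` are made up for illustration. It assumes plain HTTP GET requests only (no HTTPS `CONNECT`, no forwarding of client headers) and turns any upstream failure, including HTTP error statuses, into a 502.

```python
import random
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Hypothetical upstream proxies; replace with real addresses.
PROXY_POOL = [
    "http://10.0.0.1:3128",
    "http://10.0.0.2:3128",
]


def choose_proxy(pool=PROXY_POOL):
    """Pick one upstream proxy at random."""
    return random.choice(pool)


def fetch_via(proxy_url, target_url, timeout=10):
    """Fetch target_url through proxy_url; return (status, headers, body)."""
    opener = urllib.request.build_opener(
        urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    )
    with opener.open(target_url, timeout=timeout) as resp:
        return resp.status, resp.headers, resp.read()


class PoolProxyHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # A client that uses this server as its HTTP proxy sends the
        # absolute URL in the request line, so self.path is the target URL.
        try:
            status, headers, body = fetch_via(choose_proxy(), self.path)
        except Exception:
            self.send_error(502, "Upstream proxy failed")
            return
        self.send_response(status)
        self.send_header(
            "Content-Type",
            headers.get("Content-Type", "application/octet-stream"),
        )
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(host="0.0.0.0", port=8080):
    """Listen on a single address and port, as in the diagram."""
    ThreadingHTTPServer((host, port), PoolProxyHandler).serve_forever()
```

Calling `serve()` on the VPS would then let a client point `proxies = {'http': 'http://192.168.1.66:8080'}` at it, as in the requests example earlier. The whole response is buffered in memory here, which keeps the sketch short but would not suit large downloads.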
Your proxy pool concept is not hard to implement. If I understand correctly, you want to build the following.
So, I've written a simple proxy server using Flask and Requests.
import random

import requests
from flask import Flask, Response, stream_with_context

app = Flask(__name__)


@app.route('/p/<path:url>')
def proxy(url):
    """Request this like /p/www.google.com"""
    r = get_response(url)
    # Stream the upstream body back to the client in chunks.
    return Response(stream_with_context(r.iter_content(chunk_size=8192)),
                    content_type=r.headers.get('content-type'))


def get_proxy():
    # This is your "Proxy Pool"
    proxies = [
        'http://proxy-server-1.com',
        'http://proxy-server-2.com',
        'http://proxy-server-3.com',
    ]
    return random.choice(proxies)


def get_response(target_url):
    proxy = get_proxy()
    # Each pool entry is assumed to expose the same /p/<url> endpoint.
    url = "{}/p/{}".format(proxy, target_url)
    # The line above generates e.g. http://proxy-server-1.com/p/www.google.com
    return requests.get(url, stream=True)


if __name__ == '__main__':
    app.run()
Then, you can start from here to improve your proxy server. A typical proxy pool, or proxy manager, can check the availability, speed, and other stats of its proxies, and select the best proxy to send each request through. Of course, this example handles only simple requests; you can add features to handle request arguments, methods, and protocols.
Hope this helps!