I'm looking for a way to script a transparent forward proxy, such as the ones that users point their browsers to in their proxy settings.
I've discovered a distinct tradeoff in forward proxies between scriptability and robustness. For example, there are countless proxies developed in Ruby and Python that allow you to inspect each request and response and log, modify, or filter at will. However, these either fail to proxy everything needed or crash after 20 minutes of use.
On the other hand, I suspect that Squid and Apache are quite robust and stable, but for the life of me I can't determine how to develop dynamic behavior through scripting. Ultimately I would like to set quotas and dynamically filter on those quotas. Part of me feels that mixing mod_proxy and mod_perl could allow interesting dynamic proxies, but it's hard to know where to begin or whether it's even possible.
Please advise.
Squid and Apache both have mechanisms to call external scripts for per-request allow/deny decisions. This lets you rely on either one as the proxy engine while calling your external script on each request for processing of arbitrary complexity. Your code only has to manage the business logic, not the heavy lifting.
In Apache, I've never used mod_proxy in this way, but I have used mod_rewrite. mod_rewrite also allows you to proxy requests. The RewriteMap directive allows you to pass the decision to an external script:
MapType: prg, MapSource: Unix filesystem path to valid regular file
Here the source is a program, not a map file. To create it you can use a language of your choice, but the result has to be an executable program (either object-code or a script with the magic cookie trick '#!/path/to/interpreter' as the first line).
This program is started once, when the Apache server is started, and then communicates with the rewriting engine via its stdin and stdout file-handles. For each map-function lookup it will receive the key to lookup as a newline-terminated string on stdin. It then has to give back the looked-up value as a newline-terminated string on stdout or the four-character string ``NULL'' if it fails (i.e., there is no corresponding value for the given key).
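The stdin/stdout protocol quoted above can be sketched as a small Python map program. This is only a minimal sketch under stated assumptions: the names lookup, serve, and REQUEST_LIMIT are made up for illustration, the key is assumed to be something like a client IP, and the "quota" is a simple in-memory request count that resets whenever Apache restarts the program.

```python
#!/usr/bin/env python3
"""Sketch of a RewriteMap "prg:" helper: one key per line in, one value per line out."""
import sys
from collections import Counter

# Hypothetical policy: each key (e.g. a client IP) is allowed this many lookups.
REQUEST_LIMIT = 3


def lookup(key, counts, limit=REQUEST_LIMIT):
    """Return "allow" or "deny" for a key, or "NULL" when there is no key."""
    if not key:
        return "NULL"
    counts[key] += 1
    return "allow" if counts[key] <= limit else "deny"


def serve(stdin, stdout, limit=REQUEST_LIMIT):
    """Answer lookups until stdin closes, flushing after every reply."""
    counts = Counter()
    for line in stdin:
        key = line.strip()
        stdout.write(lookup(key, counts, limit) + "\n")
        # Without a flush the reply sits in a buffer and the rewrite
        # engine keeps waiting for its newline-terminated answer.
        stdout.flush()


def main():
    """Entry point for the deployed helper: talk to Apache over stdin/stdout."""
    serve(sys.stdin, sys.stdout)
```

To run it as the actual map program, the file would end with the usual guard (if __name__ == "__main__": main()) and be made executable. Assuming it is saved at the hypothetical path /usr/local/bin/quota_map.py, a directive such as RewriteMap quota prg:/usr/local/bin/quota_map.py would start it, and a condition like RewriteCond ${quota:%{REMOTE_ADDR}} =deny in front of a rule carrying the [F] flag would reject clients that are over the limit.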
With Squid, you can get similar functionality via the external_acl_type directive:
This tag defines how the external acl classes using a helper program should look up the status.
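A helper for external_acl_type follows the same pattern: Squid writes one line per lookup and expects a line beginning with OK or ERR back. The sketch below assumes the helper is configured with a format that sends only the client address (for example %SRC) and with no concurrency, so there is no channel ID in front of each line. WindowQuota, serve, and the limits are hypothetical names and values chosen for illustration, not anything Squid provides.

```python
#!/usr/bin/env python3
"""Sketch of a Squid external ACL helper enforcing a per-client request quota."""
import sys
import time
from collections import defaultdict, deque

# Hypothetical quota: at most MAX_REQUESTS per client in any WINDOW_SECONDS.
WINDOW_SECONDS = 60
MAX_REQUESTS = 100


class WindowQuota:
    """Sliding-window request counter kept in memory, one deque per client."""

    def __init__(self, max_requests=MAX_REQUESTS, window=WINDOW_SECONDS,
                 clock=time.monotonic):
        self.max_requests = max_requests
        self.window = window
        self.clock = clock  # injectable so the logic can be tested without sleeping
        self.hits = defaultdict(deque)

    def allow(self, client):
        """Record a request and return True if the client is still within quota."""
        now = self.clock()
        recent = self.hits[client]
        # Forget requests that have aged out of the window.
        while recent and now - recent[0] >= self.window:
            recent.popleft()
        if len(recent) >= self.max_requests:
            return False
        recent.append(now)
        return True


def serve(stdin, stdout, quota):
    """Reply OK or ERR to each lookup line until Squid closes stdin."""
    for line in stdin:
        fields = line.split()
        # The first field is assumed to be the client address (%SRC).
        ok = bool(fields) and quota.allow(fields[0])
        stdout.write("OK\n" if ok else "ERR\n")
        stdout.flush()  # Squid blocks on the reply, so never leave it buffered


def main():
    """Entry point for the deployed helper."""
    serve(sys.stdin, sys.stdout, WindowQuota())
```

As with the Apache script, the deployed file would end with if __name__ == "__main__": main(). In squid.conf, something along the lines of external_acl_type quota_check ttl=0 %SRC /usr/local/bin/quota_helper.py, followed by acl within_quota external quota_check and http_access deny !within_quota, would wire it in; the path and ACL names are placeholders. Setting ttl=0 is a design choice here, since cached answers would otherwise hide requests from the counter.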
g'luck!
I've been working on an HTTP library in Python, written with proxy servers specifically in mind as a use case. It isn't very mature at this point (it certainly needs more testing, including unit tests), but it's complete enough that I find it useful. I don't know whether it would meet any of your needs.
The library is called httpmessage; its Google Code site is found here. There is an example of writing a proxy server on the examples page.
I'm happy to receive feedback and/or bug fixes.