I'm building a WebClient library and am now implementing a proxy feature. While researching, I saw some code that uses the CONNECT method to request a URL.
But when I check in my web browser, it doesn't use the CONNECT method; it calls the GET method instead.
So I'm confused. When should I use each method?
The HTTP CONNECT method starts two-way communications with the requested resource. It can be used to open a tunnel. For example, the CONNECT method can be used to access websites that use SSL (HTTPS). The client asks an HTTP Proxy server to tunnel the TCP connection to the desired destination.
The most common form of HTTP tunneling is the standardized HTTP CONNECT method. In this mechanism, the client asks an HTTP proxy server to forward the TCP connection to the desired destination. The server then proceeds to make the connection on behalf of the client.
An HTTPS proxy can't cache anything because it doesn't see the request sent to the server. With an HTTPS proxy you have a channel to the server, and the client receives and validates the server's certificate (and optionally vice versa). An HTTP proxy, on the other hand, sees and has control over the request it receives from the client.
connect-proxy is a simple relaying command for making network connections via SOCKS and HTTPS proxies. It is mainly intended to be used as a proxy command for OpenSSH. You can make an SSH session beyond the firewall with this command. Features of connect-proxy include support for SOCKS (version 4/4a/5) and the HTTPS CONNECT method.
Web tunnels let you send non-HTTP traffic through HTTP connections, allowing other protocols to piggyback on top of HTTP. The most common reason to use web tunnels is to embed non-HTTP traffic inside an HTTP connection, so it can be sent through firewalls that allow only web traffic.
TL;DR: a web client uses CONNECT only when it knows it is talking to a proxy and the final URI begins with https://.
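As a rough illustration of that rule, here is a minimal Python sketch that picks the first request line a client would send. The function name first_request_line and the via_proxy flag are made up for this example, and it ignores details such as proxy authentication and URL fragments.

```python
from urllib.parse import urlsplit


def first_request_line(url, via_proxy):
    """Return the first HTTP request line a client would send (sketch only)."""
    parts = urlsplit(url)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query

    if not via_proxy:
        # Talking to the origin server directly: plain GET with a path,
        # sent inside TLS when the scheme is https.
        return f"GET {path} HTTP/1.1"

    if parts.scheme == "https":
        # Proxy + https: ask the proxy for a raw tunnel to host:port.
        port = parts.port or 443
        return f"CONNECT {parts.hostname}:{port} HTTP/1.1"

    # Proxy + plain http: GET with the absolute URI, which the proxy can read.
    return f"GET {parts.scheme}://{parts.netloc}{path} HTTP/1.1"
```

This also matches what you saw in the browser: without a proxy configured it never needs CONNECT, so you only see GET.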
When a browser says:
CONNECT www.google.com:443 HTTP/1.1
it means:
Hi proxy, please open a raw TCP connection to Google; any bytes I write from now on, just repeat them over that connection without any interpretation. Oh, and one more thing: do that only if you talk to Google directly. If you go through another proxy yourself, just send it the same CONNECT instead.
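To make the exchange concrete, here is a minimal Python sketch of the client side using only the standard socket module. The helper names (build_connect_request, tunnel_established, open_tunnel) are invented for this example, and it assumes a proxy that needs no authentication.

```python
import socket


def build_connect_request(host, port):
    """Build the bytes of a CONNECT request for host:port."""
    target = f"{host}:{port}"
    return f"CONNECT {target} HTTP/1.1\r\nHost: {target}\r\n\r\n".encode("ascii")


def tunnel_established(response_head):
    """True if the proxy answered with a 2xx status, i.e. the tunnel is open."""
    status_line = response_head.split(b"\r\n", 1)[0]
    parts = status_line.split(None, 2)
    if len(parts) < 2:
        return False
    code = parts[1]
    return len(code) == 3 and code.isdigit() and code.startswith(b"2")


def open_tunnel(proxy_addr, host, port, timeout=10.0):
    """Connect to the proxy at proxy_addr (host, port) and request a tunnel."""
    sock = socket.create_connection(proxy_addr, timeout=timeout)
    try:
        sock.sendall(build_connect_request(host, port))
        head = b""
        # Read until the end of the proxy's response headers. This assumes
        # the target does not speak first, which holds for TLS.
        while b"\r\n\r\n" not in head:
            chunk = sock.recv(4096)
            if not chunk:
                raise ConnectionError("proxy closed the connection")
            head += chunk
        if not tunnel_established(head):
            raise ConnectionError("proxy refused CONNECT: %r" % head[:80])
        return sock  # from here on, bytes are relayed to host:port as-is
    except Exception:
        sock.close()
        raise
```

After open_tunnel returns, the socket behaves like a direct TCP connection to the target, as far as the client is concerned.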
Note how this says nothing about TLS (https). In fact, CONNECT is orthogonal to TLS: you can have one, the other, or both.
That being said, the intent of CONNECT is to allow an end-to-end encrypted TLS session, so the data is unreadable to the proxy (or to a whole proxy chain). It works even if a proxy doesn't understand TLS at all, because CONNECT can be issued over plain HTTP and requires nothing more from the proxy than copying raw bytes around.
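In Python terms, the end-to-end part could look roughly like the sketch below. It assumes you already have a connected socket on which the proxy has accepted a CONNECT; the function names are made up for this example.

```python
import ssl


def end_to_end_tls_context():
    """TLS settings for the client-to-origin session (library defaults)."""
    # The defaults verify the certificate chain and the host name, so the
    # certificate being checked is the final server's, not the proxy's.
    return ssl.create_default_context()


def start_tls_over_tunnel(tunnel_sock, host):
    """Run the TLS handshake with `host` through an established tunnel.

    `tunnel_sock` is assumed to be a socket on which the proxy has already
    answered CONNECT with a 2xx status. The proxy only relays the bytes.
    """
    ctx = end_to_end_tls_context()
    return ctx.wrap_socket(tunnel_sock, server_hostname=host)
```

The socket returned by start_tls_over_tunnel is where you would then write the ordinary GET request.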
The connection to the first proxy can itself use TLS (https), although that means the traffic between you and the first proxy is encrypted twice.
Obviously, it makes no sense to use CONNECT when talking directly to the final server. You just start talking TLS and then issue an HTTP GET. End servers normally disable CONNECT altogether.
For a proxy, CONNECT support adds security risks. Any data can be passed through CONNECT, even an SSH hacking attempt against a server on 192.168.1.*, or SMTP traffic sending spam. The outside world sees these attacks as regular TCP connections initiated by the proxy. It doesn't care about the reason and cannot check whether HTTP CONNECT is to blame. Hence it's up to proxies to secure themselves against misuse.
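One possible shape for such a safeguard, sketched in Python: only allow tunnels to port 443 and refuse targets given as private, loopback, or link-local IP literals. The function name connect_allowed and the default policy are assumptions for illustration, not taken from any particular proxy. A real proxy would also have to apply the same check to the addresses a host name resolves to, which this sketch does not do.

```python
import ipaddress


def connect_allowed(host, port, allowed_ports=frozenset({443})):
    """Decide whether a CONNECT to host:port should be tunneled (sketch)."""
    if port not in allowed_ports:
        # Blocks SSH (22), SMTP (25), and anything else not explicitly allowed.
        return False
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal. Assumption: DNS results are checked elsewhere.
        return True
    # Refuse internal targets such as 192.168.1.* or 127.0.0.1.
    return not (addr.is_private or addr.is_loopback or addr.is_link_local)
```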