 

Django: Bad request syntax when redirected from Apache mod_rewrite

I have Django running on port 8000 and Apache on port 80. I configured the following rewrite rule in Apache to redirect to Django:

RewriteRule ^/?checkout/ http://%{HTTP_HOST}:8000/checkout/ [L,QSA]

If I open the URL in a browser, it works fine and redirects perfectly.

However, the external client (which works well when connecting to Django directly, without Apache) always causes a "Bad request syntax" error on the Django server. Here's a snippet from the Django log. It looks like Apache is automatically appending "Content-Length" and other header text to the query string. Why?

[05/Mar/2014 18:01:35] code 400, message Bad request syntax ('GET /checkout/wx_signature?signature=b226bb8f6e9ce2fdecb752c6808a979c62e235f7&echostr=5987526888415258224&timestamp=1394042480&nonce=1394079741Content-Length: 445Connection: closeContent-Type: text/html; charset=iso-8859-1 HTTP/1.0')
Tom Ding asked Mar 05 '14 18:03


2 Answers

tl;dr: This is caused by a bug in your "external client". It is a badly designed HTTP client and should be avoided: not only does it cause this bug, it may also open up avenues for security exploits.

In order to understand what's happening, you need to work backwards.


First, let's start with the line of log from the Django built-in server:

[05/Mar/2014 18:01:35] code 400, message Bad request syntax ('GET /checkout/wx_signature?signature=b226bb8f6e9ce2fdecb752c6808a979c62e235f7&echostr=5987526888415258224&timestamp=1394042480&nonce=1394079741Content-Length: 445Connection: closeContent-Type: text/html; charset=iso-8859-1 HTTP/1.0')

"code 400" refers to the HTTP status code 400. It means the actual HTTP request is badly constructed and cannot be understood. Luckily, Django logs the erroneous input so we can analyze it.


Now that we understand the nature of the problem, we'll remove the irrelevant log fluff and the long signature to take a deeper look at the actual request:

GET /checkout/wx_signature?[SIGNATURE REMOVED]Content-Length: 445Connection: closeContent-Type: text/html; charset=iso-8859-1 HTTP/1.0

Here we see an invalid first line of an HTTP request.


From RFC2616 Section 5.1:

The Request-Line begins with a method token, followed by the Request-URI and the protocol version, and ending with CRLF. The elements are separated by SP characters. No CR or LF is allowed except in the final CRLF sequence.

   Request-Line   = Method SP Request-URI SP HTTP-Version CRLF

In the invalid request, the HTTP verb GET is present and so is the trailing version, HTTP/1.0, so these are not the problem. The middle part, which is supposed to be the URL, is as follows:

/checkout/wx_signature?[SIGNATURE REMOVED]Content-Length: 445Connection: closeContent-Type: text/html; charset=iso-8859-1

Spaces in the URL are normally replaced with + or %20 before being sent to the server. As you can see, this is not the case here, and it is the cause of the invalid request: the extra spaces mean the line can no longer be split into exactly a method, a URL, and a version. A good HTTP client would never have done this, as it would automatically escape the URL. This is a red flag that the "external client" you are using is of poor quality.
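To make the request-line rule concrete, here is a minimal sketch of a parser that follows the grammar quoted above. It is not Django's actual code; the function name parse_request_line and the shortened query string are made up for illustration. It only shows why unescaped spaces inside the URL make the line unparseable.

```python
def parse_request_line(line):
    """Split an HTTP request line into (method, target, version).

    Hypothetical helper, not Django's implementation. It raises
    ValueError when the line does not consist of exactly three
    space-separated parts, as the Request-Line grammar requires.
    """
    parts = line.rstrip("\r\n").split(" ")
    if len(parts) != 3:
        raise ValueError("Bad request syntax (%r)" % line)
    method, target, version = parts
    return method, target, version


# A well-formed request line (query string shortened for readability).
good = "GET /checkout/wx_signature?nonce=1394079741 HTTP/1.0\r\n"

# The shape of the line from the log: header text glued onto the URL,
# bringing unescaped spaces with it.
bad = ("GET /checkout/wx_signature?nonce=1394079741"
       "Content-Length: 445Connection: close"
       "Content-Type: text/html; charset=iso-8859-1 HTTP/1.0\r\n")

print(parse_request_line(good))
try:
    parse_request_line(bad)
except ValueError as exc:
    print("rejected:", exc)
```

The second call is rejected because splitting on spaces yields seven pieces instead of three.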


Notice that the spaces appear alongside a number of strange-looking fields.

If you look at RFC2616 Section 14.13, you'll see that Content-Length is actually the name of an HTTP/1.1 header. This is also the case for Connection and Content-Type.

They obviously do not belong there, so why were they concatenated with the URL?

From here I can only make guesses since I don't have access to your code. However, I think I have a pretty good idea of what's happening.


Let's understand the nature of HTTP headers for a moment. We'll send a raw request to emulate what happens if we visit "http://google.com". This will trigger Google to redirect us to "http://www.google.com".

Raw Request:

GET / HTTP/1.1
Host: google.com

Raw Response:

HTTP/1.1 301 Moved Permanently
Location: http://www.google.com/
Content-Type: text/html; charset=UTF-8
Date: Thu, 15 May 2014 21:28:46 GMT
Expires: Sat, 14 Jun 2014 21:28:46 GMT
Cache-Control: public, max-age=2592000
Server: gws
Content-Length: 219
X-XSS-Protection: 1; mode=block
X-Frame-Options: SAMEORIGIN
Alternate-Protocol: 80:quic

[HTML content removed]

Whoa, Google returned a whole bunch of headers! We're only interested in the first few lines though:

HTTP/1.1 301 Moved Permanently
Location: http://www.google.com/
Content-Type: text/html; charset=UTF-8
...
Content-Length: 219

Here you can see that Content-Type, Content-Length, and other headers follow the Location header. The actual order doesn't matter normally since the HTTP client or server is smart enough to understand what each one means. However, what if you strip the line endings after the Location header?

You'll end up with something like this:

HTTP/1.1 301 Moved Permanently
Location: http://www.google.com/Content-Type: text/html; charset=UTF-8Content-Length: 219

Uh-oh... if you are an HTTP client, you'll think that I want to redirect you to http://www.google.com/Content-Type: text/html; charset=UTF-8Content-Length: 219.


This looks exactly like the symptom you have... but why does that happen?

It's very unlikely that Apache returned the header in this corrupted form (unless you custom coded a plugin or something to that effect).

It's also unlikely your "external client" purposely stripped the line endings in the header after receiving it.

A likely case is that the "external client" was coded to interpret everything after Location: and before the content as the URL, and somewhere afterwards strips the CRLF characters (this is commonly done for security reasons when dealing with HTTP headers; ironically, it is done incorrectly in this case). The fact that the client sends the request with HTTP/1.0 instead of HTTP/1.1 supports this, because HTTP/1.0 clients are usually very limited in terms of features and tend to make heavy assumptions based on their obsolete knowledge.

It's also probable that your "external client" read the entire header after the request line into a string and the string handler automatically stripped the CRLFs.
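As an illustration of that guess, the sketch below reproduces the symptom. Everything in it is an assumption: the response is a hand-written stand-in for Apache's redirect (the 302 status, the example.com host, and the shortened query string are not from your setup), and naive_location is only my guess at what the client might be doing. correct_location shows a line-by-line parse for comparison.

```python
# Hand-written stand-in for the redirect Apache might send (assumed values).
RAW_RESPONSE = (
    "HTTP/1.1 302 Found\r\n"
    "Location: http://example.com:8000/checkout/wx_signature?nonce=1394079741\r\n"
    "Content-Length: 445\r\n"
    "Connection: close\r\n"
    "Content-Type: text/html; charset=iso-8859-1\r\n"
    "\r\n"
    "<html>...</html>"
)


def naive_location(raw):
    """Buggy guess at the client's logic: take everything between
    'Location: ' and the body, then strip CR and LF characters."""
    head = raw.split("\r\n\r\n", 1)[0]
    after_location = head.split("Location: ", 1)[1]
    return after_location.replace("\r", "").replace("\n", "")


def correct_location(raw):
    """Parse the header block one line at a time and return only the
    value of the Location header, or None if it is absent."""
    head = raw.split("\r\n\r\n", 1)[0]
    for line in head.split("\r\n")[1:]:  # skip the status line
        name, _, value = line.partition(":")
        if name.strip().lower() == "location":
            return value.strip()
    return None


print(naive_location(RAW_RESPONSE))
print(correct_location(RAW_RESPONSE))
```

The naive version returns the URL with "Content-Length: 445Connection: closeContent-Type: text/html; charset=iso-8859-1" glued to the end, which has the same shape as the request in your Django log. The line-by-line version returns only the URL.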


I think it's pretty clear that the problem rests in the "external client", though there's not enough information to dig into it.

I suggest you use a different client or library to do the request.

user193130 answered Oct 17 '22 00:10


This message also seems to appear when you use an HTTPS URL with Django. You might also need to set up HTTPS in Apache2; see this question for an example: SSL based virtual host with django and mod_wsgi

Maxime Lorant answered Oct 17 '22 00:10