I'm just starting out with Python web data in Python 3.6.1. I was learning sockets and I had a problem with my code which I couldn't figure out. The website in my code works fine, but when I run this code I get a 400 Bad Request error. I am not really sure what the problem with my code is. Thanks in advance.
import socket
mysock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
mysock.connect(('data.pr4e.org', 80))
mysock.send(('GET http://data.pr4e.org/romeo.txt HTTP/1.0 \n\n').encode())
while True:
    data = mysock.recv(512)
    if ( len(data) < 1 ):
        break
    print (data)
mysock.close()
GET http://data.pr4e.org/romeo.txt HTTP/1.0 \n\n
Welcome to the wonderful world of HTTP, where most users think it is an easy protocol because it is human readable, when in fact it can be very complex. Given your request above, there are several problems:
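- The path should not be a full URL but only /romeo.txt. Full URLs are used only when making a request to a proxy.
- The line ending must be \r\n, not \n.
- There should be no space after HTTP/1.0 before the end of the line.
With this in mind, the data you send should instead be: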
GET /romeo.txt HTTP/1.0\r\nHost: data.pr4e.org\r\n\r\n
And I've tested that it works perfectly with this modification.
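For illustration, here is a minimal sketch of the question's script with the corrected request. The helper names build_get_request and fetch are made up for this example, and the 10-second timeout is an assumption; the host, path, and request bytes are the ones shown above.

```python
import socket


def build_get_request(host: str, path: str) -> bytes:
    """Build a minimal HTTP/1.0 GET request (hypothetical helper)."""
    lines = [
        f"GET {path} HTTP/1.0",  # path only, no space before the line end
        f"Host: {host}",         # Host header, as in the corrected request
        "",                      # these two empty entries produce the
        "",                      # blank line that ends the header block
    ]
    return "\r\n".join(lines).encode("ascii")


def fetch(host: str, path: str, port: int = 80, timeout: float = 10.0) -> bytes:
    """Send the request and return the raw response (headers and body)."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_get_request(host, path))
        chunks = []
        while True:
            data = sock.recv(512)
            if not data:  # empty bytes means the server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)


if __name__ == "__main__":
    # Needs network access; prints the status line, headers, and body.
    print(fetch("data.pr4e.org", "/romeo.txt").decode(errors="replace"))
```

sendall is used instead of send because send may transmit only part of the buffer.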
But given that HTTP is not as simple as it might look, I really recommend using a library like requests for accessing the target. If this looks like too much overhead for you, please study the HTTP standard and implement it properly instead of guessing how HTTP works from a few examples - and guessing wrong.
Note also that servers differ in how forgiving they are of broken implementations like yours. What once worked with one server might not work with the next server, or even with the same server after a software upgrade. Using a robust, well-tested, and maintained library instead of doing everything on your own might thus save you a lot of trouble later, too.
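As a rough sketch of the library route: the example below uses urllib.request from the standard library rather than requests, so that it runs without installing anything. The function name fetch_text and the 10-second timeout are assumptions made for this example. With requests installed, the rough equivalent would be requests.get(url).text.

```python
import urllib.request


def fetch_text(url: str, timeout: float = 10.0) -> str:
    """Fetch a URL and return the decoded body (hypothetical helper)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        # Use the charset from the response headers if one is declared,
        # otherwise fall back to UTF-8 (an assumption for this sketch).
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset, errors="replace")


if __name__ == "__main__":
    # Needs network access. The library builds the request line, the Host
    # header, and the line endings for you.
    print(fetch_text("http://data.pr4e.org/romeo.txt"))
```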
'GET http://data.pr4e.org/romeo.txt HTTP/1.0\r\n\r\n'.encode()
works for me.