From the other posts on Stack Overflow, this should be working:
import socket
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect(("www.cnn.com" , 80))
s.sendall("GET / HTTP/1.1\r\n")
print s.recv(4096)
s.close()
but for some reason it just hangs (at recv) and never prints. I know that a request to www.cnn.com will chunk its data, but I should at least read something from recv, right?
P.S. I know this isn't the best way to do it and that there are libraries like httplib and urllib2 out there, but I can't use those for this project (it's for school). I have to use the socket library.
You forgot to send a blank line after your request line:
s.sendall("GET / HTTP/1.1\r\n\r\n")
Furthermore, HTTP 1.1 requires the Host header field in every request, as documented in the Host section of the HTTP 1.1 RFC:
s.sendall("GET / HTTP/1.1\r\nHost: www.cnn.com\r\n\r\n")
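To make the line endings harder to get wrong, the request can be assembled from a list of lines. This is only a minimal sketch: build_get_request is a hypothetical helper name, and the Connection: close header is an extra assumption added here so that the server closes the socket after responding, which lets a read loop terminate.

```python
def build_get_request(host, path="/"):
    """Return the bytes of a minimal HTTP/1.1 GET request (illustrative helper)."""
    lines = [
        "GET {} HTTP/1.1".format(path),
        "Host: {}".format(host),   # required by HTTP/1.1
        "Connection: close",       # assumption: ask the server to close when done
        "",                        # together with the next entry, produces the
        "",                        # final \r\n\r\n (empty line ending the headers)
    ]
    return "\r\n".join(lines).encode("ascii")


request = build_get_request("www.cnn.com")
print(repr(request))
```

The result can be passed straight to s.sendall(request) on a connected socket.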
Your code is almost right, but you need to send 2 \r\n sequences to satisfy the HTTP protocol: one ends the request line, and the second is the empty line that ends the headers.
A valid GET request looks like this (note 2 lines: the request line, then an empty line):
GET / HTTP/1.1
So your code should be:
s.sendall('GET / HTTP/1.1\r\n\r\n')
Further to that, there are additional headers required for valid HTTP 1.1 requests, such as Host:. You need to add them to your request, something like this:
# Use explicit \r\n line endings; a triple-quoted string would send bare \n.
s.sendall('GET / HTTP/1.1\r\nHost: cnn.com\r\n\r\n')
Sorry to waste everyone's time. I just found this solution here on Stack Overflow (it just took some rewording in my Google search to find it):
import socket
request = b"GET / HTTP/1.1\nHost: www.cnn.com\n\n"
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect(("cnn.com", 80))
s.send(request)
result = s.recv(10000)
while len(result) > 0:
    print(result)
    result = s.recv(10000)
All of the answers about the ending \r\n\r\n were right as well; however, those requests returned 301 statuses. This solution seems to follow the redirect somehow? Anyway, this solution worked for me.
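On the 301 question: a raw socket never follows a redirect by itself, so whatever comes back is simply the server's answer to the exact host and request that were sent. One way to see what is going on is to parse the status line and the Location header out of the bytes returned by recv. The sketch below is only illustrative: parse_response_head is a hypothetical helper, and the sample response is made up rather than captured from cnn.com.

```python
def parse_response_head(raw):
    """Split raw response bytes into (status_code, headers dict)."""
    # Everything before the first empty line is the status line plus headers.
    head, _, _ = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    # The status line looks like: HTTP/1.1 301 Moved Permanently
    status_code = int(lines[0].split(" ", 2)[1])
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return status_code, headers


# Made-up sample response, standing in for what s.recv() might return.
sample = (b"HTTP/1.1 301 Moved Permanently\r\n"
          b"Location: http://www.example.com/\r\n"
          b"Content-Length: 0\r\n\r\n")
status, headers = parse_response_head(sample)
print(status, headers.get("location"))
```

If the status is 301 or 302, the Location header says where to send the next request; following it means opening a new connection and sending a new GET yourself.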