Creating a raw HTTP request with sockets

I would like to be able to construct a raw HTTP request and send it with a socket. I know the usual advice is to use something like urllib or urllib2, but I do not want to use those.

It would have to look something like this:

import socket

tcpsoc = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
tcpsoc.bind(('72.14.192.58', 80)) #bind to googles ip
tcpsoc.send('HTTP REQUEST')
response = tcpsoc.recv()

Obviously, you would also have to request the page/file and pass GET and POST parameters.

asked Apr 22 '11 12:04

Jacob Valenta

3 Answers

import socket
from urllib.parse import urlparse  # Python 3; in Python 2 this was the `urlparse` module


CONNECTION_TIMEOUT = 5
CHUNK_SIZE = 1024
HTTP_VERSION = '1.0'
CRLF = '\r\n'

socket.setdefaulttimeout(CONNECTION_TIMEOUT)


def receive_all(sock, chunk_size=CHUNK_SIZE):
    '''
    Read from the socket until the server closes the connection.
    '''
    chunks = []
    while True:
        chunk = sock.recv(int(chunk_size))
        if chunk:
            chunks.append(chunk)
        else:
            break

    # recv() returns bytes, so join with a bytes separator
    return b''.join(chunks)


def get(url, **kw):
    kw.setdefault('timeout', CONNECTION_TIMEOUT)
    kw.setdefault('chunk_size', CHUNK_SIZE)
    kw.setdefault('http_version', HTTP_VERSION)
    kw.setdefault('headers_only', False)
    kw.setdefault('response_code_only', False)
    kw.setdefault('body_only', False)
    url = urlparse(url)
    path = url.path or '/'
    if url.query:
        # GET parameters are part of the request target
        path += '?' + url.query
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(kw['timeout'])
    # connect() needs the bare hostname; netloc may also contain ":port"
    sock.connect((url.hostname, url.port or 80))
    # Request line, headers, then an empty line that ends the header block.
    # "Connection: close" makes the server close the socket after the
    # response, which is what receive_all() relies on to know it is done.
    msg = 'GET {0} HTTP/{1}{2}Host: {3}{2}Connection: close{2}{2}'
    request = msg.format(path, kw['http_version'], CRLF, url.netloc)
    sock.sendall(request.encode('ascii'))
    data = receive_all(sock, chunk_size=kw['chunk_size'])
    sock.close()

    data = data.decode(errors='ignore')
    # The first empty line separates the header block from the body
    head, _, body = data.partition(CRLF + CRLF)
    # The first line of the header block is the status line
    status_line, _, headers = head.partition(CRLF)
    response_code = status_line.split()[1]

    if kw['body_only']:
        return body
    if kw['headers_only']:
        return headers
    if kw['response_code_only']:
        return response_code
    return data


print(get('http://www.google.com/'))

answered Sep 30 '22 11:09

Ricky Wilson

Most of what you need to know is in the HTTP/1.1 spec (RFC 2616, since superseded by RFC 9110 and RFC 9112), which you should definitely study if you want to roll your own HTTP implementation: http://www.w3.org/Protocols/rfc2616/rfc2616.html

answered Sep 30 '22 11:09

Kristopher Johnson

Yes, basically you just have to write text, something like:

GET /pageyouwant.html HTTP/1.1[CRLF]
Host: google.com[CRLF]
Connection: close[CRLF]
User-Agent: MyAwesomeUserAgent/1.0.0[CRLF]
Accept-Encoding: gzip[CRLF]
Accept-Charset: ISO-8859-1,UTF-8;q=0.7,*;q=0.7[CRLF]
Cache-Control: no-cache[CRLF]
[CRLF]

Feel free to remove / add headers at will.
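
To make the text above concrete, here is a minimal Python 3 sketch that assembles such a request as bytes, ready for sock.sendall(). The helper name build_request and its parameters are my own invention for illustration, not part of any library. It assumes ASCII-only header values, adds "Connection: close" by default, puts GET parameters in the query string, and sends POST parameters as a URL-encoded form body.

```python
from urllib.parse import urlencode

CRLF = "\r\n"


def build_request(method, host, path="/", headers=None, params=None):
    """Return the raw bytes of an HTTP/1.1 request (hypothetical helper)."""
    # Host is the one header HTTP/1.1 requires; the rest are optional.
    all_headers = {"Host": host, "Connection": "close"}
    all_headers.update(headers or {})
    body = b""
    if params:
        encoded = urlencode(params)
        if method == "GET":
            # GET parameters travel in the query string of the request line.
            # Assumes `path` does not already contain a query string.
            path = path + "?" + encoded
        else:
            # POST parameters travel in the body as a URL-encoded form.
            body = encoded.encode("ascii")
            all_headers["Content-Type"] = "application/x-www-form-urlencoded"
            all_headers["Content-Length"] = str(len(body))
    lines = ["{0} {1} HTTP/1.1".format(method, path)]
    lines += ["{0}: {1}".format(k, v) for k, v in all_headers.items()]
    # Every line ends with CRLF, and an empty line ends the header block.
    head = CRLF.join(lines) + CRLF + CRLF
    return head.encode("ascii") + body
```

After sock.connect(("google.com", 80)), something like sock.sendall(build_request("GET", "google.com", "/pageyouwant.html")) would send the request; then read with recv() until it returns an empty result.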

answered Sep 30 '22 12:09

user703016

Tags: python, http, sockets