Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Parsing raw HTTP Request

I working on HTTP Traffic Data set which is composed of complete POST and GET request Like given below. I have written code in java that has separated each of these request and saved it as string element in array list. Now i am confused how to parse these raw HTTP request in java is there any method better than manual parsing?

GET http://localhost:8080/tienda1/imagenes/3.gif/ HTTP/1.1
User-Agent: Mozilla/5.0 (compatible; Konqueror/3.5; Linux) KHTML/3.5.8 (like Gecko)
Pragma: no-cache
Cache-control: no-cache
Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
Accept-Encoding: x-gzip, x-deflate, gzip, deflate
Accept-Charset: utf-8, utf-8;q=0.5, *;q=0.5
Accept-Language: en
Host: localhost:8080
Cookie: JSESSIONID=FB018FFB06011CFABD60D8E8AD58CA21
Connection: close
like image 933
Ali Ahmad Avatar asked Nov 06 '12 16:11

Ali Ahmad


People also ask

What is a raw HTTP request?

The Raw HTTP action sends a HTTP request to a web server. How the response is treated depends on the method, but in general the status code and the response headers are returned in variables defined as part of the page load options.

What is HTTP parsing?

The HTTP Parser interprets a byte stream according to the HTTP specification. This Parser is used by the HTTP Client Connector and by the HTTP Server Connector.

What is request module in Python?

The requests module allows you to send HTTP requests using Python. The HTTP request returns a Response Object with all the response data (content, encoding, status, etc).


3 Answers

Here is a General Http Request Parser For all Method types (GET, POST, etc.) for your convinience:

    package util.dpi.capture;

import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.Hashtable;

/**
 * Class for HTTP request parsing as defined by RFC 2612:
 * 
 * Request = Request-Line ; Section 5.1 (( general-header ; Section 4.5 |
 * request-header ; Section 5.3 | entity-header ) CRLF) ; Section 7.1 CRLF [
 * message-body ] ; Section 4.3
 * 
 * @author izelaya
 *
 */
public class HttpRequestParser {

    private String _requestLine;
    private Hashtable<String, String> _requestHeaders;
    private StringBuffer _messagetBody;

    public HttpRequestParser() {
        _requestHeaders = new Hashtable<String, String>();
        _messagetBody = new StringBuffer();
    }

    /**
     * Parse and HTTP request.
     * 
     * @param request
     *            String holding http request.
     * @throws IOException
     *             If an I/O error occurs reading the input stream.
     * @throws HttpFormatException
     *             If HTTP Request is malformed
     */
    public void parseRequest(String request) throws IOException, HttpFormatException {
        BufferedReader reader = new BufferedReader(new StringReader(request));

        setRequestLine(reader.readLine()); // Request-Line ; Section 5.1

        String header = reader.readLine();
        while (header.length() > 0) {
            appendHeaderParameter(header);
            header = reader.readLine();
        }

        String bodyLine = reader.readLine();
        while (bodyLine != null) {
            appendMessageBody(bodyLine);
            bodyLine = reader.readLine();
        }

    }

    /**
     * 
     * 5.1 Request-Line The Request-Line begins with a method token, followed by
     * the Request-URI and the protocol version, and ending with CRLF. The
     * elements are separated by SP characters. No CR or LF is allowed except in
     * the final CRLF sequence.
     * 
     * @return String with Request-Line
     */
    public String getRequestLine() {
        return _requestLine;
    }

    private void setRequestLine(String requestLine) throws HttpFormatException {
        if (requestLine == null || requestLine.length() == 0) {
            throw new HttpFormatException("Invalid Request-Line: " + requestLine);
        }
        _requestLine = requestLine;
    }

    private void appendHeaderParameter(String header) throws HttpFormatException {
        int idx = header.indexOf(":");
        if (idx == -1) {
            throw new HttpFormatException("Invalid Header Parameter: " + header);
        }
        _requestHeaders.put(header.substring(0, idx), header.substring(idx + 1, header.length()));
    }

    /**
     * The message-body (if any) of an HTTP message is used to carry the
     * entity-body associated with the request or response. The message-body
     * differs from the entity-body only when a transfer-coding has been
     * applied, as indicated by the Transfer-Encoding header field (section
     * 14.41).
     * @return String with message-body
     */
    public String getMessageBody() {
        return _messagetBody.toString();
    }

    private void appendMessageBody(String bodyLine) {
        _messagetBody.append(bodyLine).append("\r\n");
    }

    /**
     * For list of available headers refer to sections: 4.5, 5.3, 7.1 of RFC 2616
     * @param headerName Name of header
     * @return String with the value of the header or null if not found.
     */
    public String getHeaderParam(String headerName){
        return _requestHeaders.get(headerName);
    }
}
like image 179
Igor Zelaya Avatar answered Oct 11 '22 13:10

Igor Zelaya


I [am] working on [an] HTTP Traffic Data set which is composed of complete POST and GET request[s]

So you want to parse a file or list that contains multiple HTTP requests. What data do you want to extract? Anyway here is a Java HTTP parsing class, which can read the method, version and URI used in the request-line, and that reads all headers into a Hashtable.

You can use that one or write one yourself if you feel like reinventing the wheel. Take a look at the RFC to see what a request looks like in order to parse it correctly:

Request       = Request-Line              ; Section 5.1
                    *(( general-header        ; Section 4.5
                     | request-header         ; Section 5.3
                     | entity-header ) CRLF)  ; Section 7.1
                    CRLF
                    [ message-body ]          ; Section 4.3
like image 36
CodeCaster Avatar answered Oct 11 '22 12:10

CodeCaster


If you just want to send the raw request as it is, it's very easy, just send the actual String using a TCP socket!

Something like this:

    Socket socket = new Socket(host, port);

    BufferedWriter out = new BufferedWriter(
            new OutputStreamWriter(socket.getOutputStream(), "UTF8"));

    for (String line : getContents(request)) {
        System.out.println(line);
        out.write(line + "\r\n");
    }

    out.write("\r\n");
    out.flush();

See this blog post by JoeJag for the full code.

UPDATE

I started a project, RawHTTP to provide HTTP parsers for request, responses, headers etc... it turned out so good that it makes it quite easy to write HTTP servers and clients on top of it. Check it out if you're looking for something lowl level.

like image 35
Renato Avatar answered Oct 11 '22 13:10

Renato