Parse HTTP requests and responses from text file in Go

Tags:

go

I have the following file, which holds an HTTP pipelined stream of HTTP requests and responses.

How can I parse this file into my stream variable?

type Connection struct {
    Request  *http.Request
    Response *http.Response
}
stream := make([]Connection, 0)

The raw file:

GET /ubuntu/dists/trusty/InRelease HTTP/1.1
Host: archive.ubuntu.com
Cache-Control: max-age=0
Accept: text/*
User-Agent: Debian APT-HTTP/1.3 (1.0.1ubuntu2)

HTTP/1.1 404 Not Found
Date: Thu, 26 Nov 2015 18:26:36 GMT
Server: Apache/2.2.22 (Ubuntu)
Vary: Accept-Encoding
Content-Length: 311
Content-Type: text/html; charset=iso-8859-1

<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>404 Not Found</title>
</head><body>
<h1>Not Found</h1>
<p>The requested URL /ubuntu/dists/trusty/InRelease was not found on this server.</p>
<hr>
<address>Apache/2.2.22 (Ubuntu) Server at archive.ubuntu.com Port 80</address>
</body></html>
GET /ubuntu/dists/trusty-updates/InRelease HTTP/1.1
Host: archive.ubuntu.com
Cache-Control: max-age=0
Accept: text/*
User-Agent: Debian APT-HTTP/1.3 (1.0.1ubuntu2)

HTTP/1.1 200 OK
Date: Thu, 26 Nov 2015 18:26:37 GMT
Server: Apache/2.2.22 (Ubuntu)
Last-Modified: Thu, 26 Nov 2015 18:03:00 GMT
ETag: "fbb7-5257562a5fd00"
Accept-Ranges: bytes
Content-Length: 64439
Cache-Control: max-age=382, proxy-revalidate
Expires: Thu, 26 Nov 2015 18:33:00 GMT

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

Origin: Ubuntu
Label: Ubuntu
Suite: trusty-updates
Version: 14.04
Codename: trusty
[... truncated by author]

I know there is http.ReadRequest. What about the Response? Any ideas/feedback/thoughts are appreciated.

asked Nov 27 '15 19:11 by mattes


1 Answer

It's actually pretty straightforward:

package main

import (
    "bufio"
    "bytes"
    "fmt"
    "io"
    "io/ioutil"
    "log"
    "net/http"
    "net/http/httputil"
    "os"
)

type Connection struct {
    Request  *http.Request
    Response *http.Response
}

func ReadHTTPFromFile(r io.Reader) ([]Connection, error) {
    buf := bufio.NewReader(r)
    stream := make([]Connection, 0)

    for {
        req, err := http.ReadRequest(buf)
        if err == io.EOF {
            break
        }
        if err != nil {
            return stream, err
        }

        resp, err := http.ReadResponse(buf, req)
        if err != nil {
            return stream, err
        }

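        // Save the response body: it must be closed, so copy it into a buffer first.
        b := new(bytes.Buffer)
        if _, err := io.Copy(b, resp.Body); err != nil {
            resp.Body.Close()
            return stream, err
        }
        resp.Body.Close()
        resp.Body = ioutil.NopCloser(b)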

        stream = append(stream, Connection{Request: req, Response: resp})
    }
    return stream, nil
}

func main() {
    f, err := os.Open("/tmp/test.http")
    if err != nil {
        log.Fatal(err)
    }
    defer f.Close()
    stream, err := ReadHTTPFromFile(f)
    if err != nil {
        log.Fatal(err)
    }
    for _, c := range stream {
        b, err := httputil.DumpRequest(c.Request, true)
        if err != nil {
            log.Fatal(err)
        }
        fmt.Println(string(b))
        b, err = httputil.DumpResponse(c.Response, true)
        if err != nil {
            log.Fatal(err)
        }
        fmt.Println(string(b))
    }
}
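
One caveat about the loop above: it fits the question's capture, where every request is a GET without a body. A request body is read lazily from the same bufio.Reader, so for something like a POST it would have to be drained before http.ReadResponse is called, otherwise the response parser starts in the middle of the request body. A minimal sketch of that idea, assuming Go 1.16+ (for io.NopCloser and io.ReadAll); the helper names bufferBody and parsePair and the sample data are hypothetical, not part of net/http or the original answer:

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
	"io"
	"log"
	"net/http"
	"strings"
)

// bufferBody (hypothetical helper) reads body to the end, closes it, and
// returns an in-memory copy that can still be read later.
func bufferBody(body io.ReadCloser) (io.ReadCloser, error) {
	var b bytes.Buffer
	_, err := io.Copy(&b, body)
	if cerr := body.Close(); err == nil {
		err = cerr
	}
	if err != nil {
		return nil, err
	}
	return io.NopCloser(&b), nil
}

// sample is made-up data: one POST with a 5-byte body, then its response.
const sample = "POST /submit HTTP/1.1\r\n" +
	"Host: example.com\r\n" +
	"Content-Length: 5\r\n" +
	"\r\n" +
	"hello" +
	"HTTP/1.1 200 OK\r\n" +
	"Content-Length: 2\r\n" +
	"\r\n" +
	"ok"

// parsePair parses one request/response pair and returns both bodies.
func parsePair(raw string) (string, string, error) {
	buf := bufio.NewReader(strings.NewReader(raw))

	req, err := http.ReadRequest(buf)
	if err != nil {
		return "", "", err
	}
	// Drain the request body first so buf is positioned at the status line.
	if req.Body, err = bufferBody(req.Body); err != nil {
		return "", "", err
	}

	resp, err := http.ReadResponse(buf, req)
	if err != nil {
		return "", "", err
	}
	if resp.Body, err = bufferBody(resp.Body); err != nil {
		return "", "", err
	}

	// Both bodies are now in memory and can be read at any time.
	rq, err := io.ReadAll(req.Body)
	if err != nil {
		return "", "", err
	}
	rs, err := io.ReadAll(resp.Body)
	if err != nil {
		return "", "", err
	}
	return string(rq), string(rs), nil
}

func main() {
	rq, rs, err := parsePair(sample)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("request body: %q, response body: %q\n", rq, rs)
}
```

In ReadHTTPFromFile the same helper could be applied to req.Body right after http.ReadRequest and to resp.Body right after http.ReadResponse.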

A few notes:

  • Besides http.ReadRequest there is also http.ReadResponse
  • Both can be called over and over again on the same bufio.Reader until EOF and it will "just work"
    • "Just working" depends on the Content-Length header being present and correct, so that reading the body leaves the Reader positioned at the start of the next request/response
    • Read the net/http source to understand exactly what will work and what won't
  • resp.Body must be closed per the docs, so we have to copy it to another buffer first to keep it
  • Using your example data (with Content-Length modified to match your truncation), this code will output the same requests and responses as given
  • httputil.DumpRequest and httputil.DumpResponse won't necessarily dump the HTTP headers in the same order as the input file, so don't expect a diff to be perfect
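
Since a textual diff can trip over header order, one alternative is to compare the parsed headers instead of the dumped text. The following is only a sketch of that idea: the helper names parseHeaders and sameHeaders and the two sample responses are made up for illustration, and reflect.DeepEqual is one possible way to compare http.Header maps.

```go
package main

import (
	"bufio"
	"fmt"
	"log"
	"net/http"
	"reflect"
	"strings"
)

// parseHeaders (hypothetical helper) parses a raw response and returns its
// header map. Passing nil as the request makes ReadResponse assume a GET.
func parseHeaders(raw string) (http.Header, error) {
	resp, err := http.ReadResponse(bufio.NewReader(strings.NewReader(raw)), nil)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	return resp.Header, nil
}

// sameHeaders reports whether two header maps hold the same keys and values,
// regardless of the order in which the header lines appeared in the input.
func sameHeaders(a, b http.Header) bool {
	return reflect.DeepEqual(a, b)
}

// Two made-up responses with identical headers in a different order.
const original = "HTTP/1.1 200 OK\r\n" +
	"Server: Apache\r\n" +
	"Content-Length: 0\r\n" +
	"Accept-Ranges: bytes\r\n" +
	"\r\n"

const reordered = "HTTP/1.1 200 OK\r\n" +
	"Accept-Ranges: bytes\r\n" +
	"Content-Length: 0\r\n" +
	"Server: Apache\r\n" +
	"\r\n"

func main() {
	a, err := parseHeaders(original)
	if err != nil {
		log.Fatal(err)
	}
	b, err := parseHeaders(reordered)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("headers match:", sameHeaders(a, b))
}
```

Note that repeated header lines with the same name keep their relative order inside the value slice, so this comparison still treats a change in that order as a difference.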

answered Sep 22 '22 02:09 by korylprince