Go newbie here!
I am trying to put together a Go program that will parse a log file and return specific information on lines matched.
To give an example of what I am trying to achieve I would start with a log file that looks like this:
2019-09-30T04:17:02 - REQUEST-A
2019-09-30T04:18:02 - REQUEST-C
2019-09-30T04:19:02 - REQUEST-B
2019-09-30T04:20:02 - REQUEST-A
2019-09-30T04:21:02 - REQUEST-A
2019-09-30T04:22:02 - REQUEST-B
From here I would want to extract every "REQUEST-A" entry and print the time each request occurred, either to the terminal or to a file.
I have tried using os.Open and a bufio.Scanner, and I can use scanner.Text to detect that an occurrence of my string has been found, like so:
package main

import (
	"bufio"
	"fmt"
	"os"
	"strings"
)

func main() {
	request := 0
	f, err := os.Open("request.log")
	if err != nil {
		fmt.Print("There has been an error!: ", err)
	}
	defer f.Close()
	scanner := bufio.NewScanner(f)
	for scanner.Scan() {
		if strings.Contains(scanner.Text(), "REQUEST-A") {
			request = request + 1
		}
		if err := scanner.Err(); err != nil {
		}
		fmt.Println(request)
	}
}
But I am unsure how to take this further and retrieve the information I am after. Normally I would use Bash for this, but I thought I would branch out and see if I could use Go. Any advice would be appreciated.
In Go, we try to be efficient. Don't do things unnecessarily.
For example,
package main

import (
	"bufio"
	"bytes"
	"fmt"
	"os"
)

func main() {
	lines, requestA := 0, 0
	f, err := os.Open("request.log")
	if err != nil {
		fmt.Print("There has been an error!: ", err)
		return // f is nil here, so there is nothing to scan
	}
	defer f.Close()
	scanner := bufio.NewScanner(f)
	for scanner.Scan() {
		lines++
		// filter request a: first a cheap check of the byte at index 30,
		// where the trailing 'A' of "REQUEST-A" sits
		line := scanner.Bytes()
		if len(line) <= 30 || line[30] != 'A' {
			continue
		}
		// then compare the whole request field, which starts at index 22
		if !bytes.Equal(line[22:], []byte("REQUEST-A")) {
			continue
		}
		requestA++
		request := string(line)
		// handle request a
		fmt.Println(request)
	}
	if err := scanner.Err(); err != nil {
		fmt.Println(err)
	}
	fmt.Println(lines, requestA)
}
Output:
$ go run request.go
2019-09-30T04:17:02 - REQUEST-A
2019-09-30T04:20:02 - REQUEST-A
2019-09-30T04:21:02 - REQUEST-A
6 3
$ cat request.log
2019-09-30T04:17:02 - REQUEST-A
2019-09-30T04:18:02 - REQUEST-C
2019-09-30T04:19:02 - REQUEST-B
2019-09-30T04:20:02 - REQUEST-A
2019-09-30T04:21:02 - REQUEST-A
2019-09-30T04:22:02 - REQUEST-B
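The literal offsets 22 and 30 in the code above follow from the sample log format: a 19-byte timestamp, a 3-byte " - " separator, then the 9-byte request name. As a minimal sketch, assuming every line has exactly that layout, the same filter can be written with named constants. The names timestampLen, separatorLen, requestStart, wantRequest, and isRequestA are made up for this illustration and do not come from the answer.

```go
package main

import (
	"bytes"
	"fmt"
)

// Offsets derived from the sample line "2019-09-30T04:17:02 - REQUEST-A".
// These names are hypothetical; the answer uses the literals 22 and 30.
const (
	timestampLen = 19                          // len("2019-09-30T04:17:02")
	separatorLen = 3                           // len(" - ")
	requestStart = timestampLen + separatorLen // 22
)

// wantRequest is the request name to match.
var wantRequest = []byte("REQUEST-A")

// isRequestA reports whether the bytes from requestStart to the end of
// line are exactly "REQUEST-A". It does not validate the timestamp or
// the separator, which mirrors the original filter.
func isRequestA(line []byte) bool {
	if len(line) != requestStart+len(wantRequest) {
		return false
	}
	return bytes.Equal(line[requestStart:], wantRequest)
}

func main() {
	for _, l := range []string{
		"2019-09-30T04:17:02 - REQUEST-A",
		"2019-09-30T04:18:02 - REQUEST-C",
	} {
		line := []byte(l)
		if isRequestA(line) {
			// Print only the time field, which is what the question asks for.
			fmt.Printf("%s\n", line[:timestampLen]) // prints 2019-09-30T04:17:02
		}
	}
}
```

Like the original, this sketch only inspects the bytes from offset 22 onward, so it is only as reliable as the assumption that the timestamp width never changes.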
To emphasize the importance of efficiency (logs can be very large), let's run a benchmark against Markus W Mahlberg's solution: https://play.golang.org/p/R2D_BeiJvx9.
$ go test log_test.go -bench=. -benchmem
BenchmarkPeterSO-4    21285      56953 ns/op     4128 B/op       2 allocs/op
BenchmarkMarkusM-4      649    1817868 ns/op    84747 B/op    2390 allocs/op
log_test.go:
package main

import (
	"bufio"
	"bytes"
	"regexp"
	"strings"
	"testing"
)

var requestLog = `
2019-09-30T04:17:02 - REQUEST-A
2019-09-30T04:18:02 - REQUEST-C
2019-09-30T04:19:02 - REQUEST-B
2019-09-30T04:20:02 - REQUEST-A
2019-09-30T04:21:02 - REQUEST-A
2019-09-30T04:22:02 - REQUEST-B
`

var benchLog = strings.Repeat(requestLog[1:], 256)

func BenchmarkPeterSO(b *testing.B) {
	for N := 0; N < b.N; N++ {
		scanner := bufio.NewScanner(strings.NewReader(benchLog))
		for scanner.Scan() {
			// filter request a
			line := scanner.Bytes()
			if len(line) <= 30 || line[30] != 'A' {
				continue
			}
			if !bytes.Equal(line[22:], []byte("REQUEST-A")) {
				continue
			}
			request := string(line)
			// handle request a
			_ = request
		}
		if err := scanner.Err(); err != nil {
			b.Fatal(err)
		}
	}
}

func BenchmarkMarkusM(b *testing.B) {
	for N := 0; N < b.N; N++ {
		var re *regexp.Regexp = regexp.MustCompile(`^(\S*) - REQUEST-A$`)
		scanner := bufio.NewScanner(strings.NewReader(benchLog))
		var res []string
		for scanner.Scan() {
			if res = re.FindStringSubmatch(scanner.Text()); len(res) > 0 {
				_ = res[1]
			}
		}
		if err := scanner.Err(); err != nil {
			b.Fatal(err)
		}
	}
}
Use the following code to print the time field for log entries with the value field "REQUEST-A".
for scanner.Scan() {
	line := scanner.Text()
	if len(line) < 19 {
		continue
	}
	if line[19:] == " - REQUEST-A" {
		fmt.Println(line[:19])
	}
}
Run it on the Go Playground!
To write to a file, redirect stdout to a file.
The code above assumes that everything after the timestamp is exactly " - REQUEST-A". Use the following if " - REQUEST-A" is a prefix to other data:
const lenTimestamp = 19

for scanner.Scan() {
	line := scanner.Text()
	if len(line) < lenTimestamp {
		continue
	}
	if strings.HasPrefix(line[lenTimestamp:], " - REQUEST-A") {
		fmt.Println(line[:lenTimestamp])
	}
}
Run this version on the playground.
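If redirecting stdout is not convenient, the prefix-matching loop above can also write to a file directly. The following is a rough sketch rather than a drop-in solution: the function names matchTime, writeTimes, and run, and the output file name request-a.txt, are assumptions made for this example, while request.log is the file name used in the question.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"os"
	"strings"
)

const lenTimestamp = 19 // len("2019-09-30T04:17:02")

// matchTime returns the timestamp of line when the text following the
// timestamp starts with " - " + request. Because this is a prefix match,
// "REQUEST-A" also matches a line containing "REQUEST-AB".
func matchTime(line, request string) (string, bool) {
	if len(line) < lenTimestamp {
		return "", false
	}
	if !strings.HasPrefix(line[lenTimestamp:], " - "+request) {
		return "", false
	}
	return line[:lenTimestamp], true
}

// writeTimes scans r line by line and writes the timestamp of each
// matching line to w, one per line. It returns the number of matches.
func writeTimes(w io.Writer, r io.Reader, request string) (int, error) {
	n := 0
	scanner := bufio.NewScanner(r)
	for scanner.Scan() {
		ts, ok := matchTime(scanner.Text(), request)
		if !ok {
			continue
		}
		if _, err := fmt.Fprintln(w, ts); err != nil {
			return n, err
		}
		n++
	}
	return n, scanner.Err()
}

// run reads inPath and writes the matching timestamps to outPath.
func run(inPath, outPath, request string) error {
	in, err := os.Open(inPath)
	if err != nil {
		return err
	}
	defer in.Close()

	out, err := os.Create(outPath)
	if err != nil {
		return err
	}
	bw := bufio.NewWriter(out) // buffer writes, since logs can be large
	_, err = writeTimes(bw, in, request)
	if err == nil {
		err = bw.Flush()
	}
	// Close the file even on failure, but keep the first error.
	if cerr := out.Close(); err == nil {
		err = cerr
	}
	return err
}

func main() {
	// "request-a.txt" is a hypothetical output file name.
	if err := run("request.log", "request-a.txt", "REQUEST-A"); err != nil {
		fmt.Fprintln(os.Stderr, "error:", err)
		os.Exit(1)
	}
}
```

Because writeTimes takes an io.Writer, passing os.Stdout instead of the buffered file writer prints the timestamps to the terminal, which covers both outputs mentioned in the question.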