Bash - Calculate average and frequency of a column

I'm running a shell script that performs a load test on my service. At the end of the test, I get a file that looks like this:

200    2.691
200    2.735
404    1.997
404    2.838
200    1.394
200    1.833

I'd like to calculate the min, max and mean response time for every unique HTTP response code, along with a count of how often each code occurs. Something like this:

http    min    max    mean    count
200    1.394  2.735   2.163    4
404    1.997  2.838   2.418    2

The output originates from this command (if that helps):

curl -s -o /dev/null -w "%{http_code}\t%{time_total}\n" "$SERVICE_URL"

Can someone share pointers on how I can go about achieving this in bash? I looked at http://cacodaemon.de/index.php?id=11 for ideas but couldn't make anything work.

Thanks.

AngryPanda asked Apr 19 '18 00:04


1 Answer

$ cat tst.awk
{
    min[$1] = ( ($1 in min) && (min[$1] < $2) ? min[$1] : $2 )
    max[$1] = ( ($1 in max) && (max[$1] > $2) ? max[$1] : $2 )
    sum[$1] += $2
    cnt[$1]++
}
END {
    print "http", "min", "max", "mean", "cnt"
    for (key in cnt) {
        print key, min[key], max[key], sprintf("%.3f",sum[key]/cnt[key]), cnt[key]
    }
}

$ awk -f tst.awk file | column -t
http  min    max    mean   cnt
200   1.394  2.735  2.163  4
404   1.997  2.838  2.417  2

The above will work with any awk in any shell on any UNIX box.
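
One detail worth knowing: awk does not guarantee any particular order for "for (key in cnt)", so the rows may come out in a different order from run to run or between awk implementations. As a minimal sketch (not part of the original answer), the same per-key aggregation can be wrapped in a shell function that sorts the rows by status code. The function name summarize_codes is hypothetical, and formatting min and max with three decimals is an assumption made here for uniform output:

```shell
#!/bin/sh
# Hypothetical helper: summarize_codes [FILE...]
# Reads "code<whitespace>time" lines from FILE(s) or stdin and prints
# min, max, mean and count of the time column for each code.
summarize_codes() {
    # Header goes out first so that only the data rows pass through sort.
    printf 'http\tmin\tmax\tmean\tcount\n'
    awk '
    NF >= 2 {                      # skip blank or incomplete lines
        code = $1
        t = $2 + 0                 # force a numeric comparison
        if (!(code in cnt) || t < min[code]) min[code] = t
        if (!(code in cnt) || t > max[code]) max[code] = t
        sum[code] += t
        cnt[code]++
    }
    END {
        for (code in cnt)
            printf "%s\t%.3f\t%.3f\t%.3f\t%d\n",
                code, min[code], max[code], sum[code] / cnt[code], cnt[code]
    }' "$@" | sort -n              # numeric sort on the leading status code
}

# Demo with the sample data from the question.
printf '200\t2.691\n200\t2.735\n404\t1.997\n404\t2.838\n200\t1.394\n200\t1.833\n' |
    summarize_codes
```

Because the function passes its arguments straight to awk, it can be called either as "summarize_codes file" or at the end of a pipeline. The output is tab-separated, so it can still be piped through column -t for aligned columns where that tool is available.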

Ed Morton answered Sep 28 '22 17:09