Normalizing histogram bins in gnuplot

I'm trying to plot a histogram whose bins are normalized by the number of elements in the bin.

I'm using the following:

binwidth=5
bin(x,width)=width*floor(x/width) + binwidth/2.0
plot 'file' using (bin($2, binwidth)):($4) smooth freq with boxes

to get a basic histogram, but I want the value of each bin to be divided by the size of the bin. How can I go about this in gnuplot, or using external tools if necessary?

shivknight asked Apr 26 '11 07:04


1 Answer

As of gnuplot 4.4, a function definition can evaluate several expressions in sequence, separated by commas, and return the value of the last one (see gnuplot tricks). This means you can count the number of points, n, inside the gnuplot script itself instead of having to know it in advance. The code below works on a file, "out.dat", containing a single column: a list of n samples from a normal distribution:

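binwidth = 0.1
set boxwidth binwidth
sum = 0

# s(x) increments sum as a side effect and always returns 0
s(x)          = ((sum=sum+1), 0)
# centre of the bin that x falls into
bin(x, width) = width*floor(x/width) + width/2.0

# first pass: count the data points
plot "out.dat" u ($1):(s($1))
# second pass: each point contributes 1/(binwidth*sum)
plot "out.dat" u (bin($1, binwidth)):(1.0/(binwidth*sum)) smooth freq w boxes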

The first plot statement reads through the datafile and increments sum once for each point, plotting a zero.

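The second plot statement uses the value of sum to normalise the histogram. Each point contributes 1/(binwidth*sum), and smooth freq adds up the contributions that land in the same bin, so the height of a bin is its count divided by (binwidth*sum) and the areas of all the boxes add up to 1.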
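If you are on gnuplot 4.6 or later, the counting pass above can probably be replaced by the stats command, which scans the file without drawing anything. The following is only a sketch under that version assumption; the variable name n is my own choice, and "out.dat" is the same single-column file as above:

```gnuplot
# Sketch, assuming gnuplot 4.6 or later (the stats command is not in 4.4)
binwidth = 0.1
set boxwidth binwidth
bin(x, width) = width*floor(x/width) + width/2.0

# Scan column 1 without plotting; this sets STATS_records among other variables
stats "out.dat" using 1 nooutput
n = STATS_records   # number of valid data points found in the file

# Each point adds 1/(binwidth*n), so the areas of the boxes total 1
plot "out.dat" using (bin($1, binwidth)):(1.0/(binwidth*n)) smooth freq with boxes
```

This avoids the throwaway first plot and the side-effect function, at the cost of requiring a newer gnuplot.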

Nick answered Sep 21 '22 00:09