
plot average of n'th rows in gnuplot

I have some data that I want to plot with gnuplot, but there are several y values for the same x value. Here is a sample:

0 0.650765 0.122225 0.013325 
0 0.522575 0.001447 0.010718 
0 0.576791 0.004277 0.104052 
0 0.512327 0.002268 0.005430 
0 0.530401 0.000000 0.036541 
0 0.518333 0.001128 0.017270
20 0.512864 0.001111 0.005433 
20 0.510357 0.005312 0.000000 
20 0.526809 0.001089 0.033523 
20 0.527076 0.000000 0.034215 
20 0.507166 0.001131 0.000000 
20 0.513868 0.001306 0.004344 
40 0.531742 0.003295 0.0365

In this example there are 6 values for each x value. How can I plot the average together with a confidence bar (interval)?

Thanks for the help.

wolfgunner asked Jun 20 '26 17:06


1 Answer

To do this, you will need some kind of external processing. One possibility is to use gawk to calculate the required quantities and then feed this auxiliary output to Gnuplot for plotting. For example:

set terminal png enhanced
set output 'test.png'

fName = 'data.dat'
plotCmd(col_num)=sprintf('< gawk -f analyze.awk -v col_num=%d %s', col_num, fName)

set format y '%0.2f'
set xr [-5:45]  # wide enough to include the last point at x=40

plot \
    plotCmd(2) u 1:2:3:4 w yerrorbars pt 3 lc rgb 'dark-red' t 'column 2'

This assumes that the script analyze.awk resides in the directory from which Gnuplot is launched (otherwise, the path given to the -f option of gawk has to be adjusted). The script analyze.awk itself reads:

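# Print x, the mean, and the min/max of the cnt values stored in data[1..cnt].
function analyze(x, data, cnt,    i, val, mean, val_min, val_max){
    if(cnt == 0) return;
    mean = 0;
    for(i = 1; i <= cnt; i++){
        val = data[i] + 0;            # force a numeric (not string) comparison
        mean += (val - mean)/i;       # online update of the mean
        val_min = (i == 1 || val < val_min)?val:val_min;
        val_max = (i == 1 || val > val_max)?val:val_max;
    }
    print x, mean, val_min, val_max;
}

{
    curr = $1;

    # a change in x closes the previous group
    if(NR > 1 && prev != curr){
        analyze(prev, data, cnt);
        delete data;
        cnt = 0;
    }
    prev = curr;
    data[++cnt] = $(col_num);   # index by position so repeated y values are kept
}

END{
    analyze(curr, data, cnt);
}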

It directly implements the online algorithm to calculate the mean and, for each run of consecutive rows sharing the same x, prints this mean together with the min/max values.

In the Gnuplot script, the column of interest is passed to the plotCmd function, which builds the command to be executed; its output is then plotted with u 1:2:3:4 w yerrorbars. With four columns, yerrorbars reads x:y:ylow:yhigh, so the value itself (the mean) sits in the second column, while the lower and upper ends of the error bar (here the min/max values, used as a simple stand-in for a confidence interval) sit in the 3rd/4th columns.
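Note that the min/max range is not a confidence interval in the statistical sense. If you want something closer to one, the same online update can be extended to track the variance as well (Welford's method). Below is a rough sketch of that idea in Python rather than awk, purely as an illustration; the function name summarize, the 0-based col argument and the normal-approximation factor 1.96 are my own assumptions and not part of the scripts above. With only 6 values per group, a Student t quantile would give a wider and more appropriate interval.

```python
import math
from itertools import groupby

Z_95 = 1.96  # assumed normal-approximation quantile for a ~95% interval


def summarize(rows, col):
    """Return [(x, mean, low, high)] for each run of consecutive rows sharing x.

    rows: sequence of numeric tuples with x at index 0
    col:  0-based index of the y column to average (hypothetical parameter)
    """
    out = []
    for x, group in groupby(rows, key=lambda r: r[0]):
        n, mean, m2 = 0, 0.0, 0.0
        for row in group:
            n += 1
            delta = row[col] - mean
            mean += delta / n                # online mean update
            m2 += delta * (row[col] - mean)  # running sum of squared deviations
        # the sample standard deviation needs at least two values
        half = Z_95 * math.sqrt(m2 / (n - 1)) / math.sqrt(n) if n > 1 else 0.0
        out.append((x, mean, mean - half, mean + half))
    return out


# second column of the x=0 and x=40 records from the question
rows = [
    (0, 0.650765), (0, 0.522575), (0, 0.576791),
    (0, 0.512327), (0, 0.530401), (0, 0.518333),
    (40, 0.531742),
]
for x, mean, low, high in summarize(rows, 1):
    print(f"{x} {mean:.6f} {low:.6f} {high:.6f}")
```

The printed lines have the same x, mean, low, high layout as the gawk output, so they could be plotted with the same u 1:2:3:4 w yerrorbars clause. A group with a single record gets a zero-width interval, since no spread can be estimated from one value.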

Taken together, the two scripts above produce the picture below. The error bar on the last point is not visible because the example data in your question contain only one record for x=40, so the min/max values coincide with the mean.

[image: resulting plot]

ewcz answered Jun 22 '26 15:06