I have some data that I want to plot with gnuplot. However, there are many y values for the same x value. Here is a sample so you can see what I mean:
0 0.650765 0.122225 0.013325
0 0.522575 0.001447 0.010718
0 0.576791 0.004277 0.104052
0 0.512327 0.002268 0.005430
0 0.530401 0.000000 0.036541
0 0.518333 0.001128 0.017270
20 0.512864 0.001111 0.005433
20 0.510357 0.005312 0.000000
20 0.526809 0.001089 0.033523
20 0.527076 0.000000 0.034215
20 0.507166 0.001131 0.000000
20 0.513868 0.001306 0.004344
40 0.531742 0.003295 0.0365
In this example, I have 6 values for each x value. How can I draw the average and the confidence bar (interval)?
Thanks for the help.
To do this, you will need some kind of external processing. One possibility is to use gawk to calculate the required quantities and then feed this auxiliary output to Gnuplot for plotting. For example:
set terminal png enhanced
set output 'test.png'
fName = 'data.dat'
plotCmd(col_num)=sprintf('< gawk -f analyze.awk -v col_num=%d %s', col_num, fName)
set format y '%0.2f'
set xr [-5:45]
plot \
plotCmd(2) u 1:2:3:4 w yerrorbars pt 3 lc rgb 'dark-red' t 'column 2'
This assumes that the script analyze.awk resides in the directory from which Gnuplot is launched (otherwise, the path in the -f option of gawk has to be adjusted). The script analyze.awk itself reads:
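# analyze.awk: for each distinct x (column 1), print x, mean, min and max
# of the column selected with -v col_num=N. Input must be grouped by x.
function analyze(x, data, cnt,    i, val, delta, mean, val_min, val_max){
    mean = 0;
    for(i = 1; i <= cnt; i++){
        val = data[i];
        # online update of the running mean
        delta = val - mean;
        mean += delta/i;
        val_min = (i == 1 || val < val_min)?val:val_min;
        val_max = (i == 1 || val > val_max)?val:val_max;
    }
    if(cnt > 0){
        print x, mean, val_min, val_max;
    }
}
{
    curr = $1;
    if(NR > 1 && prev != curr){
        # x changed: report the finished group and start a new one
        analyze(prev, data, cnt);
        delete data;
        cnt = 0;
    }
    prev = curr;
    # store by position so that repeated y values are not collapsed;
    # adding 0 forces a numeric value, so comparisons are numeric
    data[++cnt] = $(col_num) + 0;
}
END{
    analyze(curr, data, cnt);
}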
It implements an online (single-pass, incremental) algorithm for the mean and, for each distinct value of x, prints this mean together with the min/max values of the selected column.
In the Gnuplot script, the column of interest is passed to the plotCmd function, which builds the command to be executed; its output is then plotted with u 1:2:3:4 w yerrorbars. With four columns, yerrorbars reads them as x, y, ylow, yhigh: the mean is in the second column, and the lower and upper ends of the bar (here the min and max values) are in the 3rd and 4th columns.
In total, the two scripts above produce the picture below. The confidence interval on the last point is not visible since the example data in your question contain only one record for x=40, thus the min/max values coincide with the mean.
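The error bars above span the minimum and maximum of each group, which is a range rather than a statistical confidence interval. If an actual interval around the mean is wanted, one option is to print mean ± z·s/√n instead, where s is the sample standard deviation and n the number of values for that x. The following is only a sketch under assumptions that are not part of the answer above: the shell function name ci_summary is made up for illustration, the default factor z = 1.96 is the normal approximation for a 95% interval, and the input is assumed to be whitespace-separated and grouped by x. With only 6 values per group, the Student t value for 5 degrees of freedom (about 2.571) is probably the better factor; it can be passed as the third argument.

```shell
# Hypothetical helper (name and defaults are assumptions, not from the answer):
# prints "x mean low high" per distinct x, with low/high = mean -/+ z*sd/sqrt(n).
# Usage: ci_summary FILE COLUMN [Z]
ci_summary() {
    awk -v col_num="$2" -v z="${3:-1.96}" '
    function flush_group() {
        if (n == 0) return
        # the sample standard deviation needs at least two points
        sd = (n > 1) ? sqrt(m2 / (n - 1)) : 0
        half = z * sd / sqrt(n)
        printf "%s %.6f %.6f %.6f\n", prev, mean, mean - half, mean + half
    }
    NF == 0 { next }    # skip blank lines
    # x changed: report the finished group and reset the accumulators
    n > 0 && $1 != prev { flush_group(); n = 0; mean = 0; m2 = 0 }
    {
        prev = $1
        y = $(col_num) + 0
        # online update of the mean and of the sum of squared deviations
        n++
        delta = y - mean
        mean += delta / n
        m2 += delta * (y - mean)
    }
    END { flush_group() }
    ' "$1"
}
```

The output keeps the same four-column layout (x, mean, lower bound, upper bound), so it should be usable with the same u 1:2:3:4 w yerrorbars specification. A group with a single record gets a zero-width interval, just like the x=40 point in the min/max version.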