gnuplot "stats" command unexpected min & "out of range" results

I’m trying to develop a histogram script. The plot itself seems correct, but I have some problems or questions:

  1. I don’t understand why the “stats” output says my data file has “out of range” points. What does that mean?
  2. The “stats” minimum value doesn’t look correct, either. From the data file, minimum = -0.0312, but stats reports 0.0.

The script:

# Gnuplot histogram from "Gnuplot In Action", 13.2.1 Jitter plots and histograms (p. 256)

# this function puts a data point (x) into a bin of the specified width
bin(x,width)    = width*floor(x/width)

binwidth = 0.01
set boxwidth binwidth

# data file
data_file = "sorted.csv"
png_file    = "sorted.png"
datapoint_count = 14

# take the key (legend) titles from the data file's header row
set style data linesp
set key autotitle columnheader

set datafile separator ","  # CSV format

# histogram
myTitle = "Histogram from \n" . data_file
set title myTitle
set style fill solid 1.0
set xlabel "Slack"
set mxtics
set ylabel "Count"
set yrange [0:*] # min count is always 0

set terminal png    # plot file format
set output png_file # plot to file

print "xrange="
show xrange
print "yrange="
show yrange

stats data_file using ($1)
print "STATS_records=", STATS_records
print "STATS_invalid=", STATS_invalid
print "STATS_blank=", STATS_blank
print "STATS_min=", STATS_min
print "STATS_max=", STATS_max

plot  data_file using (bin($1,binwidth)):(1) smooth frequency with boxes

The data file:

slack
-0.0312219
-0.000245109
-4.16338e-05
-2.08616e-05
-1.82986e-05
8.31485e-06
1.00136e-05
1.23084e-05
0
0.000102907
0.000123322
0.000138402
0.19044
0.190441

The output:

gnuplot sorted.gp
Could not find/open font when opening font "arial", using internal non-scalable font
xrange=

        set xrange [ * : * ] noreverse nowriteback  # (currently [-10.0000:10.0000] )

yrange=

        set yrange [ 0.00000 : * ] noreverse nowriteback  # (currently [:10.0000] )


* FILE: 
  Records:      9
  Out of range: 5
  Invalid:      0
  Blank:        0
  Data Blocks:  1

* COLUMN: 
  Mean:          0.0424
  Std Dev:       0.0792
  Sum:           0.3813
  Sum Sq.:       0.0725

  Minimum:       0.0000 [3]
  Maximum:       0.1904 [8]
  Quartile:      0.0000 
  Median:        0.0001 
  Quartile:      0.0001 

STATS_records=9.0
STATS_invalid=0.0
STATS_blank=0.0
STATS_min=0.0
STATS_max=0.190441
Winston Smith asked Oct 20 '22 05:10


1 Answer

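If you give the stats command a single column, the current yrange is used to filter the values in that column.

At first sight this doesn't make sense, but stats behaves like a plot command that is given only a single column: that column is taken as the y-value and the row number is chosen as the x-value. With set yrange [0:*] in effect, the five negative values in your file fall outside the range. That is why stats reports 5 out-of-range points and only 9 records, and why the minimum comes out as 0 instead of -0.0312.

So, just move the set yrange line after the stats command: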

data_file = 'sorted.csv'
stats data_file using 1
show variables all
set yrange [0:*]
plot data_file ...
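
If you want to confirm that no range is filtering the column, a minimal sketch along these lines may help. It assumes the same sorted.csv and CSV settings as in the question, resets yrange to autoscale before calling stats, and prints STATS_outofrange next to the other variables. The nooutput option only suppresses the summary table and is not required.

```gnuplot
# Sketch only: assumes the same sorted.csv and CSV settings as in the question.
set datafile separator ","
set key autotitle columnheader   # kept from the question's script, where the header row was not counted as invalid

data_file = "sorted.csv"

set yrange [*:*]                 # autoscale y, so no y limit filters the stats input
stats data_file using 1 nooutput # nooutput suppresses the summary table

# STATS_outofrange counts the points rejected by the active range
print "STATS_records    = ", STATS_records
print "STATS_outofrange = ", STATS_outofrange
print "STATS_min        = ", STATS_min

set yrange [0:*]                 # restrict y only now, for the plot that follows
```

If STATS_outofrange is still nonzero after this, some range set earlier in the script is probably still being applied to the column.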
Christoph answered Oct 24 '22 00:10
