I'm trying to plot a CDF, but there is a vertical line at the end of the graph.
I've read that this is because matplotlib uses the end of the bins to draw the vertical lines, which makes sense, so I added the following to my code:
bins = sorted(X) + [np.inf]
where X is the data set I'm using, and passed these as the bins when plotting:
plt.hist(X, bins = bins, cumulative = True, histtype = 'step', color = 'b')
This does remove the line at the end and produces the desired effect. However, when I now normalise the graph, it raises an error:
ymin = max(ymin*0.9, minimum) if not input_empty else minimum
UnboundLocalError: local variable 'ymin' referenced before assignment
Is there any way to either normalise the data with
bins = sorted(X) + [np.inf]
in my code or is there another way to remove the line on the graph?
The methods axhline and axvline each draw a line across the axes. The position of the line (y for axhline, x for axvline) is in data coordinates, while its extent (xmin and xmax for axhline, ymin and ymax for axvline) is in axes coordinates. In the axes coordinate system, the bottom-left point is (0,0) and the top-right point is (1,1), regardless of the data range of your plot, so xmin, xmax, ymin and ymax all lie in the range [0,1].
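To make the coordinate behaviour concrete, here is a minimal sketch. The axis limits and line positions are arbitrary values picked for illustration, and the Agg backend is only selected so the snippet runs without a display. It should show that the extent arguments stay as fractions of the axes even though the data range is 0 to 100.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; an assumption so this runs headless
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.set_xlim(0, 100)   # data range deliberately far from [0, 1]
ax.set_ylim(0, 10)

# Horizontal line at data y=5, spanning the middle half of the axes width
h = ax.axhline(y=5, xmin=0.25, xmax=0.75, color="r")

# Vertical line at data x=50, spanning the lower half of the axes height
v = ax.axvline(x=50, ymin=0.0, ymax=0.5, color="g")

# The stored extents are axes fractions, the positions are data values
print(h.get_xdata(), h.get_ydata())
print(v.get_xdata(), v.get_ydata())
```

Changing the axis limits afterwards moves the line's position in the figure but leaves the fraction of the axes it spans unchanged.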
An alternative way to plot a CDF would be as follows (in my example, X
is a bunch of samples drawn from the unit normal):
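import numpy as np
import matplotlib.pyplot as plt
# Sample data: 10000 draws from the unit normal
X = np.random.randn(10000)
# Cumulative fraction for each sorted sample: 1/N, 2/N, ..., 1
# (np.float was removed in NumPy 1.24; the built-in float does the same job)
n = np.arange(1, len(X) + 1) / float(len(X))
Xs = np.sort(X)
fig, ax = plt.subplots()
# Step plot of the empirical CDF; it is already normalised and has no closing vertical line
ax.step(Xs, n)
plt.show()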
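If you would rather keep plt.hist, one workaround that is often suggested (it is not part of the answer above) is to use ordinary finite bins, so that normalisation has finite bin widths to work with, and then drop the last vertex of the step polygon, which is the point that pulls the curve back down to zero. The sketch below assumes a reasonably recent matplotlib where the keyword is density rather than normed. The bin count of 100 and the seeded random data are arbitrary choices for illustration.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; an assumption so this runs headless
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
X = rng.standard_normal(10000)  # stand-in for your data set

fig, ax = plt.subplots()
counts, edges, patches = ax.hist(
    X, bins=100, density=True, cumulative=True, histtype="step", color="b"
)

# With histtype="step" the result is drawn as a single polygon.
# Its final vertex sits on the baseline at the last bin edge,
# which is what produces the vertical line at the right-hand end.
poly = patches[0]
xy = poly.get_xy()
poly.set_xy(xy[:-1])  # drop that final vertex
trimmed = poly.get_xy()

fig.savefig("cdf.png")  # hypothetical output file name
```

This keeps the normalised histogram as drawn and only changes where the outline stops, so the curve ends at the top of the last bin instead of falling back to the axis.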