
How to plot a CDF with Matplotlib in Python?

I have an unsorted list named d that looks like:

[0.0000, 123.9877, 0.0000, 9870.9876, ...]

I simply want to plot a CDF of this list using Matplotlib in Python, but I don't know whether there is a function I can use for this.

    d = []
    d_sorted = []
    for line in fd.readlines():
        (addr, videoid, userag, usertp, timeinterval) = line.split()
        d.append(float(timeinterval))

    d_sorted = sorted(d)

    class discrete_cdf:
        def __init__(data):
            self._data = data # must be sorted
            self._data_len = float(len(data))

        def __call__(point):
            return (len(self._data[:bisect_left(self._data, point)]) /
                    self._data_len)

    cdf = discrete_cdf(d_sorted)
    xvalues = range(0, max(d_sorted))
    yvalues = [cdf(point) for point in xvalues]
    plt.plot(xvalues, yvalues)

I am currently using this code, but it fails with the following error:

    Traceback (most recent call last):
      File "hitratioparea_0117.py", line 43, in <module>
        cdf = discrete_cdf(d_sorted)
    TypeError: __init__() takes exactly 1 argument (2 given)
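The TypeError in this traceback is what Python raises when a method is defined without `self` as its first parameter: the instance is passed implicitly, so `__init__(data)` receives two arguments. Below is a minimal corrected sketch of the class, not a definitive implementation. The class name `DiscreteCDF` and the sample values are assumptions made for illustration. It keeps the question's `bisect_left` logic, so each call returns the fraction of values strictly below the point, and it builds the x values with `int(...)` because `range` does not accept floats.

```python
from bisect import bisect_left


class DiscreteCDF:
    """Callable returning the fraction of stored values strictly below a point."""

    def __init__(self, data):
        # Sort a copy so the caller does not have to pre-sort the input.
        self._data = sorted(data)
        self._n = float(len(self._data))

    def __call__(self, point):
        # bisect_left gives the count of stored values that are < point.
        return bisect_left(self._data, point) / self._n


# Hypothetical sample values in the style of the question's list d.
d = [0.0, 123.9877, 0.0, 9870.9876]
cdf = DiscreteCDF(d)

# range() needs integers, so round the maximum up past the largest value.
xvalues = range(0, int(max(d)) + 2)
yvalues = [cdf(point) for point in xvalues]
```

With Matplotlib imported as `plt`, `plt.plot(xvalues, yvalues)` followed by `plt.show()` would then draw the curve. Using `bisect_right` instead would count values less than or equal to the point, which matches the usual CDF definition.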
manxing asked Feb 21 '12 13:02



1 Answer

As mentioned, cumsum from numpy works well. Make sure that your data is a proper PDF (i.e., it integrates to one, so the values multiplied by the bin width sum to one); otherwise the CDF won't end at unity as it should. Here is a minimal working example:

    import numpy as np
    import matplotlib.pyplot as plt

    # Create some test data
    dx = 0.01
    X = np.arange(-2, 2, dx)
    Y = np.exp(-X ** 2)

    # Normalize the data to a proper PDF
    Y /= (dx * Y).sum()

    # Compute the CDF
    CY = np.cumsum(Y * dx)

    # Plot both
    plt.plot(X, Y)
    plt.plot(X, CY, 'r--')

    plt.show()

[Image: plot of the PDF Y and, as a red dashed line, its CDF CY]
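The example above starts from a PDF sampled on a regular grid. When the input is a list of raw samples, as with the question's list d, one common alternative is to sort the values and plot each one against its rank fraction. The following is a minimal sketch under that assumption; the function names `empirical_cdf` and `plot_empirical_cdf` and the sample values are hypothetical, chosen only for illustration.

```python
import numpy as np


def empirical_cdf(samples):
    """Return (x, y): the sorted samples and their rank fractions (i + 1) / n."""
    x = np.sort(np.asarray(samples, dtype=float))
    y = np.arange(1, x.size + 1) / x.size
    return x, y


def plot_empirical_cdf(samples):
    """Draw the empirical CDF as a step plot (requires matplotlib)."""
    # Imported here so that empirical_cdf itself has no plotting dependency.
    import matplotlib.pyplot as plt

    x, y = empirical_cdf(samples)
    plt.step(x, y, where="post")
    plt.xlabel("value")
    plt.ylabel("cumulative fraction")
    plt.show()


# Hypothetical sample values in the style of the question's list d.
d = [0.0, 123.9877, 0.0, 9870.9876]
x, y = empirical_cdf(d)
```

Calling `plot_empirical_cdf(d)` would display the curve. No normalization step is needed here, because the rank fractions end at 1 by construction.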

Hooked answered Oct 05 '22 23:10