I need to calculate Cohen's d to determine the effect size of an experiment. Is there an implementation in a well-established library that I could use? If not, what would be a good implementation?


How to calculate Cohen's d in Python?

3 Answers

The implementations in the other answers, which simply average the two sample variances, are correct only in the special case where the two groups have equal size. A more general solution, based on the formulas found on Wikipedia and in Robert Coe's article, is the 2nd method shown below. Be aware that the denominator is the pooled standard deviation, which is generally only appropriate if the population standard deviation is equal for both groups:

from numpy import std, mean, sqrt

# Correct if the population S.D. is expected to be equal for the two groups.
def cohen_d(x, y):
    nx = len(x)
    ny = len(y)
    dof = nx + ny - 2
    return (mean(x) - mean(y)) / sqrt(((nx - 1) * std(x, ddof=1) ** 2 + (ny - 1) * std(y, ddof=1) ** 2) / dof)

# Dummy data
x = [2, 4, 7, 3, 7, 35, 8, 9]
y = [i * 2 for i in x]
# Extra element so that the two group sizes are not equal.
x.append(10)

# 1st method: correct only if nx == ny
d = (mean(x) - mean(y)) / sqrt((std(x, ddof=1) ** 2 + std(y, ddof=1) ** 2) / 2.0)
print("d by the 1st method = " + str(d))
if len(x) != len(y):
    print("The first method is incorrect because nx is not equal to ny.")

# 2nd method: correct for the more general case, including nx != ny
print("d by the more general 2nd method = " + str(cohen_d(x, y)))

Output will be:

d by the 1st method = -0.559662109472
The first method is incorrect because nx is not equal to ny.
d by the more general 2nd method = -0.572015604666
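If numpy is not available, the same pooled-standard-deviation formula can be sketched with only the standard library `statistics` module (Python 3.4+). The function name `cohen_d_pooled` and the group-size check are illustrative assumptions, not part of the answer above.

```python
from math import sqrt
from statistics import mean, variance


def cohen_d_pooled(x, y):
    """Cohen's d with a pooled standard deviation (assumes equal population SDs)."""
    nx, ny = len(x), len(y)
    if nx < 2 or ny < 2:
        # variance() needs at least two observations per group
        raise ValueError("each group needs at least two observations")
    # variance() is the sample variance, i.e. it divides by n - 1
    pooled_var = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    return (mean(x) - mean(y)) / sqrt(pooled_var)


# Same dummy data as above, with unequal group sizes
x = [2, 4, 7, 3, 7, 35, 8, 9, 10]
y = [4, 8, 14, 6, 14, 70, 16, 18]
print(cohen_d_pooled(x, y))  # should agree with the 2nd method up to floating-point rounding
```

When both groups have the same size, the weights cancel and this reduces to the simple average of the two variances used by the 1st method.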

answered Oct 10 '22 12:10

skynaut

Since Python 3.4, you can use the statistics module to compute the mean and the sample standard deviation. With those, Cohen's d can be calculated easily:

from statistics import mean, stdev
from math import sqrt

# test conditions
c0 = [2, 4, 7, 3, 7, 35, 8, 9]
c1 = [i * 2 for i in c0]

cohens_d = (mean(c0) - mean(c1)) / (sqrt((stdev(c0) ** 2 + stdev(c1) ** 2) / 2))

print(cohens_d)

Output:

-0.5567679522645598

With |d| ≈ 0.56, we observe a medium effect by Cohen's conventional thresholds (0.2 small, 0.5 medium, 0.8 large).
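As a rough sketch of how such a label could be derived in code, the helper below maps |d| to Cohen's conventional rule-of-thumb cutoffs (0.2, 0.5, 0.8). The function name `describe_effect` and the "negligible" label for values below 0.2 are assumptions made for illustration; the cutoffs are conventions, not hard boundaries.

```python
def describe_effect(d):
    """Label |d| using Cohen's conventional rule-of-thumb cutoffs."""
    size = abs(d)  # the sign only indicates the direction of the difference
    if size < 0.2:
        return "negligible"  # label chosen for this sketch
    if size < 0.5:
        return "small"
    if size < 0.8:
        return "medium"
    return "large"


print(describe_effect(-0.5567679522645598))  # medium
```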

answered Oct 10 '22 13:10

Bengt

In Python 2.7, you can use numpy, with a couple of caveats that I discovered while adapting Bengt's answer from Python 3.4.

1. Ensure division always returns a float with: from __future__ import division
2. Pass ddof=1 to the std function, i.e. numpy.std(c0, ddof=1). By default, numpy's standard deviation divides by n (the population formula), whereas with ddof=1 it divides by n-1 (the sample formula).
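The effect of the second caveat can be illustrated without numpy. The sketch below (Python 3, standard library only) uses a hypothetical helper, `std_with_ddof`, that divides by n - ddof in the same way numpy.std does, and compares it with `statistics.pstdev` (divisor n) and `statistics.stdev` (divisor n-1).

```python
from math import sqrt
from statistics import pstdev, stdev


def std_with_ddof(values, ddof=0):
    """Standard deviation with divisor n - ddof (hypothetical helper mirroring numpy's convention)."""
    n = len(values)
    m = sum(values) / n
    return sqrt(sum((v - m) ** 2 for v in values) / (n - ddof))


c0 = [2, 4, 7, 3, 7, 35, 8, 9]
# ddof=0: divisor n, like numpy.std(c0); matches the population formula
print(std_with_ddof(c0), pstdev(c0))
# ddof=1: divisor n - 1, like numpy.std(c0, ddof=1); matches the sample formula
print(std_with_ddof(c0, ddof=1), stdev(c0))
```

Because the sample formula has the smaller divisor, it always gives the larger value, so omitting ddof=1 would shrink the denominator of Cohen's d and inflate its magnitude.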

Code

from __future__ import division  # Ensure division returns float
from numpy import mean, std  # version >= 1.7.1 && <= 1.9.1
from math import sqrt


def cohen_d(x, y):
    # Averages the two sample variances, so it assumes equal group sizes
    return (mean(x) - mean(y)) / sqrt((std(x, ddof=1) ** 2 + std(y, ddof=1) ** 2) / 2.0)

if __name__ == "__main__":
    # test conditions
    c0 = [2, 4, 7, 3, 7, 35, 8, 9]
    c1 = [i * 2 for i in c0]
    print(cohen_d(c0, c1))

Output will then be:

-0.556767952265

answered Oct 10 '22 13:10

pds
