Perform 2 sample t-test

Tags: python, numpy, statistics

I have the mean, std dev and n of sample 1 and sample 2. The samples are taken from the same population, but measured by different labs.

n is different for sample 1 and sample 2. I want to do a weighted (take n into account) two-tailed t-test.

I tried using the scipy.stats module by creating my numbers with np.random.normal, since it only takes data and not stat values like mean and std dev (is there any way to use these values directly?). But it didn't work, since the data arrays have to be of equal size.

Any help on how to get the p-value would be highly appreciated.

330

asked Mar 24 '14 13:03

Norfeldt

1 Answer

If you have the original data as arrays a and b, you can use scipy.stats.ttest_ind with the argument equal_var=False. This performs Welch's t-test, and the two arrays do not have to be the same size:

t, p = ttest_ind(a, b, equal_var=False)

If you have only the summary statistics of the two data sets, you can calculate the t value using scipy.stats.ttest_ind_from_stats (added to scipy in version 0.16) or from the formula (http://en.wikipedia.org/wiki/Welch%27s_t_test).
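For reference, the formula mentioned above can be written out as follows. This is a sketch in notation chosen here for illustration (the symbols are assumptions, not taken from scipy): \bar{x}_a, s_a^2 and n_a are the mean, sample variance (computed with n - 1 in the denominator) and size of the first sample, and likewise for the second; F_\nu is the CDF of Student's t distribution with \nu degrees of freedom.

```latex
t = \frac{\bar{x}_a - \bar{x}_b}{\sqrt{\dfrac{s_a^2}{n_a} + \dfrac{s_b^2}{n_b}}},
\qquad
\nu = \frac{\left(\dfrac{s_a^2}{n_a} + \dfrac{s_b^2}{n_b}\right)^{2}}
           {\dfrac{s_a^4}{n_a^2\,(n_a - 1)} + \dfrac{s_b^4}{n_b^2\,(n_b - 1)}},
\qquad
p = 2\,F_\nu\!\left(-\lvert t \rvert\right)
```

The degrees of freedom \nu (the Welch-Satterthwaite approximation) are generally not an integer, which is fine for evaluating the t distribution.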

The following script shows the possibilities.

from __future__ import print_function

import numpy as np
from scipy.stats import ttest_ind, ttest_ind_from_stats
from scipy.special import stdtr

np.random.seed(1)

# Create sample data.
a = np.random.randn(40)
b = 4*np.random.randn(50)

# Use scipy.stats.ttest_ind.
t, p = ttest_ind(a, b, equal_var=False)
print("ttest_ind:            t = %g  p = %g" % (t, p))

# Compute the descriptive statistics of a and b.
abar = a.mean()
avar = a.var(ddof=1)
na = a.size
adof = na - 1

bbar = b.mean()
bvar = b.var(ddof=1)
nb = b.size
bdof = nb - 1

# Use scipy.stats.ttest_ind_from_stats.
t2, p2 = ttest_ind_from_stats(abar, np.sqrt(avar), na,
                              bbar, np.sqrt(bvar), nb,
                              equal_var=False)
print("ttest_ind_from_stats: t = %g  p = %g" % (t2, p2))

# Use the formulas directly.
tf = (abar - bbar) / np.sqrt(avar/na + bvar/nb)
dof = (avar/na + bvar/nb)**2 / (avar**2/(na**2*adof) + bvar**2/(nb**2*bdof))
pf = 2*stdtr(dof, -np.abs(tf))

print("formula:              t = %g  p = %g" % (tf, pf))

The output:

ttest_ind:            t = -1.5827  p = 0.118873
ttest_ind_from_stats: t = -1.5827  p = 0.118873
formula:              t = -1.5827  p = 0.118873
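For the situation in the question, where only the mean, std dev and n from each lab are known, no random data needs to be generated at all. A minimal sketch, assuming the reported standard deviations are sample standard deviations (ddof=1); the helper name welch_p_value and the numbers passed to it are made up for illustration:

```python
from scipy.stats import ttest_ind_from_stats


def welch_p_value(mean1, std1, n1, mean2, std2, n2):
    """Two-tailed Welch's t-test from summary statistics.

    std1 and std2 are assumed to be sample standard deviations (ddof=1).
    Returns the t statistic and the two-tailed p-value.
    """
    t, p = ttest_ind_from_stats(mean1, std1, n1,
                                mean2, std2, n2,
                                equal_var=False)
    return t, p


# Hypothetical summary statistics from two labs (not real data).
t, p = welch_p_value(10.2, 1.5, 12, 9.1, 2.0, 20)
print("t = %g  p = %g" % (t, p))
```

The different sample sizes enter through the n1 and n2 arguments, so the samples do not have to be the same size.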

130

answered Sep 21 '22 22:09

Warren Weckesser
