Sometimes I end up with a series of tuples/lists when using Pandas. This is common when, for example, doing a group-by and passing a function that has multiple return values:
import numpy as np from scipy import stats df = pd.DataFrame(dict(x=np.random.randn(100), y=np.repeat(list("abcd"), 25))) out = df.groupby("y").x.apply(stats.ttest_1samp, 0) print out y a (1.3066417476, 0.203717485506) b (0.0801133382517, 0.936811414675) c (1.55784329113, 0.132360504653) d (0.267999459642, 0.790989680709) dtype: object
What is the correct way to "unpack" this structure so that I get a DataFrame with two columns?
A related question is how I can unpack either this structure or the resulting dataframe into two Series/array objects. This almost works:
t, p = zip(*out)
but it t
is
(array(1.3066417475999257), array(0.08011333825171714), array(1.557843291126335), array(0.267999459641651))
and one needs to take the extra step of squeezing it.
To split a column of tuples in a Python Pandas data frame, we can use the column's tolist method. We create the df data frame with the pd. DataFrame class and a dictionary. Then we create a new data frame from df by using df['b'].
split() Pandas provide a method to split string around a passed separator/delimiter. After that, the string can be stored as a list in a series or it can also be used to create multiple column data frames from a single separated string.
Accessing Element from Series with Position In order to access the series element refers to the index number. Use the index operator [ ] to access an element in a series. The index must be an integer. In order to access multiple elements from a series, we use Slice operation.
maybe this is most strightforward (most pythonic i guess):
out.apply(pd.Series)
if you would want to rename the columns to something more meaningful, than:
out.columns=['Kstats','Pvalue']
if you do not want the default name for the index:
out.index.name=None
maybe:
>>> pd.DataFrame(out.tolist(), columns=['out-1','out-2'], index=out.index) out-1 out-2 y a -1.9153853424536496 0.067433 b 1.277561889173181 0.213624 c 0.062021492729736116 0.951059 d 0.3036745009819999 0.763993 [4 rows x 2 columns]
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With