I have a large pandas DataFrame with many rows and columns of binary data stored as strings like '0|1', '0|0', '1|1', '1|0', which I would like to either split into two DataFrames and/or expand row-wise (both forms are useful to me), so that this:
        a    b    c    d
rowa   1|0  0|1  0|1  1|0
rowb   0|1  0|0  0|0  0|1
rowc   0|1  1|0  1|0  0|1
becomes
        a  b  c  d
rowa1   1  0  0  1
rowa2   0  1  1  0
rowb1   0  0  0  0
rowb2   1  0  0  1
rowc1   0  1  1  0
rowc2   1  0  0  1
and/or
df1:    a  b  c  d
rowa    1  0  0  1
rowb    0  0  0  0
rowc    0  1  1  0

df2:    a  b  c  d
rowa    0  1  1  0
rowb    1  0  0  1
rowc    1  0  0  1
Currently I'm trying something like the following, but I believe it is not very efficient; any guidance would be helpful.
from collections import defaultdict

Atmp_dict = defaultdict(list)
Btmp_dict = defaultdict(list)
for index, row in df.iterrows():
    for columnname in df.columns:
        left, right = row[columnname].split('|')
        Atmp_dict[columnname].append(left)
        Btmp_dict[columnname].append(right)
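For reference, the example DataFrame above can be reconstructed like this (row and column labels taken from the question's tables):

```python
import pandas as pd

# Rebuild the example frame of '0|1'-style strings from the question
df = pd.DataFrame(
    {"a": ["1|0", "0|1", "0|1"],
     "b": ["0|1", "0|0", "1|0"],
     "c": ["0|1", "0|0", "1|0"],
     "d": ["1|0", "0|1", "0|1"]},
    index=["rowa", "rowb", "rowc"],
)
```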
user2734178 is close, but his or her answer has some issues. Here is a slight variation that works:
import pandas as pd

df1 = pd.DataFrame()
df2 = pd.DataFrame()

# df is your original DataFrame
for col in df.columns:
    df1[col] = df[col].apply(lambda x: x.split('|')[0])
    df2[col] = df[col].apply(lambda x: x.split('|')[1])
Here is another option that is slightly more elegant. Replace the loop with:
for col in df.columns:
    df1[col] = df[col].str.extract(r"(\d)\|")
    df2[col] = df[col].str.extract(r"\|(\d)")
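A related variant (a sketch, not from the answers above) uses `str.split` with `expand=True`, which splits each column once instead of running two regex extractions:

```python
import pandas as pd

df = pd.DataFrame({"a": ["1|0", "0|1"], "b": ["0|1", "0|0"]},
                  index=["rowa", "rowb"])

df1 = pd.DataFrame(index=df.index)
df2 = pd.DataFrame(index=df.index)
for col in df.columns:
    # expand=True returns a two-column frame: left and right of '|'
    parts = df[col].str.split("|", expand=True)
    df1[col] = parts[0]
    df2[col] = parts[1]
```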
This is pretty compact, but it seems like there should be an even easier and more compact way.
df1 = df.applymap(lambda x: str(x)[0])
df2 = df.applymap(lambda x: str(x)[2])
Or loop over the columns as in the other answers. I don't think it matters. Note that because the question specified binary data, it is OK (and simpler) to just do str[0] and str[2] rather than using split or extract.
Or you could do this, which seems almost silly, but there's nothing actually wrong with it and it is fairly compact.
df1 = df.stack().str[0].unstack()
df2 = df.stack().str[2].unstack()
stack just converts the DataFrame to a Series so you can use str, and then unstack converts it back to a DataFrame.
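The question also asked for the expanded, interleaved form (rowa1, rowa2, rowb1, ...). One way to build it from the two split frames, shown here as a sketch, is to suffix each index and concatenate:

```python
import pandas as pd

df = pd.DataFrame({"a": ["1|0", "0|1"], "b": ["0|1", "0|0"]},
                  index=["rowa", "rowb"])

# Split into the left and right halves as above
df1 = df.stack().str[0].unstack()
df2 = df.stack().str[2].unstack()

# Suffix the row labels and interleave by sorting on the index
expanded = pd.concat([df1.set_axis(df1.index + "1"),
                      df2.set_axis(df2.index + "2")]).sort_index()
```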