Right now I have a DataFrame like this:
Word Word2 Word3
Hello NaN NaN
My My Name NaN
Yellow Yellow Bee Yellow Bee Hive
Golden Golden Gates NaN
Yellow NaN NaN
What I was hoping for was to remove all of the NaN cells from my data frame, so that the remaining values in each column shift up. In the end it would look like this, where 'Yellow Bee Hive' has moved to row 1 (similar to what happens when you delete cells from a column in Excel):
Word Word2 Word3
1 Hello My Name Yellow Bee Hive
2 My Yellow Bee
3 Yellow Golden Gates
4 Golden
5 Yellow
Unfortunately, neither of these works, because they delete the entire row:
df = df[pd.notnull(df[['Word', 'Word2', 'Word3']]).all(axis=1)]
or
df = df.dropna()
Anyone have any suggestions? Should I reindex the table?
I think you can apply dropna to each column separately and rebuild the column from the values that remain:
df = df.apply(lambda x: pd.Series(x.dropna().values))
For example:
import pandas as pd
import numpy as np
df = pd.DataFrame({
    'Word': ['Hello', 'My', 'Yellow', 'Golden', 'Yellow'],
    'Word2': [np.nan, 'My Name', 'Yellow Bee', 'Golden Gates', np.nan],
    'Word3': [np.nan, np.nan, 'Yellow Bee Hive', np.nan, np.nan]
})
print(df)
Initial dataframe:
Word Word2 Word3
0 Hello NaN NaN
1 My My Name NaN
2 Yellow Yellow Bee Yellow Bee Hive
3 Golden Golden Gates NaN
4 Yellow NaN NaN
and applying this lambda function:
df = df.apply(lambda x: pd.Series(x.dropna().values))
print(df)
gives:
Word Word2 Word3
0 Hello My Name Yellow Bee Hive
1 My Yellow Bee NaN
2 Yellow Golden Gates NaN
3 Golden NaN NaN
4 Yellow NaN NaN
Then you can fill NaN values with empty strings:
df = df.fillna('')
print(df)
Word Word2 Word3
0 Hello My Name Yellow Bee Hive
1 My Yellow Bee
2 Yellow Golden Gates
3 Golden
4 Yellow
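To see why this works, here is a rough sketch that spells out the same steps one column at a time. The helper name shift_up is hypothetical (it is not part of pandas), and the final reindex is my own addition: it assumes you want to keep the original number of rows even if every column contains at least one NaN, in which case the apply version would return a shorter frame. Note that the original index labels are discarded either way.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Word': ['Hello', 'My', 'Yellow', 'Golden', 'Yellow'],
    'Word2': [np.nan, 'My Name', 'Yellow Bee', 'Golden Gates', np.nan],
    'Word3': [np.nan, np.nan, 'Yellow Bee Hive', np.nan, np.nan],
})


def shift_up(frame):
    """Hypothetical helper: push the non-NaN cells to the top of each column."""
    # dropna() removes the NaN cells of one column; .values throws away the
    # old index labels, so the surviving cells are renumbered from 0.
    compacted = {
        col: pd.Series(frame[col].dropna().values) for col in frame.columns
    }
    # Building a DataFrame from Series of different lengths aligns them on
    # the new index and pads the shorter columns with NaN at the bottom.
    out = pd.DataFrame(compacted)
    # Assumption: keep as many rows as the input had (0..n-1 index).
    return out.reindex(range(len(frame)))


result = shift_up(df).fillna('')
print(result)
```

For the example frame this should print the same table as above, since the Word column has no NaN and already spans all five rows.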