Given a dataframe containing NaNs, how can it be transformed to remove all the NaN values from the columns?
import pandas as pd
import numpy as np
# dataframe from list of lists
list_of_lists = [[ 4., 7., 1., np.nan],
                 [np.nan, np.nan, 3., 3.],
                 [ 4., 9., np.nan, np.nan],
                 [np.nan, np.nan, 7., 9.],
                 [np.nan, 2., np.nan, 2.],
                 [4., np.nan, np.nan, np.nan]]
df_from_lists = pd.DataFrame(list_of_lists, columns=['A', 'B', 'C', 'D'])
# dataframe from list of dicts
list_of_dicts = [{'A': 4.0, 'B': 7.0, 'C': 1.0},
                 {'C': 3.0, 'D': 3.0},
                 {'A': 4.0, 'B': 9.0},
                 {'C': 7.0, 'D': 9.0},
                 {'B': 2.0, 'D': 2.0},
                 {'A': 4.0}]
df_from_dicts = pd.DataFrame(list_of_dicts)
Input:
     A    B    C    D
0  4.0  7.0  1.0  NaN
1  NaN  NaN  3.0  3.0
2  4.0  9.0  NaN  NaN
3  NaN  NaN  7.0  9.0
4  NaN  2.0  NaN  2.0
5  4.0  NaN  NaN  NaN
Desired output:
     A    B    C    D
0  4.0  7.0  1.0  3.0
1  4.0  9.0  3.0  9.0
2  4.0  2.0  7.0  2.0
You need apply with dropna. The only extra step is to convert each column's remaining values to a numpy array and wrap them in a new Series, which resets the index:
df.apply(lambda x: pd.Series(x.dropna().values))
Sample:
df = pd.DataFrame({'B':[4,np.nan,4,np.nan,np.nan,4],
                   'C':[7,np.nan,9,np.nan,2,np.nan],
                   'D':[1,3,np.nan,7,np.nan,np.nan],
                   'E':[np.nan,3,np.nan,9,2,np.nan]})
print(df)
     B    C    D    E
0  4.0  7.0  1.0  NaN
1  NaN  NaN  3.0  3.0
2  4.0  9.0  NaN  NaN
3  NaN  NaN  7.0  9.0
4  NaN  2.0  NaN  2.0
5  4.0  NaN  NaN  NaN
df1 = df.apply(lambda x: pd.Series(x.dropna().values))
print(df1)
     B    C    D    E
0  4.0  7.0  1.0  3.0
1  4.0  9.0  3.0  9.0
2  4.0  2.0  7.0  2.0
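The same idea can be wrapped in a small helper and checked against the dataframe from the question. This is only a sketch: the name compact_columns and the second, ragged dataframe are made up for illustration and do not come from the question. It uses to_numpy() in place of .values, which should behave the same here. When the columns keep different numbers of values, the shorter ones should come back padded with NaN at the bottom, since the new Series are aligned on their fresh 0..n-1 index.

```python
import numpy as np
import pandas as pd


def compact_columns(df):
    """Drop the NaNs of each column and renumber the remaining values from 0.

    Hypothetical helper name; it only wraps the apply/dropna approach above.
    """
    return df.apply(lambda col: pd.Series(col.dropna().to_numpy()))


# Dataframe from the question: every column has exactly three non-NaN values.
df = pd.DataFrame([[4., 7., 1., np.nan],
                   [np.nan, np.nan, 3., 3.],
                   [4., 9., np.nan, np.nan],
                   [np.nan, np.nan, 7., 9.],
                   [np.nan, 2., np.nan, 2.],
                   [4., np.nan, np.nan, np.nan]],
                  columns=['A', 'B', 'C', 'D'])
compact = compact_columns(df)
print(compact)

# Illustrative ragged case (not from the question): 'A' keeps two values,
# 'B' keeps only one, so 'B' is padded with a trailing NaN.
ragged = pd.DataFrame({'A': [1.0, np.nan, 2.0],
                       'B': [np.nan, np.nan, 5.0]})
ragged_compact = compact_columns(ragged)
print(ragged_compact)
```

If the columns are guaranteed to have the same number of non-NaN values, as in the question, the result has no NaN left at all. Otherwise a final dropna() on the rows would discard the padded rows, which may or may not be what you want.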