How to format a dataframe having many NaN values, join all rows to those not starting with NaN

Question

I have the following df:

import numpy as np
import pandas as pd

df = pd.DataFrame({
    'col1': [1, np.nan, np.nan, np.nan, 5, np.nan, np.nan, np.nan],
    'col2': [np.nan, 2, np.nan, np.nan, np.nan, 6, np.nan, np.nan],
    'col3': [np.nan, np.nan, 3, np.nan, np.nan, np.nan, 7, np.nan],
    'col4': [np.nan, np.nan, np.nan, 4, np.nan, np.nan, np.nan, 8]
    })

It has the following display:

   col1  col2  col3  col4
0   1.0   NaN   NaN   NaN
1   NaN   2.0   NaN   NaN
2   NaN   NaN   3.0   NaN
3   NaN   NaN   NaN   4.0
4   5.0   NaN   NaN   NaN
5   NaN   6.0   NaN   NaN
6   NaN   NaN   7.0   NaN
7   NaN   NaN   NaN   8.0

My goal is to keep every row whose col1 holds a float (not NaN) and merge into it the values from the rows that follow, up to the next such row.

The new_df I want to get is:

   col1  col2  col3  col4
0     1     2     3     4
4     5     6     7     8

Any help from your side will be highly appreciated (I upvote all answers).

Thank you!

jezrael · Accepted Answer

If you need to take the first non-missing value of each column per group, where a new group starts at every non-missing value in df['col1'], use:

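# The cumulative count of non-missing col1 values labels each block of rows.
# first() then keeps the first non-NaN value of every column in each block.
df = (df.reset_index()
        .groupby(df['col1'].notna().cumsum())
        .first()
        .set_index('index'))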
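
To make the grouping step easier to follow, here is a minimal self-contained sketch of the same idea. It assumes the sample values from the question's display (1 to 8); the names group_id and new_df are only illustrative and are not part of the answer above.

```python
import numpy as np
import pandas as pd

# Sample frame, assumed to match the display in the question.
df = pd.DataFrame({
    'col1': [1, np.nan, np.nan, np.nan, 5, np.nan, np.nan, np.nan],
    'col2': [np.nan, 2, np.nan, np.nan, np.nan, 6, np.nan, np.nan],
    'col3': [np.nan, np.nan, 3, np.nan, np.nan, np.nan, 7, np.nan],
    'col4': [np.nan, np.nan, np.nan, 4, np.nan, np.nan, np.nan, 8],
})

# Every non-NaN value in col1 starts a new block; the running count of
# non-missing values gives each row the id of the block it belongs to.
group_id = df['col1'].notna().cumsum()
print(group_id.tolist())  # [1, 1, 1, 1, 2, 2, 2, 2]

# reset_index() keeps the original labels in an 'index' column so they can
# be restored after first() has picked the first non-NaN value per column.
new_df = (df.reset_index()
            .groupby(group_id)
            .first()
            .set_index('index'))
print(new_df)
```

The result keeps the original labels 0 and 4. The values stay floats because the source columns contained NaN; chaining .astype(int) at the end should give the integer display asked for in the question.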

Scott Boston · Answer

Try this:

df.apply(lambda x: x.dropna().to_numpy())

Output:

   col1  col2  col3  col4
0   1.0   2.0   3.0   4.0
1   5.0   6.0   7.0   8.0

You can also cast to integers:

df.apply(lambda x: x.dropna().to_numpy(dtype='int'))

Output:

   col1  col2  col3  col4
0     1     2     3     4
1     5     6     7     8
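
Two things are worth noting about this approach: it compacts each column independently, so it relies on every column holding the same number of non-NaN values, and it replaces the original labels (0, 4) with 0, 1. The sketch below is one possible way to guard the first point and restore the labels. The helper collapse_columns and its key parameter are hypothetical names introduced for illustration, and the sample data is assumed to match the question's display.

```python
import numpy as np
import pandas as pd

# Sample frame, assumed to match the display in the question.
df = pd.DataFrame({
    'col1': [1, np.nan, np.nan, np.nan, 5, np.nan, np.nan, np.nan],
    'col2': [np.nan, 2, np.nan, np.nan, np.nan, 6, np.nan, np.nan],
    'col3': [np.nan, np.nan, 3, np.nan, np.nan, np.nan, 7, np.nan],
    'col4': [np.nan, np.nan, np.nan, 4, np.nan, np.nan, np.nan, 8],
})


def collapse_columns(frame, key='col1'):
    """Hypothetical helper: drop NaN per column and stack what is left.

    Raises ValueError when the columns hold different numbers of
    non-NaN values, because the compacted columns would not line up.
    """
    if frame.count().nunique() != 1:
        raise ValueError('columns have different numbers of non-NaN values')
    out = frame.apply(lambda x: x.dropna().to_numpy())
    # Reuse the labels of the rows where the key column is not NaN.
    out.index = frame.index[frame[key].notna()]
    return out


result = collapse_columns(df)
print(result)
```

With the sample data, result carries the labels 0 and 4, as in the output the question asks for.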

Tags: python, pandas, dataframe

Khaled DELLAL
