Replacing empty values in a DataFrame with value of a column

Tags:

python

pandas

Say I have the following pandas dataframe:

import numpy as np
import pandas as pd

df = pd.DataFrame([[3, 2, np.nan, 0],
                   [5, 4, 2, np.nan],
                   [7, np.nan, np.nan, 5],
                   [9, 3, np.nan, 4]],
                  columns=list('ABCD'))

which looks like this:

   A    B    C    D
0  3  2.0  NaN  0.0
1  5  4.0  2.0  NaN
2  7  NaN  NaN  5.0
3  9  3.0  NaN  4.0

Wherever a np.nan is found, I'd like it to be replaced by the value from column A in the same row. The result should look like this:

   A    B    C    D
0  3  2.0  3.0  0.0
1  5  4.0  2.0  5.0
2  7  7.0  7.0  5.0
3  9  3.0  9.0  4.0

I've tried multiple things, but I could not get anything to work. Can anyone help?

asked Nov 02 '18 14:11

user498537

3 Answers

A double transpose is needed here:

cols = ['B', 'C', 'D']
df[cols] = df[cols].T.fillna(df['A']).T
print(df)
   A    B    C    D
0  3  2.0  3.0  0.0
1  5  4.0  2.0  5.0
2  7  7.0  7.0  5.0
3  9  3.0  9.0  4.0

The transpose is needed because calling fillna with a Series and axis=1 raises an error:

df[cols] = df[cols].fillna(df['A'], axis=1)
print(df)

NotImplementedError: Currently only can fill with dict/Series column by column
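
To see why the transpose helps, here is a small sketch that assumes the df from the question. When given a Series, fillna matches the Series index against the column labels of the frame. Transposing turns the row labels 0..3 into column labels, so they line up with the index of df['A']. The names transposed and filled are only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([[3, 2, np.nan, 0],
                   [5, 4, 2, np.nan],
                   [7, np.nan, np.nan, 5],
                   [9, 3, np.nan, 4]],
                  columns=list('ABCD'))
cols = ['B', 'C', 'D']

# After transposing, the original row labels 0..3 become the column labels.
transposed = df[cols].T

# fillna matches the index of df['A'] (0..3) against those column labels,
# so each transposed column (an original row) is filled with that row's A value.
filled = transposed.fillna(df['A']).T
print(filled)
```

The second transpose simply restores the original orientation.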

Another solution with numpy.where and broadcasting column A:

df = pd.DataFrame(np.where(df.isnull(), df['A'].values[:, None], df),
                  index=df.index,
                  columns=df.columns)
print(df)
     A    B    C    D
0  3.0  2.0  3.0  0.0
1  5.0  4.0  2.0  5.0
2  7.0  7.0  7.0  5.0
3  9.0  3.0  9.0  4.0

Thank you @pir for another solution:

df = pd.DataFrame(np.where(df.isnull(), df[['A']], df), 
                  index=df.index, 
                  columns=df.columns)
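
The broadcasting step can be spelled out as below. This is a sketch assuming the df from the question. The names a_col, filled and result are illustrative, and casting A back to its original dtype is an optional extra that is not part of the answer above.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([[3, 2, np.nan, 0],
                   [5, 4, 2, np.nan],
                   [7, np.nan, np.nan, 5],
                   [9, 3, np.nan, 4]],
                  columns=list('ABCD'))

# Column A as a (4, 1) array; NumPy broadcasts it across the 4 columns.
a_col = df['A'].to_numpy()[:, None]

# Take the A value where a cell is missing, otherwise keep the original value.
filled = np.where(df.isna(), a_col, df)

result = pd.DataFrame(filled, index=df.index, columns=df.columns)

# np.where returns a single float array, so A became float; cast it back.
result['A'] = result['A'].astype(df['A'].dtype)
print(result)
```

Both df['A'].values[:, None] and df[['A']] produce the same (4, 1) shape, which is why either form works inside np.where.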

answered Oct 01 '22 19:10

jezrael

Currently, fillna doesn't allow for broadcasting a series across columns while aligning the indices.

`pandas.DataFrame.mask`

This behaves exactly the way we'd want fillna to: it finds the nulls and fills them with df.A along axis=0.

df.mask(df.isna(), df.A, axis=0)

   A    B    C    D
0  3  2.0  3.0  0.0
1  5  4.0  2.0  5.0
2  7  7.0  7.0  5.0
3  9  3.0  9.0  4.0
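
As a quick check of how mask behaves, the sketch below (assuming the df from the question) compares it with the mirror-image call DataFrame.where, which keeps values where the condition is True instead of replacing them. The names masked and kept are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([[3, 2, np.nan, 0],
                   [5, 4, 2, np.nan],
                   [7, np.nan, np.nan, 5],
                   [9, 3, np.nan, 4]],
                  columns=list('ABCD'))

# mask: replace cells where the condition is True (the cell is missing).
masked = df.mask(df.isna(), df['A'], axis=0)

# where: keep cells where the condition is True (the cell is present).
kept = df.where(df.notna(), df['A'], axis=0)

# axis=0 aligns df['A'] with the row index, so each row uses its own A value.
print(masked.equals(kept))
```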

`pandas.DataFrame.fillna` using a dictionary

However, you can pass a dictionary to fillna that tells it what to do for each column.

df.fillna({k: df.A for k in df})

   A    B    C    D
0  3  2.0  3.0  0.0
1  5  4.0  2.0  5.0
2  7  7.0  7.0  5.0
3  9  3.0  9.0  4.0
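
If only some columns should be filled, the dictionary can be limited to those columns. This is a sketch assuming the df from the question; fill_cols is a hypothetical name for the columns to fill.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([[3, 2, np.nan, 0],
                   [5, 4, 2, np.nan],
                   [7, np.nan, np.nan, 5],
                   [9, 3, np.nan, 4]],
                  columns=list('ABCD'))

# Only the listed columns get a fill rule; other columns are left untouched.
fill_cols = ['B', 'C', 'D']

# Each column is filled from the Series df['A'], aligned on the row index.
result = df.fillna({col: df['A'] for col in fill_cols})
print(result)
```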

answered Oct 01 '22 19:10

piRSquared

Use fillna with reindex:

df.fillna(df[['A']].reindex(columns=df.columns).ffill(axis=1))
Out[20]: 
   A    B    C    D
0  3  2.0  3.0  0.0
1  5  4.0  2.0  5.0
2  7  7.0  7.0  5.0
3  9  3.0  9.0  4.0

Or use combine_first:

df.combine_first(df.fillna(0).add(df.A, axis=0))
Out[35]: 
   A    B    C    D
0  3  2.0  3.0  0.0
1  5  4.0  2.0  5.0
2  7  7.0  7.0  5.0
3  9  3.0  9.0  4.0
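
The combine_first one-liner is easier to follow when split into steps. This is a sketch assuming the df from the question; candidate and result are illustrative names.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([[3, 2, np.nan, 0],
                   [5, 4, 2, np.nan],
                   [7, np.nan, np.nan, 5],
                   [9, 3, np.nan, 4]],
                  columns=list('ABCD'))

# Replace NaN with 0, then add A row-wise: former NaN cells become 0 + A = A.
# Cells that already had a value become value + A, but those are never used.
candidate = df.fillna(0).add(df['A'], axis=0)

# combine_first keeps df's own value where present and falls back to candidate
# only where df is NaN.
result = df.combine_first(candidate)
print(result)
```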

answered Oct 01 '22 17:10

BENY
