
What does the `overwrite` parameter in Pandas DataFrame.update() function do?

I read this documentation, but I do not understand what the overwrite option actually does to the update process. I tested a few cases, and in each one setting overwrite to True or False made no difference. Can someone give an example where it does make a difference?

Zhang18 asked Feb 08 '18 23:02



1 Answer

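The difference is that when overwrite is set to False, update only fills in values that are missing (NaN) in the DataFrame it was called on and leaves existing non-NaN values alone. With the default overwrite=True, every value that has a non-NaN counterpart in the other DataFrame is replaced.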

Based on the example from the link you supplied (using the default value overwrite=True):

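import pandas as pd

# Column B of df has one missing value (None becomes NaN)
df = pd.DataFrame({'A': [1, 2, 3], 'B': [400, None, 600]})
new_df = pd.DataFrame({'B': [4, 5, 6], 'C': [7, 8, 9]})
df.update(new_df)  # modifies df in place and returns None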

yields:

   A    B
0  1  4.0
1  2  5.0
2  3  6.0

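whereas calling df.update(new_df, overwrite=False) on a freshly created df (update modifies df in place, so the original must be rebuilt first) yields: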

   A      B
0  1  400.0
1  2    5.0
2  3  600.0
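
The comparison above can be reproduced side by side with a minimal sketch like the one below. The helper make_df and the frame other_with_gaps are hypothetical names introduced only for this illustration. The last case is one possible reason for seeing no difference between the two settings: NaN values in the other DataFrame are never copied, so when the other frame is non-NaN only where the original is NaN, both settings give the same result.

```python
import pandas as pd


def make_df():
    # Hypothetical helper: rebuilds the original frame, because update()
    # modifies its caller in place.
    return pd.DataFrame({'A': [1, 2, 3], 'B': [400, None, 600]})


new_df = pd.DataFrame({'B': [4, 5, 6], 'C': [7, 8, 9]})

# Default overwrite=True: every B value is replaced by new_df's B.
df_overwrite = make_df()
df_overwrite.update(new_df)
print(df_overwrite)  # B is 4.0, 5.0, 6.0

# overwrite=False: only the NaN in row 1 of B is filled in.
df_fill = make_df()
df_fill.update(new_df, overwrite=False)
print(df_fill)  # B is 400.0, 5.0, 600.0

# Hypothetical frame with NaN exactly where the original already has values.
other_with_gaps = pd.DataFrame({'B': [None, 5, None]})

# Even with overwrite=True, NaN in the other frame does not replace 400 and 600,
# so this matches the overwrite=False result.
df_gaps = make_df()
df_gaps.update(other_with_gaps)
print(df_gaps)  # B is 400.0, 5.0, 600.0
```

In all three cases column C is not added, since update only touches columns that already exist in the DataFrame it is called on.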

Joshua R. answered Sep 29 '22 15:09