
Updating a pandas DataFrame row with a dictionary

Tags: python, pandas

I've found a behavior in pandas DataFrames that I don't understand.

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randint(1, 10, (3, 3)), index=['one', 'one', 'two'], columns=['col1', 'col2', 'col3'])
new_data = pd.Series({'col1': 'new', 'col2': 'new', 'col3': 'new'})
df.iloc[0] = new_data
# resulting df looks like:

#       col1    col2    col3
#one    new     new     new
#one    9       6       1
#two    8       3       7

But if I try to add a dictionary instead, I get this:

new_data = {'col1': 'new', 'col2': 'new', 'col3': 'new'}
df.iloc[0] = new_data
# resulting df looks like:
#         col1  col2    col3
#one      col2  col3    col1
#one      2     1       7
#two      5     8       6

Why is this happening? While writing up this question, I realized that df.iloc is most likely taking only the keys from new_data, which also explains why the values are out of order. But why is that the case? If I create a DataFrame from a dictionary, the keys are treated as columns:

pd.DataFrame([new_data])

#    col1   col2    col3
#0  new     new     new

Why is that not the default behavior in df.iloc?

J Jones asked Jul 14 '16 19:07




1 Answer

It's the difference between how a dictionary iterates and how a pandas series is treated.

A pandas Series matches its index to the DataFrame's columns when it is assigned to a row, and to the DataFrame's index when it is assigned to a column. Pandas then assigns each value to the slot whose label matches.
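As a rough illustration of that label matching, here is a minimal sketch. The small frame, its labels `a` and `b`, and the numeric values are made up for the example, and `.loc` is used for the row assignment because it aligns on labels.

```python
import pandas as pd

# Hypothetical frame with unique row labels, used only for illustration.
df = pd.DataFrame(
    [[1, 2, 3], [4, 5, 6]],
    index=["a", "b"],
    columns=["col1", "col2", "col3"],
)

# Row assignment: the Series index is matched against the column labels,
# so the order in which the Series was built does not matter.
df.loc["a"] = pd.Series({"col3": 30, "col1": 10, "col2": 20})

# Column assignment: the Series index is matched against the row labels.
df["col1"] = pd.Series({"b": 400, "a": 100})

print(df)
```

In both assignments the values land by label rather than by position.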

When an object is not a pandas object with an index to match against, pandas iterates through the object instead. A dictionary iterates through its keys, which is why you see the dictionary keys in that row's slots. Before Python 3.7, dictionaries did not guarantee any key order, which is why the keys appear shuffled in that row.
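The sketch below shows the iteration point in plain Python, plus one way to sidestep it by pulling the values out in column order before assigning. The numeric values and the name `keys_seen` are assumptions made for the example. How pandas treats a dict assigned directly to a row may differ between versions, so the sketch never relies on that.

```python
import pandas as pd

# Keys deliberately listed out of column order.
new_data = {"col3": 30, "col1": 10, "col2": 20}

# Iterating a dict yields its keys, not its values.
keys_seen = list(new_data)

df = pd.DataFrame(
    [[1, 2, 3], [4, 5, 6], [7, 8, 9]],
    index=["one", "one", "two"],
    columns=["col1", "col2", "col3"],
)

# Build a plain list of values in the frame's column order, then assign
# it positionally to the first row only.
df.iloc[0] = [new_data[col] for col in df.columns]

print(keys_seen)
print(df)
```

Wrapping the dictionary in `pd.Series(new_data)` and assigning with `.loc` on a unique row label is another option, since a Series carries the labels pandas needs for alignment.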

piRSquared answered Oct 27 '22 00:10