I find the result a little bit random. Sometimes it's a copy, sometimes it's a view. For example:
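    import pandas as pd

    df = pd.DataFrame([{'name': 'Marry', 'age': 21}, {'name': 'John', 'age': 24}],
                      index=['student1', 'student2'],
                      columns=['age', 'name'])  # column order pinned to match the output shown
    df

              age   name
    student1   21  Marry
    student2   24   John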
Now, let me try to modify it a little bit.
    df2 = df.loc['student1']
    df2[0] = 23
    df

              age   name
    student1   21  Marry
    student2   24   John
As you can see, nothing changed. df2 is a copy. However, if I add another student into the dataframe...
    df.loc['student3'] = ['old', 'Tom']
    df

              age   name
    student1   21  Marry
    student2   24   John
    student3  old    Tom
Now try to change the age again:
    df3 = df.loc['student1']
    df3[0] = 33
    df

              age   name
    student1   33  Marry
    student2   24   John
    student3  old    Tom
Now df3 suddenly became a view. What is going on? I guess the value 'old' is the key?
loc[mask] returns a new DataFrame with a copy of the data from df. Then df.
@Qiyu with multiple dtypes yes.
iloc is an indexer on pandas objects (not a standalone Python function) that selects rows or columns by integer position. With it, any particular value can be retrieved from a row or column by its positional index.
The key concepts connected to the SettingWithCopyWarning are views and copies. Some operations in pandas (and numpy as well) return views of the original data, while others return copies.
You are starting with a DataFrame that has two columns with two different dtypes:
    df.dtypes
    Out:
    age      int64
    name    object
    dtype: object
Since different dtypes are stored in different numpy arrays under the hood, you have two different blocks for them:
    df.blocks
    Out:
    {'int64':           age
     student1   21
     student2   24,
     'object':            name
     student1  Marry
     student2   John}
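As far as I know, `df.blocks` is no longer available in recent pandas releases. A rough stand-in that uses only public API is to count the distinct dtypes. This is only a sketch: the helper name `has_single_dtype` is made up for illustration, and it assumes that columns sharing a dtype end up in one block, which pandas does not guarantee.

```python
import pandas as pd


def has_single_dtype(frame: pd.DataFrame) -> bool:
    """Hypothetical helper: True when every column shares one dtype."""
    return len(set(frame.dtypes)) == 1


mixed = pd.DataFrame(
    [{'name': 'Marry', 'age': 21}, {'name': 'John', 'age': 24}],
    index=['student1', 'student2'],
    columns=['age', 'name'],
)

# Explicit cast so that both columns are object dtype,
# similar to the frame after a string lands in the age column.
all_object = mixed.astype(object)

print(has_single_dtype(mixed))
print(has_single_dtype(all_object))
```

A frame for which the helper returns False has to pull a row from more than one underlying array, which is the case where a copy is unavoidable.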
If you attempt to slice the first row of this DataFrame, it has to get one value from each different block which makes it necessary to create a copy.
    df2.is_copy
    Out[40]: <weakref at 0x7fc4487a9228; to 'DataFrame' at 0x7fc4488f9dd8>
In the second attempt, you are changing the dtypes. Since 'old' cannot be stored in an integer array, pandas upcasts the age column to an object Series.
    df.loc['student3'] = ['old', 'Tom']
    df.dtypes
    Out:
    age     object
    name    object
    dtype: object
Now all data for this DataFrame is stored in a single block (and in a single numpy array):
    df.blocks
    Out:
    {'object':           age   name
     student1   21  Marry
     student2   24   John
     student3  old    Tom}
At this step, slicing the first row can be done on the numpy array without creating a copy, so it returns a view.
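The same distinction can be reproduced with plain NumPy arrays. The sketch below is an analogy rather than what pandas does internally: the variable names are made up, and the arrays only stand in for the blocks described above.

```python
import numpy as np

# Two separate arrays, standing in for the int64 block and the object block.
ages = np.array([21, 24])
names = np.array(['Marry', 'John'], dtype=object)

# A "row" across both has to be assembled into a new array, i.e. a copy.
row_copy = np.array([ages[0], names[0]], dtype=object)
row_copy[0] = 23
print(ages[0])  # still 21

# One object array holding everything, standing in for the single block.
single = np.array([[21, 'Marry'], [24, 'John'], ['old', 'Tom']], dtype=object)

# Basic indexing of one row returns a view onto the same memory.
row_view = single[0]
row_view[0] = 33
print(single[0, 0])  # now 33
print(np.shares_memory(row_view, single))
```

With two arrays there is no single piece of memory that a row could point into, so a new array is the only option. With one array, a row is just a window onto existing memory.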
    df3._is_view
    Out: True
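Because the outcome depends on how the data happens to be laid out internally (and, as far as I can tell, on the pandas version and whether Copy-on-Write is enabled), it seems safer not to rely on either behaviour. A minimal sketch of being explicit, using the same example data: write through `df.loc[row, column]` when the original should change, and call `.copy()` when it should not.

```python
import pandas as pd

df = pd.DataFrame(
    [{'name': 'Marry', 'age': 21}, {'name': 'John', 'age': 24}],
    index=['student1', 'student2'],
    columns=['age', 'name'],
)

# Intent: change the original. Assign through a single .loc call on df.
df.loc['student1', 'age'] = 23

# Intent: work on an independent row. Take an explicit copy first.
row = df.loc['student1'].copy()
row['age'] = 99

print(df.loc['student1', 'age'])  # 23, the change to the copy did not leak back
print(row['age'])                 # 99
```

Either way the code states what it means, so it does not matter whether the intermediate object would have been a view or a copy.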