I have the following dataframe df:
print(df)
Food Taste
0 Apple NaN
1 Banana NaN
2 Candy NaN
3 Milk NaN
4 Bread NaN
5 Strawberry NaN
I am trying to replace values in a range of rows using iloc:
df.Taste.iloc[0:2] = 'good'
df.Taste.iloc[2:6] = 'bad'
But it returned the following SettingWithCopyWarning message:
SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame
So, I found this Stackoverflow page and tried this:
df.iloc[0:2, 'Taste'] = 'good'
df.iloc[2:6, 'Taste'] = 'bad'
Unfortunately, it returned the following error:
ValueError: Can only index by location with a [integer, integer slice (START point is INCLUDED, END point is EXCLUDED), listlike of integers, boolean array]
What would be the proper way to use iloc in this situation? Also, is there a way to combine these two lines above?
Using the iloc indexer in pandas, we can retrieve any value from a row or column by its integer position. iloc takes up to two indexers, i.e. row position(s) and column position(s). Only position-based values (an integer, an integer slice, a list of integers, or a boolean array) can be passed to iloc.
When it comes to selecting rows and columns of a pandas DataFrame, loc and iloc are two commonly used functions. Here is the subtle difference between the two functions: loc selects rows and columns with specific labels. iloc selects rows and columns at specific integer positions.
df.iloc[:, 2] selects the third column (position 2, since counting starts at 0), but df.iloc[:, :2], or explicitly df.iloc[:, 0:2], selects the columns up to but excluding position 2, i.e. the first two columns. This is the same as Python's slicing.
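To make the difference between a single position, a position slice, and a label slice concrete, here is a small sketch. The frame `demo` and its column and index names are made up for illustration and are not part of the question.

```python
import pandas as pd

# Hypothetical three-column frame, used only for illustration.
demo = pd.DataFrame(
    {"a": [1, 2, 3], "b": [4, 5, 6], "c": [7, 8, 9]},
    index=["x", "y", "z"],
)

third_col = demo.iloc[:, 2]             # single position 2 -> column "c"
first_two = demo.iloc[:, :2]            # positions 0 and 1 -> columns "a", "b"
by_label = demo.loc["x":"y", "a":"b"]   # label slices include both endpoints
by_position = demo.iloc[0:2, 0:2]       # position slices exclude the end point

print(third_col.name)                   # c
print(list(first_two.columns))          # ['a', 'b']
print(by_label.equals(by_position))     # True
```

Note that the label slice "x":"y" and the position slice 0:2 pick the same two rows here, because loc includes the end label while iloc excludes the end position.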
iloc[] is a position-based indexer for selecting rows and/or columns in pandas. It accepts a single position, a list of positions, a slice of positions, and more. One of the main advantages of DataFrame is its ease of use.
You can use Index.get_loc to get the position of the column Taste, because DataFrame.iloc selects by position:
# returns 1: Taste is the second column, and Python counts from 0
print (df.columns.get_loc('Taste'))
1
df.iloc[0:2, df.columns.get_loc('Taste')] = 'good'
df.iloc[2:6, df.columns.get_loc('Taste')] = 'bad'
print (df)
Food Taste
0 Apple good
1 Banana good
2 Candy bad
3 Milk bad
4 Bread bad
5 Strawberry bad
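For anyone who wants to run this end to end, here is a self-contained sketch that rebuilds the frame from the question and applies the get_loc approach. Creating Taste with object dtype is my assumption (the question does not show the dtype); it avoids dtype warnings in newer pandas when strings are written into an all-NaN float column.

```python
import numpy as np
import pandas as pd

# Rebuild the frame from the question. Assumption: Taste is created as an
# object column so that strings can be assigned without dtype warnings.
df = pd.DataFrame({
    "Food": ["Apple", "Banana", "Candy", "Milk", "Bread", "Strawberry"],
    "Taste": pd.Series([np.nan] * 6, dtype="object"),
})

# Translate the column label into its integer position for iloc.
taste_pos = df.columns.get_loc("Taste")

df.iloc[0:2, taste_pos] = "good"   # rows at positions 0 and 1
df.iloc[2:6, taste_pos] = "bad"    # rows at positions 2 through 5

print(df)
```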
A possible solution with ix is not recommended, because ix is deprecated and was removed in pandas 1.0:
df.ix[0:2, 'Taste'] = 'good'
df.ix[2:6, 'Taste'] = 'bad'
print (df)
Food Taste
0 Apple good
1 Banana good
2 Candy bad
3 Milk bad
4 Bread bad
5 Strawberry bad
.iloc uses integer location, whereas .loc uses labels. Both also take row AND column identifiers (for DataFrames). Your initial code triggered the warning because df.Taste.iloc[...] is chained indexing: it selects the column first and then assigns into that selection, instead of specifying the column within the .iloc call. The second code you tried didn't work because you mixed an integer location with a column name, and .iloc only accepts integer locations. If you don't know the column's integer location, you can use Index.get_loc in place, as suggested above. Otherwise, use the integer position, in this case 1.
df.iloc[0:2, df.columns.get_loc('Taste')] = 'good'
df.iloc[2:6, df.columns.get_loc('Taste')] = 'bad'
is equal to:
df.iloc[0:2, 1] = 'good'
df.iloc[2:6, 1] = 'bad'
in this particular situation.
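On the second part of the question (combining the two lines), one possible sketch is to build the whole column in a single statement from the row positions. Using numpy.where for this is my suggestion, not something the answers above propose. The same block also shows a label-based variant with .loc, where the row positions are first converted to index labels.

```python
import numpy as np
import pandas as pd

foods = ["Apple", "Banana", "Candy", "Milk", "Bread", "Strawberry"]
df = pd.DataFrame({"Food": foods})

# One statement: choose a value per row based on its integer position.
positions = np.arange(len(df))
df["Taste"] = np.where(positions < 2, "good", "bad")

# Label-based variant: convert row positions to index labels for .loc.
alt = pd.DataFrame({"Food": foods})
alt["Taste"] = None                        # placeholder object column
alt.loc[alt.index[0:2], "Taste"] = "good"
alt.loc[alt.index[2:6], "Taste"] = "bad"

print(df["Taste"].tolist())
print(alt["Taste"].tolist())
```

The np.where version only fits when every row gets one of two values; for more groups, separate assignments like the ones above stay easier to read.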