I would like to set a value in some column for the first n rows of a pandas DataFrame.
>>> import pandas as pd
>>> example = pd.DataFrame({'number':range(10),'name':list('aaabbbcccc')},index=range(20,0,-2)) # nontrivial index
>>> example
name number
20 a 0
18 a 1
16 a 2
14 b 3
12 b 4
10 b 5
8 c 6
6 c 7
4 c 8
2 c 9
I would like to set "number" to 19 for the first, say, 5 rows. What I really want is to set the lowest values of "number" to that value, so I just sort first. If my index were the default one (0, 1, 2, ...), I could do
example.loc[:5-1,'number'] = 19 # -1 for inclusive indexing
# or, on older pandas (.ix was removed in pandas 1.0)
example.ix[:5-1,'number'] = 19
But since it's not, the slice is interpreted as index labels, and every row up to and including the label 4 gets selected:
>>> example
name number
20 a 19
18 a 19
16 a 19
14 b 19
12 b 19
10 b 19
8 c 19
6 c 19
4 c 19
2 c 9
Using .iloc[] would be nice, except that it doesn't accept column names.
example.iloc[:5]['number'] = 19
works but gives a SettingWithCopyWarning.
My current solution is to do:
>>> example.sort_values('number',inplace=True)
>>> example.reset_index(drop=True,inplace=True)
>>> example.loc[:5-1,'number'] = 19
>>> example
name number
0 a 19
1 a 19
2 a 19
3 b 19
4 b 19
5 b 5
6 c 6
7 c 7
8 c 8
9 c 9
Since I need to repeat this for several columns, I have to sort and reset the index each time, which also costs me my original index (but never mind that).
Does anyone have a better solution?
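I would use .iloc, since .loc can give unexpected results when the index contains repeated labels. .iloc needs the column's integer position, which you can look up with get_loc:
example.iloc[:5, example.columns.get_loc('number')] = 19
If the index labels are unique, you can also stay with .loc and pass it the first five labels:
example.loc[example.index[:5], 'number'] = 19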
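To see the difference concretely, here is a minimal sketch that runs both assignments on the frame from the question and then on a small frame with a repeated index label. The names `by_position`, `by_label`, `dupes`, `via_iloc`, `via_loc` and `by_value` are made up for illustration. The last part is my own assumption about what "set the lowest values" could look like without sorting or resetting the index: it uses `Series.nsmallest` and relies on the index labels being unique.

```python
import pandas as pd

# Rebuild the example frame from the question (descending, non-default index).
example = pd.DataFrame(
    {"number": range(10), "name": list("aaabbbcccc")},
    index=range(20, 0, -2),
)

# Positional assignment: first 5 rows, column position looked up by name.
by_position = example.copy()
by_position.iloc[:5, by_position.columns.get_loc("number")] = 19

# Label-based assignment: pass the first 5 index labels to .loc.
by_label = example.copy()
by_label.loc[by_label.index[:5], "number"] = 19

# With unique labels, both approaches change exactly the same rows.
assert by_position.equals(by_label)

# Illustrative frame with a repeated label ("a" appears at positions 0 and 2).
dupes = pd.DataFrame({"number": range(4)}, index=["a", "b", "a", "c"])

# .iloc touches only the rows at positions 0 and 1.
via_iloc = dupes.copy()
via_iloc.iloc[:2, via_iloc.columns.get_loc("number")] = 19

# .loc matches every row labelled "a" or "b", so position 2 is changed too.
via_loc = dupes.copy()
via_loc.loc[via_loc.index[:2], "number"] = 19

# Assumption: to hit the 5 lowest values without sorting the frame,
# take their labels from nsmallest (only safe when labels are unique).
by_value = example.iloc[::-1].copy()  # reversed, so the lowest values come last
lowest_labels = by_value["number"].nsmallest(5).index
by_value.loc[lowest_labels, "number"] = 19
```

The original index survives in every case, so the same assignment can be repeated for other columns without calling reset_index in between.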