I have a pandas DataFrame like the one below:
UserId ProductId Quantity
1 1 6
1 4 1
1 7 3
2 4 2
3 2 7
3 1 2
Now I want to randomly select 20% of the rows of this DataFrame, using df.sample(n), and set the Quantity column of those rows to zero. I would also like to keep the indexes of the altered rows. So the resulting DataFrame would be:
UserId ProductId Quantity
1 1 6
1 4 1
1 7 3
2 4 0
3 2 7
3 1 0
I would also like to keep a list recording that rows 3 and 5 have been altered. How can I achieve that?
To change the index values, use the set_index method in pandas, which lets you specify which column(s) become the index. Its inplace parameter accepts True or False and controls whether the change is applied to the existing DataFrame: True modifies the DataFrame in place, while False (the default) leaves it untouched and returns a new DataFrame.
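As a rough illustration of the inplace behaviour described above, here is a minimal sketch. The choice of ProductId as the index column and the names indexed and df_copy are only assumptions for the example; the data is the frame from the question.

```python
import pandas as pd

# Data from the question
df = pd.DataFrame({
    "UserId": [1, 1, 1, 2, 3, 3],
    "ProductId": [1, 4, 7, 4, 2, 1],
    "Quantity": [6, 1, 3, 2, 7, 2],
})

# inplace=False (the default): df is left unchanged, a new DataFrame is returned
indexed = df.set_index("ProductId")

# inplace=True: the DataFrame itself is modified and nothing is returned
df_copy = df.copy()  # hypothetical copy so the original stays usable
df_copy.set_index("ProductId", inplace=True)
```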
By using update
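dfupdate = df.sample(2)  # pick 2 random rows
dfupdate['Quantity'] = 0  # zero out Quantity in the sampled rows
df.update(dfupdate)  # write the sampled rows back into df, aligned on the index
update_list = dfupdate.index.tolist()  # index labels of the altered rows, from cᴏʟᴅsᴘᴇᴇᴅ :)
df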
Out[44]:
UserId ProductId Quantity
0 1.0 1.0 6.0
1 1.0 4.0 0.0
2 1.0 7.0 3.0
3 2.0 4.0 0.0
4 3.0 2.0 7.0
5 3.0 1.0 2.0
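Note that the output above shows every column as float. A minimal sketch of one way to get the integer columns back, assuming the upcast comes from update aligning the sampled rows against the full index (this may depend on the pandas version, so treat it as a guess). The random_state argument and the original_dtypes name are my additions, only there to make the example repeatable.

```python
import pandas as pd

# Data from the question
df = pd.DataFrame({
    "UserId": [1, 1, 1, 2, 3, 3],
    "ProductId": [1, 4, 7, 4, 2, 1],
    "Quantity": [6, 1, 3, 2, 7, 2],
})

# Remember the column dtypes before update touches the frame
original_dtypes = df.dtypes.to_dict()

# Sample 2 rows (random_state is an assumption, used for repeatability)
dfupdate = df.sample(2, random_state=0).copy()
dfupdate["Quantity"] = 0

# Write the sampled rows back, then restore the original dtypes
df.update(dfupdate)
df = df.astype(original_dtypes)

# Index labels of the rows that were altered
update_list = dfupdate.index.tolist()
```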
Using loc to change the data:
change = df.sample(2).index
df.loc[change,'Quantity'] = 0
Output:
UserId ProductId Quantity
0 1 1 0
1 1 4 1
2 1 7 3
3 2 4 0
4 3 2 7
5 3 1 2
change.tolist() : [3, 0]
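The question asks for 20% of the rows rather than a fixed count. A minimal sketch of the loc approach using the frac parameter of sample instead of n; the random_state value and the name altered are assumptions added for the example, not part of the answers above.

```python
import pandas as pd

# Data from the question
df = pd.DataFrame({
    "UserId": [1, 1, 1, 2, 3, 3],
    "ProductId": [1, 4, 7, 4, 2, 1],
    "Quantity": [6, 1, 3, 2, 7, 2],
})
before = df.copy()

# Sample roughly 20% of the rows and keep only their index labels
change = df.sample(frac=0.2, random_state=42).index

# Zero out Quantity for the sampled rows, directly on df
df.loc[change, "Quantity"] = 0

# List of the altered row labels
altered = change.tolist()
print(df)
print(altered)
```

Since sample can only return a whole number of rows, 20% of six rows gets rounded. If you need exactly two rows altered, as in the expected output in the question, pass n=2 instead of frac=0.2.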