Remove All Duplicate Rows from a Pandas DataFrame
Set keep=False in drop_duplicates() to drop every row that has a duplicate, including its first occurrence. For example: df.drop_duplicates(keep=False)
The subset() function is not necessary. Just use data frame indexing:
studentdata[studentdata$Drink == 'water',]
Read the warning in ?subset:
This is a convenience function intended for use interactively. For programming it is better to use the standard subsetting functions like ‘[’, and in particular the non-standard evaluation of argument ‘subset’ can have unanticipated consequences.
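To make that warning concrete, here is a minimal sketch of one way the non-standard evaluation can bite. The sample rows in studentdata and the variable named Drink are made up for illustration; only the column name comes from the question.

```r
# Hypothetical sample data; only the Drink column name is from the question.
studentdata <- data.frame(
  Name  = c("Ann", "Bob", "Cat"),
  Drink = c("water", "tea", "water"),
  stringsAsFactors = FALSE
)

# A variable that happens to share its name with a column.
Drink <- "tea"

# Inside subset(), both sides of == resolve to the column,
# so the condition is TRUE for every row.
nse_result <- subset(studentdata, Drink == Drink)
nrow(nse_result)  # 3

# With standard indexing, the right-hand side is the variable "tea".
std_result <- studentdata[studentdata$Drink == Drink, ]
nrow(std_result)  # 1
```

This is mostly a concern when the filter lives inside a function, where the name on the right-hand side is less obvious than it is at the console.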
Try this:
subset(studentdata, Drink=='water')
That should do it.
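One practical difference from plain indexing is how missing values are treated: subset() counts an NA condition as FALSE, while logical indexing with [ returns an all-NA row for it. A small sketch, assuming a made-up studentdata with one missing Drink value:

```r
# Hypothetical sample data with a missing value in Drink.
studentdata <- data.frame(
  Name  = c("Ann", "Bob", "Cat", "Dan"),
  Drink = c("water", "tea", NA, "water"),
  stringsAsFactors = FALSE
)

# subset() drops the row whose condition evaluates to NA.
subset_result <- subset(studentdata, Drink == "water")
nrow(subset_result)  # 2

# Logical indexing keeps an extra row filled with NA for it.
index_result <- studentdata[studentdata$Drink == "water", ]
nrow(index_result)  # 3

# Wrapping the condition in which() drops the NA case with plain indexing.
which_result <- studentdata[which(studentdata$Drink == "water"), ]
nrow(which_result)  # 2
```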
Thought I'd update this with a dplyr solution:
library(dplyr)
filter(studentdata, Drink == "water")