
Filtering a data frame by values in a column [duplicate]


The subset command is not necessary; plain data frame indexing does the job:

studentdata[studentdata$Drink == 'water',]

Also read the warning in ?subset:

This is a convenience function intended for use interactively. For programming it is better to use the standard subsetting functions like ‘[’, and in particular the non-standard evaluation of argument ‘subset’ can have unanticipated consequences.
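One practical difference between the two approaches is how they treat missing values. The sketch below uses a made-up studentdata (the original data is not shown in the question, so the Name column and the values are assumptions) to compare plain indexing, indexing through which(), and subset():

```r
# Hypothetical sample data; the real studentdata from the question is not shown.
studentdata <- data.frame(
  Name  = c("Ann", "Bob", "Cid", "Dee"),
  Drink = c("water", "cola", NA, "water"),
  stringsAsFactors = FALSE
)

# Logical indexing: comparing NA with "water" gives NA,
# and an NA index produces a row filled with NA.
by_index <- studentdata[studentdata$Drink == "water", ]

# which() keeps only the positions where the test is TRUE, so NA is dropped.
by_which <- studentdata[which(studentdata$Drink == "water"), ]

# subset() treats an NA condition as FALSE.
by_subset <- subset(studentdata, Drink == "water")

nrow(by_index)   # 3 (includes one all-NA row)
nrow(by_which)   # 2
nrow(by_subset)  # 2
```

If the column can contain NA, wrapping the condition in which() (or adding !is.na(studentdata$Drink)) keeps the indexing form while avoiding the extra NA row.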


Try this:

subset(studentdata, Drink=='water')

That should do it.


Thought I'd update this with a dplyr solution:

library(dplyr)
filter(studentdata, Drink == "water")