How to get a subset of a dataframe which only has elements which appear in the set more than once in R

Tags:

r

I have a set of data which I would like a subset of. I would like the subset defined as those rows with a value for variable X which appears more than once. Variable X is a string.

So, for example, if x consisted of ('help','me','me','with','this','this'), it would return the rows with the x values ('me','me','this','this').

Thank you so much for your help!

evt asked May 25 '11 02:05



1 Answer

Something like this should work:

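x <- c('help','me','me','with','this','this')
# duplicated(x) flags repeats after the first occurrence; fromLast=TRUE flags
# repeats before the last one. OR-ing them keeps every copy of a repeated value.
x[duplicated(x, fromLast=TRUE) | duplicated(x)]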
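
Since the question asks for rows of a data frame rather than elements of a vector, the same logical index could be used for row subsetting. This is only a sketch: the data frame `df`, its `id` column, and the sample values are hypothetical, and the column name `X` is taken from the wording of the question.

```r
# Hypothetical data frame; only the column name X comes from the question.
df <- data.frame(
  id = 1:6,
  X  = c("help", "me", "me", "with", "this", "this"),
  stringsAsFactors = FALSE
)

# TRUE for every row whose X value occurs more than once:
# duplicated() marks repeats after the first occurrence, and
# fromLast = TRUE marks repeats before the last occurrence.
is_repeated <- duplicated(df$X) | duplicated(df$X, fromLast = TRUE)

# Use the logical vector as a row index; the empty column slot keeps all columns.
repeated_rows <- df[is_repeated, ]
print(repeated_rows)
```

With this sample data, the rows kept are the ones whose X is 'me' or 'this'.
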
Joshua Ulrich answered Nov 16 '22 23:11