I have a data set and would like to take a subset of it. The subset should contain those rows whose value for variable X appears more than once. Variable X is a string.
So, for example, if x consisted of ('help','me','me','with','this','this'), it would return the rows with the x values ('me','me','this','this').
Thank you so much for your help!
The most general way to subset a data frame by rows and/or columns is the base R Extract function, [ ], written with matched square brackets instead of the usual matched parentheses. For a data frame named d the general format is d[rows, columns].
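As a rough illustration of the d[rows, columns] form, here is a minimal sketch. The data frame d and its columns id, x and y are made up for this example and are not from the question.

```r
# Hypothetical data frame, used only for illustration
d <- data.frame(id = 1:4,
                x  = c("a", "b", "a", "c"),
                y  = c(10, 20, 30, 40))

# Rows 1 and 3, all columns (empty column slot means "keep all")
rows_1_3 <- d[c(1, 3), ]

# All rows, only columns x and y (empty row slot means "keep all")
cols_xy <- d[, c("x", "y")]

# Rows selected by a logical condition, columns selected by name
big_y <- d[d$y > 15, c("id", "y")]
```

Leaving either side of the comma empty keeps everything along that dimension.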
How do you subset a data frame by column value or by column name in R? Using base R df[] notation or subset(), you can easily subset a data frame (data.frame) by column value or by column name.
To specify multiple variables, separate adjacent variables by a comma, and enclose the list within the standard R combine function, c(). A single variable may be replaced by a range of consecutive variables indicated by a colon, which separates the first and last variables of the range.
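A small sketch of how this might look with base R subset(), whose select argument accepts both c() lists and colon ranges of column names. The data frame d and its column names are hypothetical, chosen only for the example.

```r
# Hypothetical data frame, used only for illustration
d <- data.frame(id = 1:4,
                x  = c("a", "b", "a", "c"),
                y  = c(10, 20, 30, 40),
                z  = c(TRUE, FALSE, TRUE, FALSE))

# Subset by column value: rows where x equals "a"
by_value <- subset(d, x == "a")

# Subset by column name: several variables combined with c()
by_name <- subset(d, select = c(id, y))

# A range of consecutive variables, first and last separated by a colon
by_range <- subset(d, select = x:z)
```

Note that the colon range by name works inside subset(select = ...); with plain square brackets you would use column positions instead, e.g. d[, 2:4].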
Something like this should work:
x <- c('help','me','me','with','this','this')
x[duplicated(x, fromLast=TRUE) | duplicated(x)]
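The snippet above works on a plain vector; since the question asks for rows of a data set, the same logical index can be applied to a data frame. This is only a sketch: the data frame d and the extra column value are assumptions made up for the example.

```r
# Hypothetical data frame; column x holds the strings from the question
d <- data.frame(x     = c("help", "me", "me", "with", "this", "this"),
                value = 1:6,
                stringsAsFactors = FALSE)

# TRUE for every row whose x value occurs more than once.
# duplicated() alone misses the first occurrence, so it is combined
# with a second pass scanning from the end (fromLast = TRUE).
keep <- duplicated(d$x) | duplicated(d$x, fromLast = TRUE)
repeated <- d[keep, ]

# One possible alternative: count each value and keep those seen more than once
counts <- table(d$x)
repeated2 <- d[d$x %in% names(counts[counts > 1]), ]
```

Both approaches should return the rows with x equal to 'me', 'me', 'this', 'this', with all other columns carried along.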