 

R: index into dataframe by multiple column values

I'm a beginner to R and am having trouble indexing into a dataframe using a vector of column values.

I want to select all the rows from 2 participants.

Here data is the data frame and participant is one of its columns.

data[data$participant == c(8, 10),]

I thought this should give me all the rows from both participants 8 and 10, but instead it is giving me half of the rows from participant 8 and half from participant 10. In other words,

dim(data[data$participant == c(8, 10),]) is the same as dim(data[data$participant == 8,]) or dim(data[data$participant == 10,]) rather than double.

The problem seems to be with the syntax for matching multiple column values: data$participant == c(8, 10)

I'd be grateful for any tips on how to do this (without doing each participant separately)! Thank you!

maia-sh asked Sep 10 '25 14:09



1 Answer

For multiple values, use %in% to get a logical vector.

data[data$participant %in% c(8, 10),]

When we use == with c(8, 10), R recycles the 8 and 10 (i.e. 8, 10, 8, 10, 8, 10, ...) to the length of the 'participant' column and compares element by element. So, if the 1st value in participant is 8, it will return TRUE, but if the 2nd value is also 8, it will return FALSE, because the corresponding recycled element is 10. That is why only about half of the matching rows come back.
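To make the recycling visible, here is a small sketch on made-up data. Only the column name 'participant' comes from the question; the values and the 'score' column are assumptions for illustration.

```r
# Hypothetical toy data: 'participant' follows the question,
# the values and the 'score' column are invented for this example.
data <- data.frame(
  participant = c(8, 8, 10, 10, 9, 8),
  score       = c(1, 2, 3, 4, 5, 6)
)

# == recycles c(8, 10) to 8, 10, 8, 10, 8, 10 and compares position by position
eq_mask <- data$participant == c(8, 10)
# TRUE FALSE FALSE TRUE FALSE FALSE

# %in% tests each participant value for membership in the set {8, 10}
in_mask <- data$participant %in% c(8, 10)
# TRUE TRUE TRUE TRUE FALSE TRUE

rows_eq <- data[eq_mask, ]  # 2 rows: only those that happen to line up
rows_in <- data[in_mask, ]  # 5 rows: every row for participants 8 and 10
```

Here the column length (6) is a multiple of 2, so == recycles silently. When the length is not a multiple of the shorter vector, R warns that the longer object length is not a multiple of the shorter object length, which is a useful hint that %in% was probably intended.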

akrun answered Sep 12 '25 03:09
