I'm still learning how to use R, and I'm facing an issue that I haven't been able to find an answer for yet.
In my dataframe ("data"), there is one row for each participant and for each of that participant's trials on a given task. The columns contain different information about these participants. It looks a little bit like this:
Participant Age Sex Trial.Type correct
P01 26 0 test 1
P01 26 0 test 0
P01 26 0 control 1
P02 32 1 test 1
P02 32 1 control 1
P02 32 1 demographics NA
I would like to create a new dataframe df. In this dataframe, I would like to remove all the rows that do NOT contain the string "test" in the data$Trial.Type column.
I have seen that in order to remove all the rows that contain a specific string, I could use the following function:
df <- data[-grep("test", data$Trial.Type),]
This works great for removing all rows that contain the "test" string, but I would actually like to do the opposite: remove all rows except those with the "test" string (and do so more efficiently than running the function above once for each non-"test" string).
I hope I was clear enough and followed the rules; this is my first post on StackOverflow.
df <- data[grep("test", data$Trial.Type),]
grep returns the indices of every match of the pattern, in your case "test". When you use negative indices you effectively exclude the matches (see In R, what does a negative index do?), and using them as they come (i.e. as positive indices) is the same as returning only the matches and excluding everything else.
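As a minimal sketch of the above, the example data from the question can be rebuilt and filtered as follows. The names idx, df_logical and none_dropped are only illustrative and do not come from the original post; the grepl() variant is an alternative, not part of the answer itself.

```r
# Rebuild the example data from the question (values copied from the post).
data <- data.frame(
  Participant = c("P01", "P01", "P01", "P02", "P02", "P02"),
  Age         = c(26, 26, 26, 32, 32, 32),
  Sex         = c(0, 0, 0, 1, 1, 1),
  Trial.Type  = c("test", "test", "control", "test", "control", "demographics"),
  correct     = c(1, 0, 1, 1, 1, NA)
)

# grep() returns the positions of the rows whose Trial.Type matches "test".
idx <- grep("test", data$Trial.Type)   # 1 2 4

# Positive indices keep only the matching rows.
df <- data[idx, ]

# grepl() returns a logical vector instead and selects the same rows.
df_logical <- data[grepl("test", data$Trial.Type), ]

# Caveat with the negative-index form: if the pattern matches nothing,
# grep() returns integer(0), and data[-integer(0), ] has zero rows
# rather than all rows. ("xyz" is a made-up pattern that matches nothing.)
none_dropped <- data[-grep("xyz", data$Trial.Type), ]
nrow(none_dropped)                     # 0
```

Because of that caveat, the logical form (data[!grepl(...), ] for exclusion) may be the safer habit when a pattern is not guaranteed to match anything.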