I would like to create a dichotomous variable that tells me whether a participant gave the same response to each of 10 questions. Each row is a participant and I want to write a simple script to create this new variable/vector in my data frame. For example, if my data looks like the first 6 columns, then I'm trying to create the 7th one.
ID Item1 Item2 Item3 Item4 Item5 | AllSame
1 5 5 5 5 5 | Yes
2 1 3 3 3 2 | No
3 2 2 2 2 2 | Yes
4 5 4 5 5 5 | No
5 5 2 3 5 5 | No
I've seen solutions on this site that compare one column to another, for example here, with ifelse(data$item1==data$item2,1,ifelse(data$item1==data$item3,0,NA)), but I have 10 columns in my actual dataset and I figure there's got to be a better way than checking all 10 against each other. I could also create a variable that counts how many responses equal 1, and then test whether that count is the same as the number of columns, but with 7 possible responses in the data this once again looks very unwieldy, and I'm hoping someone has a better solution. Thank you!
You can use the duplicated function for this: if sum(!duplicated(x[,1])) == 1 returns TRUE, the column contains all identical values. The same test works on the values of a single row.
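Since the question is about rows rather than columns, here is a minimal sketch of how the duplicated idea might be applied row by row. The data frame name dat and the helper vectors item_cols and n_distinct are my own choices for illustration, and the data is copied from the table in the question.

```r
# Toy data mirroring the question's example (ID column plus five items).
dat <- data.frame(ID    = 1:5,
                  Item1 = c(5, 1, 2, 5, 5),
                  Item2 = c(5, 3, 2, 4, 2),
                  Item3 = c(5, 3, 2, 5, 3),
                  Item4 = c(5, 3, 2, 5, 5),
                  Item5 = c(5, 2, 2, 5, 5))

# Names of the item columns (assumed naming scheme: Item1 ... Item5).
item_cols <- paste0("Item", 1:5)

# For each row, count the values that are not duplicates of an earlier value.
n_distinct <- apply(dat[item_cols], 1, function(r) sum(!duplicated(r)))

# Exactly one distinct value means every answer in the row was the same.
dat$AllSame <- ifelse(n_distinct == 1, "Yes", "No")
```

With 10 items you would only need to change item_cols; the rest stays the same.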
The columns of a data frame in R can be accessed using: Single brackets [] with one index, which return a data frame containing just that column. Double brackets [[]] (or $), which return the column itself as a vector.
Creating a new variable, or transforming an existing variable into a new one, is usually a simple task in R. The common pattern is an assignment of the form data$newvariable <- data$oldvariable. New variables are added as columns, so the data frame grows horizontally.
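A small sketch of column access and of adding a new variable, in case it helps. The data frame df and the column name Same12 are made up for illustration.

```r
# Small illustrative data frame (names are hypothetical).
df <- data.frame(Item1 = c(5, 1, 2), Item2 = c(5, 3, 2))

one_col_df <- df["Item1"]     # single brackets: a data frame with one column
vec        <- df[["Item1"]]   # double brackets: the column as a plain vector

# Assigning to a name that does not exist yet adds a new column on the right.
df$Same12 <- df$Item1 == df$Item2
```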
There are many ways of doing this, but here is one:
mydf <- data.frame(Item1 = c(5,1,2,5,5),
Item2 = c(5,3,2,4,2),
Item3 = c(5,3,2,5,3),
Item4 = c(5,3,2,5,5),
Item5 = c(5,3,2,5,5) )
mydf$AllSame <- rowMeans(mydf[,1:5] == mydf[,1]) == 1
which leads to
> mydf
Item1 Item2 Item3 Item4 Item5 AllSame
1 5 5 5 5 5 TRUE
2 1 3 3 3 3 FALSE
3 2 2 2 2 2 TRUE
4 5 4 5 5 5 FALSE
5 5 2 3 5 5 FALSE
And if you really must have "Yes" and "No" then use instead something like
mydf$AllSame <- ifelse(rowMeans(mydf[,1:5] == mydf[,1]) == 1, "Yes", "No")
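If the data frame also holds an ID column, as in the question, the same rowMeans idea can be restricted to the item columns. This is only a sketch: the name dat, the helper items, and the assumption that the item columns all start with "Item" are mine.

```r
# Data from the question's table, including the ID column.
dat <- data.frame(ID    = 1:5,
                  Item1 = c(5, 1, 2, 5, 5),
                  Item2 = c(5, 3, 2, 4, 2),
                  Item3 = c(5, 3, 2, 5, 3),
                  Item4 = c(5, 3, 2, 5, 5),
                  Item5 = c(5, 2, 2, 5, 5))

# Keep only the columns whose names start with "Item" (assumed naming scheme).
items <- dat[, grep("^Item", names(dat))]

# Compare every item column with the first one; a row mean of 1 means
# every comparison in that row was TRUE.
dat$AllSame <- rowMeans(items == items[, 1]) == 1
```

This way the number of items never appears in the code, so it should carry over unchanged to 10 columns.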
Henry has posted a short and fast working solution that has already been accepted. I still wanted to add this alternative, which in my opinion has a slight advantage in readability:
mydf <- data.frame(Item1 = c(5,1,2,5,5),
Item2 = c(5,3,2,4,2),
Item3 = c(5,3,2,5,3),
Item4 = c(5,3,2,5,5),
Item5 = c(5,3,2,5,5) )
mydf$AllSame <- apply(mydf, 1, function(row) all(row==row[1]))
The all() function used here has an na.rm argument, which can easily be set to TRUE if you want NAs to be ignored.
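One caveat worth a quick sketch: if the first value in a row is NA, every comparison with row[1] is NA, and all(..., na.rm = TRUE) then returns TRUE regardless of the other values. The variant called robust below is my own suggestion rather than part of the answer above; it drops the NAs first and then counts distinct values.

```r
# Toy data with missing values (made up for illustration).
mydf <- data.frame(Item1 = c(5, 1, NA),
                   Item2 = c(5, NA, 2),
                   Item3 = c(5, 1, 3))

# Default behaviour: any NA in the comparison makes the result NA.
strict <- apply(mydf, 1, function(row) all(row == row[1]))

# na.rm = TRUE ignores NA comparisons; note that row 3 comes out TRUE
# even though 2 and 3 differ, because row[1] is NA.
lenient <- apply(mydf, 1, function(row) all(row == row[1], na.rm = TRUE))

# Alternative (my assumption of what is wanted): drop NAs, then require
# at most one distinct remaining value.
robust <- apply(mydf, 1, function(row) length(unique(row[!is.na(row)])) <= 1)
```

Here strict gives TRUE, NA, NA; lenient gives TRUE, TRUE, TRUE; and robust gives TRUE, TRUE, FALSE.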