Ordinarily, I'd use a pretty basic long-to-wide reshape for this, but it seems to be dropping my aggregation variables. The setup is that I ran a job on Mechanical Turk in triplicate. I want the answers from MTurk1, MTurk2, and MTurk3 to be their own variables in the data frame, uniquely identified by a field I input with the job, so that I can compare them against each other with a function later.
Current Format:
> head(mturk)
  AssignmentStatus Input.id Input.State Answer.Q1thing
1         Approved   134231          NY        Myguess
2         Approved   134231          NY        Myguess
3         Approved   134231          NY       BadGuess
4        Submitted   134812          CA        Another
5         Approved   134812          CA        Another
6         Approved   134812          CA        Another
I'd like this to become
Input.id Input.State Answer.Q1thing.1 Answer.Q1thing.2 Answer.Q1thing.3 AssignmentStatus.1 AssignmentStatus.2 AssignmentStatus.3
134231   NY          Myguess          Myguess          BadGuess         Approved           Approved           Approved
134812   CA          Another          Another          Another          Submitted          Approved           Approved
or, ideally, if there's a way to rename the columns in the same operation:
Id     State Answer1 Answer2 Answer3  Status1   Status2  Status3
134231 NY    Myguess Myguess BadGuess Approved  Approved Approved
134812 CA    Another Another Another  Submitted Approved Approved
dat <- reshape(mturk, timevar="Answer.Q1thing", idvar=c("Input.id", "Input.State"), direction="wide")
This seems to be failing because most long-to-wide reshape functions expect the variable that becomes wide to itself be a categorical text field. That is, this is not a standard long-to-wide reshape, because I don't want variables named "Myguess", "BadGuess", and "Another"; I want generic "Answer.X" variables containing these values. I'm not trying to aggregate in any way, such as with a mean or sum, just to list each value in a new place.
So, two directions for this question:
If your data is in a data.table, this can be done in a one-liner as follows:
library(data.table)
mturk.dt <- as.data.table(mturk)
## One row per (Id, State); answers and statuses become columns V1..V6
res <- mturk.dt[, as.list(
           rbind(c(Answer.Q1thing, AssignmentStatus))
         )
       , by=list(Id=Input.id, State=Input.State)]
res
Note that the by argument handles the name-changing too!
If you want to properly name the other columns, use setnames after the fact or, more dynamically, using setattr within the j=.. argument as follows:
## Assuming 'res' is the reshaped data.table from above:
## Change the names of the six V1, V2.. columns
setnames(res, paste0("V", 1:6), c(paste0("Answer", 1:3), paste0("Status", 1:3)))
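To try the setnames route end to end, here is a minimal sketch. The data.table() call rebuilds the six sample rows from the question by hand (my reconstruction, not the asker's actual file), and the approach assumes every id has exactly three rows so that the V1..V6 columns line up.

```r
library(data.table)

# Sample rows copied by hand from the question (an assumption, not the real data)
mturk.dt <- data.table(
  AssignmentStatus = c("Approved", "Approved", "Approved",
                       "Submitted", "Approved", "Approved"),
  Input.id         = rep(c(134231, 134812), each = 3),
  Input.State      = rep(c("NY", "CA"), each = 3),
  Answer.Q1thing   = c("Myguess", "Myguess", "BadGuess",
                       "Another", "Another", "Another")
)

# One row per group: three answers followed by three statuses (V1..V6)
res <- mturk.dt[, as.list(rbind(c(Answer.Q1thing, AssignmentStatus))),
                by = list(Id = Input.id, State = Input.State)]

# Rename the auto-generated V1..V6 columns
setnames(res, paste0("V", 1:6),
         c(paste0("Answer", 1:3), paste0("Status", 1:3)))
res
```

If a group had fewer or more than three rows, the per-group lists would have different lengths, so this only fits the strict triplicate setup described in the question.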
## Use `as.data.table` instead of `as.list`, to preserve new names
mturk.dt[, as.data.table(
      rbind(c(
          setattr(Answer.Q1thing,   "names", paste0("Answer", seq(Answer.Q1thing)))
        , setattr(AssignmentStatus, "names", paste0("Status", seq(AssignmentStatus)))
      ))
    )
  , by=list(Id=Input.id, State=Input.State)]
       Id State Answer1 Answer2  Answer3   Status1  Status2  Status3
1: 134231    NY Myguess Myguess BadGuess  Approved Approved Approved
2: 134812    CA Another Another  Another Submitted Approved Approved
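For comparison, here is a base R sketch of the same reshape. It rests on the assumption that what the original reshape() call was missing is a within-id counter to use as timevar, rather than the answer text itself. The `worker` column and the hand-built sample data are my additions for illustration, not part of the question.

```r
# Sample rows copied by hand from the question (an assumption, not the real data)
mturk <- data.frame(
  AssignmentStatus = c("Approved", "Approved", "Approved",
                       "Submitted", "Approved", "Approved"),
  Input.id         = rep(c(134231, 134812), each = 3),
  Input.State      = rep(c("NY", "CA"), each = 3),
  Answer.Q1thing   = c("Myguess", "Myguess", "BadGuess",
                       "Another", "Another", "Another"),
  stringsAsFactors = FALSE
)

# Hypothetical helper column: number the rows 1, 2, 3 within each Input.id
mturk$worker <- ave(seq_along(mturk$Input.id), mturk$Input.id, FUN = seq_along)

# The counter, not the answer text, is what gets spread across columns
wide <- reshape(mturk,
                idvar     = c("Input.id", "Input.State"),
                timevar   = "worker",
                direction = "wide")

# Shorten the generated names, e.g. Answer.Q1thing.1 -> Answer1
names(wide) <- sub("^Answer\\.Q1thing\\.", "Answer", names(wide))
names(wide) <- sub("^AssignmentStatus\\.", "Status", names(wide))
wide
```

The columns come out interleaved by worker (Status1, Answer1, Status2, ...) rather than grouped as in the desired layout, so reorder them by name afterwards if the order matters.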