Reshape in R without aggregation (for example MTurk response strings)

Ordinarily, I'd use a pretty basic long-to-wide reshape for this, but it seems to be dropping my aggregation variables. The setup: I ran a job on Mechanical Turk in triplicate. I want the answers from MTurk1, MTurk2, and MTurk3 to be their own variables in the data frame, uniquely identified by a field I input with the job, so that I can compare them against each other with a function later.

Current Format:

> head(mturk)
  AssignmentStatus     Input.id  Input.State  Answer.Q1thing
1         Approved       134231           NY         Myguess
2         Approved       134231           NY         Myguess
3         Approved       134231           NY        BadGuess
4        Submitted       134812           CA         Another
5         Approved       134812           CA         Another
6         Approved       134812           CA         Another

I'd like this to become

Input.id   Input.State Answer.Q1thing.1 Answer.Q1thing.2 Answer.Q1thing.3  AssignmentStatus.1 AssignmentStatus.2  AssignmentStatus.3
134231              NY          Myguess          Myguess         BadGuess          Approved             Approved            Approved
134812              CA          Another          Another          Another         Submitted             Approved            Approved

or ideally, if there's an argument that can rename the columns in the same operation:

Id               State          Answer1          Answer2          Answer3          Status1               Status2             Status3
134231              NY          Myguess          Myguess         BadGuess          Approved             Approved            Approved
134812              CA          Another          Another          Another         Submitted             Approved            Approved

dat <- reshape(mturk, timevar="Answer.Q1thing", idvar=c("Input.id", "Input.State"), direction="wide")

This seems to be failing because most long-to-wide reshape functions expect the variable that gets spread wide to be a categorical text field whose values become the new column names. In that sense this is not a standard long-to-wide reshape: I don't want variables named "Myguess", "BadGuess", and "Another"; I want generic "Answer.X" variables containing these values. I'm not trying to aggregate in any way, such as taking a mean or sum, just to list each value in a new place.

So, two directions for this question:

  1. Does this kind of operation have another name? Is this an unfold, unpivot, uncast or something?
  2. How to do this in R?
Mittenchops asked Dec 22 '25 13:12


1 Answer

If your data is in a data.table, this can be done as a one-liner:

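library(data.table)
mturk.dt <- as.data.table(mturk)

## One output row per (Id, State) group; the three answers and three
## statuses of each group are spread across columns V1..V6
res <- mturk.dt[, as.list(
         rbind(c(Answer.Q1thing, AssignmentStatus))
         )
        , by=list(Id=Input.id, State=Input.State)]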

Note that the by argument handles the name-changing too!


If you want to properly name the other columns, use setnames after the fact or, more dynamically, use setattr within the j argument, as follows:

After the Fact:

## Assuming 'res' is the reshaped data.table from above:
## Change the names of the six V1, V2.. columns 
setnames(res, paste0("V", 1:6), c(paste0("Answer", 1:3), paste0("Status", 1:3)))
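
For a version you can paste and run, here is a minimal sketch that rebuilds the six rows shown in the question and applies the one-liner followed by setnames. The literal data frame is an assumption reconstructed from the head(mturk) printout, and the approach assumes every Input.id has exactly three rows (otherwise the groups would yield different numbers of columns).

```r
library(data.table)

# Assumed reconstruction of the six rows printed in the question.
# stringsAsFactors = FALSE keeps the text columns as character vectors,
# so c() concatenates the strings rather than factor codes.
mturk <- data.frame(
  AssignmentStatus = c("Approved", "Approved", "Approved",
                       "Submitted", "Approved", "Approved"),
  Input.id         = c(134231, 134231, 134231, 134812, 134812, 134812),
  Input.State      = c("NY", "NY", "NY", "CA", "CA", "CA"),
  Answer.Q1thing   = c("Myguess", "Myguess", "BadGuess",
                       "Another", "Another", "Another"),
  stringsAsFactors = FALSE
)
mturk.dt <- as.data.table(mturk)

# One row per (Id, State); the unnamed list elements become V1..V6.
res <- mturk.dt[, as.list(rbind(c(Answer.Q1thing, AssignmentStatus))),
                by = list(Id = Input.id, State = Input.State)]

# Rename V1..V6 by reference.
setnames(res, paste0("V", 1:6),
         c(paste0("Answer", 1:3), paste0("Status", 1:3)))

print(res)
```

Because setnames works by reference, res itself is renamed and no reassignment is needed.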

Dynamically, in j=..

## Use `as.data.table` instead of `as.list`, to preserve new names
mturk.dt[, as.data.table(
         rbind(c(
              setattr(Answer.Q1thing,   "names", paste0("Answer", seq_along(Answer.Q1thing  )))
            , setattr(AssignmentStatus, "names", paste0("Status", seq_along(AssignmentStatus)))
            ))
         )
        , by=list(Id=Input.id, State=Input.State)]

       Id State Answer1 Answer2  Answer3   Status1  Status2  Status3
1: 134231    NY Myguess Myguess BadGuess  Approved Approved Approved
2: 134812    CA Another Another  Another Submitted Approved Approved
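
The base reshape() call from the question can probably be made to work too, provided timevar is a column that numbers the responses within each id instead of the answer text itself. The following is a minimal sketch under that assumption, not part of the data.table approach above. The data frame is reconstructed from the question's printout, and the helper column name `rep` is made up for illustration.

```r
# Assumed reconstruction of the six rows printed in the question.
mturk <- data.frame(
  AssignmentStatus = c("Approved", "Approved", "Approved",
                       "Submitted", "Approved", "Approved"),
  Input.id         = c(134231, 134231, 134231, 134812, 134812, 134812),
  Input.State      = c("NY", "NY", "NY", "CA", "CA", "CA"),
  Answer.Q1thing   = c("Myguess", "Myguess", "BadGuess",
                       "Another", "Another", "Another"),
  stringsAsFactors = FALSE
)

# Hypothetical helper column: number the responses 1, 2, 3 within each Input.id.
mturk$rep <- ave(seq_along(mturk$Input.id), mturk$Input.id, FUN = seq_along)

# With 'rep' as timevar, every remaining column is spread into
# <name>.1, <name>.2, <name>.3 without any aggregation.
wide <- reshape(mturk,
                idvar     = c("Input.id", "Input.State"),
                timevar   = "rep",
                direction = "wide")

print(wide)
```

The resulting columns follow the Answer.Q1thing.1 / AssignmentStatus.1 naming pattern from the question's first desired layout; they can be renamed afterwards with names(wide) <- ... if the shorter names are preferred.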
Ricardo Saporta answered Dec 24 '25 02:12


