
Using R list for lookup

Tags:

r

I'm working with a large set of survey responses and trying to do a lookup (to get question titles into my dataframe) using values stored in a list. I fear I'm overlooking something simple with my indexing but I just can't make it happen. Here is a reproducible example:

survey.data <- data.frame(
  question.number = c("q2","q3","q4","q5")
)

titles <- list(q1="question1",
               q2="question2",
               q3="question3",
               q4="question4",
               q5="question5")

After a bit of data manipulation that involves removing some questions, I try to create a new question.title variable in my data frame using the following bit of list indexing to pull in the correct titles:

survey.data$question.title <- titles[survey.data$question.number]

which gives the output:

  question.number question.title
1              q2      question1
2              q3      question2
3              q4      question3
4              q5      question4

You can see that the new variable is not applying a lookup, just 'importing' all values of the list beginning at the first one.

I can't find any applicable help on doing this kind of lookup with a list, so perhaps it just isn't advisable? I'd be very grateful for a fix or an alternative.

peter_w asked Dec 07 '25 01:12


1 Answer

Here is one solution, but before sharing it, I've modified your data a little bit by adding a repeated question ("q2"):

survey.data <- data.frame(
  question.number = c("q2","q3","q4","q5", "q2")
)

titles <- list(q1="question1", 
               q2="question2", 
               q3="question3", 
               q4="question4", 
               q5="question5")

The solution uses match and unlist.

survey.data$question.title <- unlist(titles[match(survey.data$question.number, 
                                                  names(titles))])
survey.data
#   question.number question.title
# 1              q2      question2
# 2              q3      question3
# 3              q4      question4
# 4              q5      question5
# 5              q2      question2
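
A likely reason the original attempt returned the wrong titles, assuming `question.number` was stored as a factor (the `data.frame()` default before R 4.0): indexing with a factor uses its integer codes, not its labels. The sketch below makes that assumption explicit with `stringsAsFactors = TRUE` and breaks the `match` lookup into steps. The names `codes` and `pos` are only illustrative.

```r
# Assumption: question.number is a factor, as it would be by default in R < 4.0.
survey.data <- data.frame(
  question.number = c("q2", "q3", "q4", "q5", "q2"),
  stringsAsFactors = TRUE
)
titles <- list(q1 = "question1", q2 = "question2", q3 = "question3",
               q4 = "question4", q5 = "question5")

# Indexing by a factor uses its integer codes (here 1, 2, 3, 4, 1),
# so titles[survey.data$question.number] subsets by position, not by name.
codes <- as.integer(survey.data$question.number)

# match() compares the factor's labels with the list names and returns
# the position of each question in `titles` (here 2, 3, 4, 5, 2).
pos <- match(survey.data$question.number, names(titles))

# Subsetting by position gives a list; unlist() flattens it to a character
# vector, and use.names = FALSE drops the list names from the result.
survey.data$question.title <- unlist(titles[pos], use.names = FALSE)
```

Indexing with `as.character(survey.data$question.number)` would also look up by name, but the result is still a list until it is passed through `unlist`.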

How does this differ from the two solutions already present at the time of writing this?

Two main ways:

  1. Neither of those solutions will accommodate the duplicated "q2" question.

    > survey.data$question.title <- titles[names(titles) %in% survey.data$question.number]
    Error in `$<-.data.frame`(`*tmp*`, "question.title", value = list(q2 = "question2",  : 
      replacement has 4 rows, data has 5
    > survey.data$question.title <- titles[levels(survey.data$question.number)]
    Error in `$<-.data.frame`(`*tmp*`, "question.title", value = list(q2 = "question2",  : 
      replacement has 4 rows, data has 5
    
  2. Both of the other solutions retain a list structure for the "question.title" column (as this solution would too, were it not for unlist), which is problematic if you later want to do things like export the data to a CSV file. It is particularly troublesome because nothing in the printed output indicates that the column is a list, but you can verify it by viewing the structure of the resulting data.frame with str().
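
To illustrate the second point, here is a small sketch of how one might check whether a column is a list. The data frame `df` and the column names `title.list` and `title.chr` are made up for this example, and `stringsAsFactors = FALSE` is set so that name-based indexing works regardless of R version.

```r
titles <- list(q1 = "question1", q2 = "question2", q3 = "question3")
df <- data.frame(question.number = c("q2", "q3"), stringsAsFactors = FALSE)

# Without unlist(), the new column is a list even though it prints like text.
df$title.list <- titles[df$question.number]

# With unlist(), the column is an ordinary character vector.
df$title.chr <- unlist(titles[df$question.number], use.names = FALSE)

# str() reveals the structure of every column.
str(df)

# sapply(df, class) gives a compact per-column summary of the classes.
col.classes <- sapply(df, class)
```

Both columns print the same way in the console, so checking the class is the reliable way to tell them apart.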

A5C1D2H2I1M1N2O1R2T1 answered Dec 08 '25 15:12


