looping over variable names in R

Tags: loops, r
I am a recent convert to R and am struggling to find the R equivalent of the following: looping over variables named with a common prefix plus a number (var1, var2, ..., varn).

Say I have a dataset where each row is a store and each column is the value of that store's revenue in month 1, month 2...month 6. Some made-up data for example:

store = c("a", "b", "c", "d", "c")
rev1 = c(500, 200, 600, 400, 1200) 
rev2 = c(260, 100, 450, 45, 1300)
rev3 = c(500, 150, 610, 350, 900)
rev4 = c(480, 200, 600, 750, 1000)
rev5 = c(500, 68, 750, 350, 1200)
rev6 = c(510, 80, 1000, 400, 1450)
df = data.frame(store, rev1, rev2, rev3, rev4, rev5, rev6) 

I am trying to do something like the following:

varlist <- paste("rev", 1:6)  #create list of variables rev1-rev6 #
for i in varlist {
      highrev[i] <- ifelse(rev[i] > 500, 1, 0) 
}

So for each existing variable rev1 through rev6, create a variable highrev1 through highrev6 that equals 1 if the corresponding revenue is greater than 500 and 0 otherwise.

Can you suggest an appropriate means of doing this?

Kat asked Dec 03 '25 23:12


2 Answers

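In R, we usually don't use loops for such operations, because comparisons are vectorized over all selected columns at once. The comparison below returns a logical matrix, and adding 0 converts TRUE/FALSE to 1/0. You could simply do: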

df[paste0("highrev", 1:6)] <- (df[paste0("rev", 1:6)] > 500) + 0
df
#   store rev1 rev2 rev3 rev4 rev5 rev6 highrev1 highrev2 highrev3 highrev4 highrev5 highrev6
# 1     a  500  260  500  480  500  510        0        0        0        0        0        1
# 2     b  200  100  150  200   68   80        0        0        0        0        0        0
# 3     c  600  450  610  600  750 1000        1        0        1        1        1        1
# 4     d  400   45  350  750  350  400        0        0        0        1        0        0
# 5     c 1200 1300  900 1000 1200 1450        1        1        1        1        1        1
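
To see why the one-liner works, it may help to split it into steps. This is only a sketch on a cut-down copy of the question's data; the names small, cols, and is_high are made up for illustration.

```r
# Cut-down copy of the question's data (two revenue columns only)
small <- data.frame(store = c("a", "b", "c"),
                    rev1  = c(500, 200, 600),
                    rev2  = c(260, 100, 450))

cols <- paste0("rev", 1:2)   # "rev1" "rev2"

# Comparing a data frame with a number is vectorized: the result is a
# logical matrix with one TRUE/FALSE per cell
is_high <- small[cols] > 500

# Adding 0 coerces TRUE/FALSE to 1/0; assigning the resulting matrix to
# new column names adds one column per matrix column
small[paste0("high", cols)] <- is_high + 0
small
```

Only store c has rev1 above 500 here, so highrev1 is 1 in that row and every other new value is 0.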
David Arenburg answered Dec 06 '25 13:12


Setup.

varlist  <- paste0("rev",1:6)      # note that this is paste0, not paste
hvarlist <- paste0("hi",varlist)
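
The comment about paste0 matters because paste() puts a space between its arguments by default, so the varlist in the question would not match any column name. A quick check:

```r
paste("rev", 1:3)             # "rev 1" "rev 2" "rev 3"  (default sep = " ")
paste0("rev", 1:3)            # "rev1"  "rev2"  "rev3"
paste("rev", 1:3, sep = "")   # same result as paste0
```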

data.table solution. There is a nice way to do this in data.table:

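library(data.table)
# setDT() converts df by reference; := adds the new columns in place.
# The trailing [] prints the result, since := returns invisibly.
setDT(df)[, (hvarlist) := lapply(.SD, function(x) 1L * (x > 500)), .SDcols = varlist][]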
#    store rev1 rev2 rev3 rev4 rev5 rev6 hirev1 hirev2 hirev3 hirev4 hirev5 hirev6
# 1:     a  500  260  500  480  500  510      0      0      0      0      0      1
# 2:     b  200  100  150  200   68   80      0      0      0      0      0      0
# 3:     c  600  450  610  600  750 1000      1      0      1      1      1      1
# 4:     d  400   45  350  750  350  400      0      0      0      1      0      0
# 5:     c 1200 1300  900 1000 1200 1450      1      1      1      1      1      1

The dplyr package is also designed with this sort of operation in mind, and versions 1.0.0 and later can do it by combining mutate() with across() and its .names argument.
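
A minimal sketch of a dplyr version, assuming dplyr 1.0.0 or later is installed (across() and its .names argument are not available in older releases). The names small and res are illustrative, and the data is a cut-down copy of the question's:

```r
library(dplyr)

small <- data.frame(store = c("a", "b", "c"),
                    rev1  = c(500, 200, 600),
                    rev2  = c(260, 100, 450))
varlist <- paste0("rev", 1:2)

# across() applies the function to each selected column;
# .names builds each new column name as "hi" + the original column name
res <- small %>%
  mutate(across(all_of(varlist), ~ 1L * (.x > 500), .names = "hi{.col}"))
res
```

The original rev columns are kept, and hirev1 and hirev2 are appended after them.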


A bad alternative. Here's another way, hewing closely to the OP's loop:

within(df, {for (i in 1:6) assign(hvarlist[i], 1L * (get(varlist[i]) > 500)); rm(i)})
#   store rev1 rev2 rev3 rev4 rev5 rev6 hirev6 hirev5 hirev4 hirev3 hirev2 hirev1
# 1     a  500  260  500  480  500  510      1      0      0      0      0      0
# 2     b  200  100  150  200   68   80      0      0      0      0      0      0
# 3     c  600  450  610  600  750 1000      1      1      1      1      0      1
# 4     d  400   45  350  750  350  400      0      0      1      0      0      0
# 5     c 1200 1300  900 1000 1200 1450      1      1      1      1      1      1

You can't assign to dynamic variable names with hvarlist[i] <- ...; that would overwrite the i-th element of hvarlist itself rather than create a variable with that name. This is done instead with assign(hvarlist[i],...), but using the latter is not a good habit. Similarly, get must be used to grab a variable on the basis of a string containing its name.
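
If a loop is still preferred, one common way to avoid assign and get is to index the data frame by column name with [[ ]]. The following is a sketch on a cut-down copy of the question's data; small is an illustrative name, not part of the original answer.

```r
small <- data.frame(store = c("a", "b", "c"),
                    rev1  = c(500, 200, 600),
                    rev2  = c(260, 100, 450))
varlist  <- paste0("rev", 1:2)
hvarlist <- paste0("hi", varlist)

for (i in seq_along(varlist)) {
  # small[["rev1"]] reads a column by its name given as a string;
  # assigning to small[["hirev1"]] creates that column
  small[[hvarlist[i]]] <- 1L * (small[[varlist[i]]] > 500)
}
small
```

Unlike the within() version, the new columns are added in the order hirev1, hirev2, because each one is appended as the loop reaches it.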

Frank answered Dec 06 '25 13:12