
Change numeric values in one column based on factor levels in another column

I need to set certain numeric values in one column of my data frame to zero, if in another column they have a certain factor level.

My dataframe df looks something like:

Items Store.Type
5      A
4      B
3      C
6      D
3      B
7      E

What I want to do is make Items = 0, for all rows where Store.Type = "A" or "C"

I'm very new to R, but figured this would be a conditional statement of the form "If Store.Type A then Items <- 0" (and then repeat for Store.Type C), but I didn't understand the ?"if" page at all. I tried:

df$ItemsFIXED <- with(df, if(Store.Type == "A")Items <-0)

and got the warning message:

Warning message:
In if (Store.Type2 == "Chain - Brand") Total.generic.items <- 0 :
 the condition has length > 1 and only the first element will be used

So I noticed here, the following:

  • if is a control flow statement, taking a single logical value as an argument
  • ifelse is a vectorised function, taking vectors as all its arguments.

So, figuring I needed ifelse to handle the whole column, and being able to understand the ?ifelse page, I tried to do "If Store.Type A then Items <- 0 else do nothing". In fact I wanted it nested, so I tried the following code (creating a new column for now so I don't mess up my data, although eventually it will overwrite the Items data):

df$ItemsFIXED <- with(df, ifelse(Store.Type == "A", Items <-0, 
                          ifelse(Store.Type == "C", Items <-0,)))

and got the following error:

Error in ifelse(Store.Type2 == "Franchise - Brand", Total.generic.items <- 0,  : 
  argument "no" is missing, with no default

But if I put anything in for no it simply writes over the values which are correct. I tried putting Items and Items <- Items in to say "else leave Items as Items" as in the following, but this just changed everything to zero.

df$ItemsFIXED <- with(df, ifelse(Store.Type == "A", Items <-0, 
                          ifelse(Store.Type == "C", Items <-0,Items)))

Is there a way to tell ifelse to do nothing, or is there an easier way to do this?

JenLouise asked Sep 18 '14 02:09


4 Answers

You could use %in% to match several levels at once and replace the matching values:

df$Items[df$Store.Type %in% c("A", "C")] <- 0
df
#  Items Store.Type
#1     0          A
#2     4          B
#3     0          C
#4     6          D
#5     3          B
#6     7          E
akrun answered Oct 09 '22 15:10


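Using within also seems to be an option. It returns a modified copy of the data frame, so assign the result back to df if you want to keep the change:

within(df, Items[Store.Type %in% c("A", "C")] <- 0)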

  Items Store.Type
1     0          A
2     4          B
3     0          C
4     6          D
5     3          B
6     7          E
ddiez answered Oct 09 '22 15:10


You can use vectorized replacement here. If df is your data set,

> df$Items[with(df, Store.Type == "A" | Store.Type == "C")] <- 0L
> df
#   Items Store.Type
# 1     0          A
# 2     4          B
# 3     0          C
# 4     6          D
# 5     3          B
# 6     7          E

with(df, Store.Type == "A" | Store.Type == "C") returns a logical vector. When a logical vector is placed inside [...], only the elements at the TRUE positions are selected. So if we subset Items with that vector, we can replace just those elements using [<-.
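
As a small illustration of the logical-index replacement described above, here is a sketch that rebuilds the example data from the question. The names mask, same_mask and selected are introduced here for illustration only. It also checks that the | comparison and %in% give the same logical vector:

```r
# Rebuild the example data from the question.
df <- data.frame(
  Items = c(5L, 4L, 3L, 6L, 3L, 7L),
  Store.Type = c("A", "B", "C", "D", "B", "E")
)

# `mask` (a hypothetical name) is TRUE exactly where Store.Type is "A" or "C".
mask <- with(df, Store.Type == "A" | Store.Type == "C")

# %in% builds the same logical vector with a single comparison.
same_mask <- identical(mask, df$Store.Type %in% c("A", "C"))

# Subsetting selects the elements at the TRUE positions...
selected <- df$Items[mask]

# ...and assigning to that subset replaces only those elements.
df$Items[mask] <- 0L
```

Keeping the mask in its own variable makes it easy to inspect (for example with table(mask)) before overwriting anything.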

Also, if you wanted to use ifelse, you could do things like

df$Items <- with(df, ifelse(Store.Type == "A" | Store.Type == "C", 0L, Items))

or

within(df, Items <- ifelse(Store.Type == "A" | Store.Type == "C", 0L, Items))

but take note that ifelse can be slow at times, even more so when coupled with within, and will likely be slower than the direct logical-index replacement at the top.
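
For what it is worth, the attempt in the question that turned everything to zero can probably be explained by the assignment inside ifelse. The following is a sketch, assuming the data match the example in the question; the names attempt and fixed are made up here. Evaluating Items <- 0 as the yes argument overwrites Items in the temporary environment that with creates, so by the time the no argument Items is evaluated it is already 0:

```r
# Rebuild the example data from the question.
df <- data.frame(
  Items = c(5L, 4L, 3L, 6L, 3L, 7L),
  Store.Type = c("A", "B", "C", "D", "B", "E")
)

# The attempt from the question: `Items <- 0` is an assignment, not just a value,
# so it replaces the local copy of Items before the `no` argument is evaluated.
attempt <- with(df, ifelse(Store.Type == "A", Items <- 0,
                    ifelse(Store.Type == "C", Items <- 0, Items)))

# Passing the plain value 0 as `yes` leaves Items intact for the `no` case.
fixed <- with(df, ifelse(Store.Type == "A", 0,
                  ifelse(Store.Type == "C", 0, Items)))
```

In both calls df itself is left unchanged, because with only evaluates the expression against a copy of the columns; the result still has to be assigned to a column.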

Rich Scriven answered Oct 09 '22 15:10


The following also works (ddf is the data frame here):

> ddf[ddf$Store.Type=='A'| ddf$Store.Type=='C',]$Items = 0
> ddf
  Items Store.Type
1     0          A
2     4          B
3     0          C
4     6          D
5     3          B
6     7          E
rnso answered Oct 09 '22 14:10