I need to set certain numeric values in one column of my data frame to zero, if in another column they have a certain factor level.
My dataframe df looks something like:
Items Store.Type
5 A
4 B
3 C
6 D
3 B
7 E
What I want to do is make Items = 0, for all rows where Store.Type = "A" or "C"
I'm very new to R, but figured this would be a conditional statement of the form "If Store.Type A then Items <- 0" (and then repeat for Store.Type C), but I didn't understand the ?"if" page at all. I tried:
df$ItemsFIXED <- with(df, if(Store.Type == "A")Items <-0)
and got the warning message:
Warning message:
In if (Store.Type2 == "Chain - Brand") Total.generic.items <- 0 :
the condition has length > 1 and only the first element will be used
So I noticed here, the following:
if is a control flow statement, taking a single logical value as an argument.
ifelse is a vectorised function, taking vectors as all its arguments.
So, figuring I need ifelse to do the whole column, and being able to understand the ?ifelse page, I tried to do "If Store.Type A then Items <- 0 else do nothing". In fact I wanted it nested, so I tried the following code (creating a new column for now so I don't mess up my data, but eventually it will overwrite the Items data):
df$ItemsFIXED <- with(df, ifelse(Store.Type == "A", Items <-0,
ifelse(Store.Type == "C", Items <-0,)))
and got the following error:
Error in ifelse(Store.Type2 == "Franchise - Brand", Total.generic.items <- 0, :
argument "no" is missing, with no default
But if I put anything in for no, it simply overwrites the values that are already correct. I tried putting Items and Items <- Items in, to say "else leave Items as Items", as in the following, but this just changed everything to zero:
df$ItemsFIXED <- with(df, ifelse(Store.Type == "A", Items <-0,
ifelse(Store.Type == "C", Items <-0,Items)))
Is there a way to tell ifelse to do nothing, or is there an easier way to do this?
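A likely explanation for the all-zero result, sketched on the example data from the question. The mechanism in the comments is my reading of how with and ifelse evaluate their arguments, not something stated in the question: Items <- 0 is an assignment rather than a value, so once ifelse evaluates it, Items inside with is the single number 0, and the final Items in the no branch picks up that 0 instead of the original column.

```r
# Example data from the question
df <- data.frame(
  Items = c(5, 4, 3, 6, 3, 7),
  Store.Type = c("A", "B", "C", "D", "B", "E")
)

# The attempt from the question: the assignments overwrite Items inside with()
# before the final `Items` (the "no" value) is looked up
broken <- with(df, ifelse(Store.Type == "A", Items <- 0,
                   ifelse(Store.Type == "C", Items <- 0, Items)))

# Passing plain values as the yes/no arguments leaves the other rows untouched
fixed <- with(df, ifelse(Store.Type == "A", 0,
                  ifelse(Store.Type == "C", 0, Items)))

print(broken)
print(fixed)
```

So there is no need to tell ifelse to "do nothing": giving it the original column as the no value already means "keep what was there".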
Or you could use %in% for multiple match/replacement:
df$Items[df$Store.Type %in% c("A", "C")] <- 0
df
#Items Store.Type
#1 0 A
#2 4 B
#3 0 C
#4 6 D
#5 3 B
#6 7 E
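As a rough sketch of why %in% is the right tool for matching several values at once, the snippet below rebuilds the question's data (the data.frame call is my reconstruction of the example table) and compares %in% with ==. The names hit and eq are just illustrative.

```r
# Reconstruction of the example data from the question
df <- data.frame(
  Items = c(5, 4, 3, 6, 3, 7),
  Store.Type = c("A", "B", "C", "D", "B", "E")
)

# %in% tests each row for membership in the set and returns one TRUE/FALSE per row
hit <- df$Store.Type %in% c("A", "C")

# == with a length-2 vector recycles the right-hand side element by element,
# so it is NOT a membership test and misses some matching rows
eq <- df$Store.Type == c("A", "C")

print(hit)
print(eq)

# Replace only the rows flagged by %in%
df$Items[hit] <- 0
print(df)
```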
Using within also seems to be an option (it returns a modified copy, so assign the result back to df to keep the change):
within(df, Items[Store.Type %in% c("A","C")] <- 0)
Items Store.Type
1 0 A
2 4 B
3 0 C
4 6 D
5 3 B
6 7 E
You can use vectorized replacement here. If df is your data set,
> df$Items[with(df, Store.Type == "A" | Store.Type == "C")] <- 0L
> df
# Items Store.Type
# 1 0 A
# 2 4 B
# 3 0 C
# 4 6 D
# 5 3 B
# 6 7 E
with(df, Store.Type == "A" | Store.Type == "C") returns a logical vector. When a logical vector is placed inside [...], only the elements at the TRUE positions are selected. So if we subset Items with that vector, we can replace the selected values with [<-.
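To make those steps concrete, here is a small sketch that splits the one-liner into its parts. The data.frame call rebuilds the question's example, and sel is just an illustrative name.

```r
# Example data from the question (Items stored as integers to match 0L)
df <- data.frame(
  Items = c(5L, 4L, 3L, 6L, 3L, 7L),
  Store.Type = c("A", "B", "C", "D", "B", "E")
)

# Step 1: build the logical vector, one element per row
sel <- with(df, Store.Type == "A" | Store.Type == "C")
print(sel)

# Step 2: subsetting with it picks out the values at the TRUE positions
print(df$Items[sel])   # 5 3

# Step 3: assigning into that subset replaces only those values
df$Items[sel] <- 0L
print(df$Items)        # 0 4 0 6 3 7
```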
Also, if you wanted to use ifelse, you could do things like
df$Items <- with(df, ifelse(Store.Type == "A" | Store.Type == "C", 0L, Items))
or
within(df, Items <- ifelse(Store.Type == "A" | Store.Type == "C", 0L, Items))
but take note that ifelse can be very slow at times, even more so when coupled with within, and will likely always be slower than the vectorized method up top.
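If you want to check the speed difference on your own machine, something along these lines should do. This is only a sketch: the simulated data, the size n, and the seed are arbitrary assumptions of mine, and timings will vary from run to run, so no numbers are shown.

```r
set.seed(1)   # arbitrary seed, only for reproducibility
n <- 1e5      # hypothetical number of rows

# Simulated stand-in for a larger version of the question's data
big <- data.frame(
  Items = sample(1:10, n, replace = TRUE),
  Store.Type = sample(c("A", "B", "C", "D", "E"), n, replace = TRUE)
)

# Approach 1: ifelse builds a whole new vector
t_ifelse <- system.time(
  a <- with(big, ifelse(Store.Type %in% c("A", "C"), 0L, Items))
)

# Approach 2: vectorized replacement on a copy of the column
t_replace <- system.time({
  b <- big$Items
  b[big$Store.Type %in% c("A", "C")] <- 0L
})

print(identical(a, b))   # both approaches should give the same result
print(t_ifelse)
print(t_replace)
```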
The following also works:
> ddf[ddf$Store.Type=='A'| ddf$Store.Type=='C',]$Items = 0
> ddf
Items Store.Type
1 0 A
2 4 B
3 0 C
4 6 D
5 3 B
6 7 E