I am trying to understand how to conditionally replace values in a data frame without using a loop. My data frame is structured as follows:
> df
          a b est
1  11.77000 2   0
2  10.90000 3   0
3  10.32000 2   0
4  10.96000 0   0
5   9.90600 0   0
6  10.70000 0   0
7  11.43000 1   0
8  11.41000 2   0
9  10.48512 4   0
10 11.19000 0   0

and the dput output is this:
structure(list(a = c(11.77, 10.9, 10.32, 10.96, 9.906, 10.7, 11.43, 11.41, 10.48512, 11.19), b = c(2, 3, 2, 0, 0, 0, 1, 2, 4, 0), est = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0)), .Names = c("a", "b", "est"), row.names = c(NA, -10L), class = "data.frame")
What I want to do is check the value of b. If b is 0, I want to set est to a value computed from a. I understand that df$est[df$b == 0] <- 23 will set all values of est to 23 when b == 0. What I don't understand is how to set est to a value of a when that condition is true. For example:
df$est[df$b == 0] <- (df$a - 5)/2.533
gives the following warning:
Warning message:
In df$est[df$b == 0] <- (df$a - 5)/2.533 :
  number of items to replace is not a multiple of replacement length
Is there a way that I can pass only the relevant cells, rather than the whole vector?
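The warning appears because df$est[df$b == 0] has only four slots to fill, while (df$a - 5)/2.533 supplies all ten values. Since you are conditionally indexing df$est, you also need to conditionally index the replacement vector df$a: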
index <- df$b == 0
df$est[index] <- (df$a[index] - 5)/2.533
Of course, the variable index is just temporary, and I use it to make the code a bit more readable. You can write it in one step:
df$est[df$b == 0] <- (df$a[df$b == 0] - 5)/2.533
For even better readability, you can use within:
df <- within(df, est[b==0] <- (a[b==0]-5)/2.533)
The results, regardless of which method you choose:
df
          a b      est
1  11.77000 2 0.000000
2  10.90000 3 0.000000
3  10.32000 2 0.000000
4  10.96000 0 2.352941
5   9.90600 0 1.936834
6  10.70000 0 2.250296
7  11.43000 1 0.000000
8  11.41000 2 0.000000
9  10.48512 4 0.000000
10 11.19000 0 2.443743
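To see why the original assignment warned and to check that the approaches agree, here is a minimal sketch that rebuilds the data from the dput output in the question. The names df1 and df2 are introduced here only for the comparison and are not part of the original code.

```r
# Rebuild the example data from the dput output in the question
df <- data.frame(
  a = c(11.77, 10.9, 10.32, 10.96, 9.906, 10.7, 11.43, 11.41, 10.48512, 11.19),
  b = c(2, 3, 2, 0, 0, 0, 1, 2, 4, 0),
  est = 0  # recycled to all 10 rows
)

index <- df$b == 0

# The original assignment had 4 slots to fill but 10 replacement values
sum(index)    # 4 rows where b == 0
length(df$a)  # 10 values in the unsubset vector

# Approach 1: subset both sides with the same logical index
df1 <- df
df1$est[index] <- (df1$a[index] - 5) / 2.533

# Approach 2: the same assignment written with within()
df2 <- within(df, est[b == 0] <- (a[b == 0] - 5) / 2.533)

# The two est columns should match
all.equal(df1$est, df2$est)
```

Because both sides of the assignment are subset with the same condition, the replacement has exactly as many values as there are positions to fill, so no recycling (and no warning) occurs.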
As others have pointed out, an alternative solution in your example is to use ifelse.
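A minimal sketch of the ifelse variant, assuming the same data as in the question: ifelse works element by element, so the full-length vectors can be passed without subsetting, and rows where b is not 0 keep their existing est value.

```r
# Rebuild the example data from the dput output in the question
df <- data.frame(
  a = c(11.77, 10.9, 10.32, 10.96, 9.906, 10.7, 11.43, 11.41, 10.48512, 11.19),
  b = c(2, 3, 2, 0, 0, 0, 1, 2, 4, 0),
  est = 0  # recycled to all 10 rows
)

# Where b == 0 take the transformed a, otherwise keep the current est
df$est <- ifelse(df$b == 0, (df$a - 5) / 2.533, df$est)

df
```

Note that ifelse computes (df$a - 5) / 2.533 for every row and then discards the values it does not need, which is harmless here but worth keeping in mind if the expression is expensive.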