Is it possible to restrict a data frame to a specific row and then change some values in one of the columns?
Let's say I calculate GROWTH as (SIZE_(t+1) - SIZE_t)/SIZE_t, and now I can see that there are some strange values for GROWTH (e.g. 1000). The reason is a corrupt value of the corresponding SIZE variable. Now I'd like to find and replace the corrupt value of SIZE.
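For reference, here is a minimal sketch of how such a GROWTH column could be computed with dplyr. It assumes the growth rate is stored on the later of the two rows (which is what the use of lead(GROWTH) further down implies); the data frame `df` and its SIZE values are made up for illustration.

```r
library(dplyr)

# Hypothetical SIZE values; row 4 plays the role of the corrupt entry
df <- data.frame(SIZE = c(100, 110, 121, 0.1, 133))

# GROWTH in row t is the relative change from row t-1 to row t,
# so the first row has no predecessor and gets NA
df <- df %>%
  mutate(GROWTH = (SIZE - lag(SIZE)) / lag(SIZE))
```

With this layout, a corrupt SIZE in one row shows up as an extreme GROWTH value in the following row.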
If I type:
data <- mutate(filter(data, lead(GROWTH)==1000), SIZE = 2600)
then only the corrupt row is stored in data and the rest of my data frame is lost.
What I'd like to do instead is filter "data" on the left hand side to the corresponding row of the corrupt value and then mutate the incorrect variable (on the right hand side):
filter(data, lead(GROWTH)==1000) <- mutate(filter(data, lead(GROWTH)==1000), SIZE = 2600)
but that doesn't seem to work. Is there a way to handle this using dplyr? Many thanks in advance
You can use ifelse inside mutate. Let's say you have a data frame with a corrupted value in SIZE at row 3, which leads to a large GROWTH value at row 4, and you want to replace the SIZE at row 3 with some value, here 0.3 (I chose a value different from yours so that it is consistent with my example data). The GROWTH > 1000 condition can be adjusted to fit your case.
data
          SIZE       GROWTH
1  -1.49578498           NA
2  -0.38731784   -0.7410605
3   0.00010000   -1.0002582
4   0.53842217 5383.2216758
5  -0.65813674   -2.2223433
6   0.29830698   -1.4532599
7   0.04712019   -0.8420413
8  -0.07312482   -2.5518788
9   1.64310713  -23.4698959
10  1.44927727   -0.1179654
library(dplyr)
# Replace SIZE only in the row just before the extreme GROWTH value; all other rows are kept
data %>% mutate(SIZE = ifelse(lead(GROWTH > 1000, default = FALSE), 0.3, SIZE))
          SIZE       GROWTH
1  -1.49578498           NA
2  -0.38731784   -0.7410605
3   0.30000000   -1.0002582
4   0.53842217 5383.2216758
5  -0.65813674   -2.2223433
6   0.29830698   -1.4532599
7   0.04712019   -0.8420413
8  -0.07312482   -2.5518788
9   1.64310713  -23.4698959
10  1.44927727   -0.1179654
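Note that GROWTH still holds the values computed from the old SIZE. If you also want those refreshed, one option is to recompute GROWTH in the same mutate call. This is only a sketch: it assumes GROWTH is defined as (SIZE - lag(SIZE)) / lag(SIZE), which matches the example values, and it uses just the first five rows of the data. The name `fixed` is introduced here for illustration.

```r
library(dplyr)

# First five rows of the example data
data <- data.frame(
  SIZE   = c(-1.49578498, -0.38731784, 0.0001, 0.53842217, -0.65813674),
  GROWTH = c(NA, -0.7410605, -1.0002582, 5383.2216758, -2.2223433)
)

fixed <- data %>%
  mutate(
    # Uses the old GROWTH to flag the row just before the extreme value
    SIZE = ifelse(lead(GROWTH > 1000, default = FALSE), 0.3, SIZE),
    # Evaluated afterwards, so it already sees the corrected SIZE
    GROWTH = (SIZE - lag(SIZE)) / lag(SIZE)
  )
```

Expressions inside mutate are evaluated in order, so the second one picks up the corrected SIZE. Assign the result back to data if you want to overwrite the original.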
Data:
structure(list(SIZE = c(-1.49578498093657, -0.387317841955887,
1e-04, 0.538422167582116, -0.658136741561064, 0.298306980856383,
0.0471201873908915, -0.0731248216938637, 1.64310713116132, 1.44927727104653
), GROWTH = c(NA, -0.741060482026387, -1.00025818588551, 5383.22167582116,
-2.22234332311492, -1.45325988053609, -0.842041284935343, -2.55187883883499,
-23.4698958999199, -0.117965442690154)), class = "data.frame", .Names = c("SIZE",
"GROWTH"), row.names = c(NA, -10L))