Related to the question on this page: Randomly associate elements of two vectors given conditions. If I have the following data:
loss=c(45,10,5,1)
capitals = structure(list(capital = c(100L, 50L, 4L, 25L, 5L), loss = c(5L,
10L, 10L, 1L, 45L)), .Names = c("capital", "loss"), class = "data.frame", row.names = c(NA,
-5L))
capitals
capital loss
1 100 5
2 50 10
3 4 10
4 25 1
5 5 45
>
I am trying to correct any row with loss > capital (by assigning another random value from the vector loss so that loss <= capital) with the following command:
apply(capitals, 1, function(x){while(x[2]>x[1]) {x[2] = sample(loss,1); print(x[2])} })
The print call shows that the value changes inside the function, but the value does not change in the data frame capitals:
apply(capitals, 1, function(x){while(x[2]>x[1]) {x[2] = sample(loss,1); print(x[2])} })
loss
5
loss
10
loss
10
loss
1
loss
5
NULL
> capitals
capital loss
1 100 5
2 50 10
3 4 10
4 25 1
5 5 45
>
Why is the value in the capitals data frame not changing, and how can this be corrected? Thanks for your help.
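apply evaluates a function on each row, and assignments inside a function do not affect objects outside it. Only a copy of the row is modified, and that copy is discarded when the function exits. The function also returns nothing useful: its last expression is the while loop, whose value is NULL, which is why apply prints NULL.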
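A minimal sketch of this copy behaviour, using a small made-up data frame (the name df is hypothetical and not from the question):

```r
# Illustrative data, not the question's `capitals`.
df <- data.frame(capital = c(100L, 4L), loss = c(5L, 10L))

# Each row is passed to the function as a copy; the assignment
# below changes only that copy, and nothing is returned.
res <- apply(df, 1, function(x) { x[2] <- 0L; NULL })

df$loss  # still 5 10: the data frame itself was never touched
```

To change df, the result of apply has to be assigned back, for example to df$loss.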
Instead, to make use of apply, build a new object by having the function return the value for each row. Something like this, perhaps:
capitals$loss <-
  apply(capitals, 1,
    function(x){
      while(x[2]>x[1])
        x[2] <- sample(loss,1)
      x[2]
    }
  )
capitals
## capital loss
## 1 100 5
## 2 50 10
## 3 4 1
## 4 25 1
## 5 5 5
Here, the new value for loss (x[2]) is returned from the function, and collected into a vector by apply. This is then used to replace the column in the data frame.
This can be done without the while loop by sampling directly from the subset of loss that does not exceed the capital. An if is needed to decide whether sampling is required at all:
apply(capitals, 1,
  function(x)
    if (x[2] > x[1])
      sample(loss[loss<=x[1]], 1)
    else
      x[2]
)
Better yet, instead of using if, you can replace only those rows where the condition holds:
r <- capitals$capital < capitals$loss
capitals[r, 'loss'] <-
  sapply(capitals[r,'capital'],
    function(x) sample(loss[loss<=x], 1)
  )
Here, the rows where replacement is needed are flagged by r, and only those rows are modified (this is the same condition as in the while loop of the original, with the operands swapped, hence the change from greater-than to less-than).
The sapply expression loops through the values of capital for those rows, and returns a single sample from those entries of loss that do not exceed the capital value.
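One caveat with sample(loss[loss<=x], 1): when the subset contains a single number n >= 1, sample() draws from 1:n instead of returning n, and when the subset is empty it raises an error (the while version would loop forever in that case). With the data above neither problem shows up, but a more defensive variant could look like the sketch below. The helper name pick_loss and the choice to return NA when nothing qualifies are assumptions, not part of the original answer.

```r
# Hypothetical helper: draw one value from `pool` that does not exceed `cap`.
pick_loss <- function(cap, pool) {
  ok <- pool[pool <= cap]
  # Assumption: signal "no admissible value" with NA rather than an error.
  if (length(ok) == 0L) return(NA_real_)
  # Sample an index, not the values, so a single candidate is returned as is.
  ok[sample.int(length(ok), 1L)]
}

loss <- c(45, 10, 5, 1)
capitals <- data.frame(capital = c(100L, 50L, 4L, 25L, 5L),
                       loss    = c(5L, 10L, 10L, 1L, 45L))

# Replace only the rows where loss exceeds capital.
r <- capitals$capital < capitals$loss
capitals$loss[r] <- vapply(capitals$capital[r], pick_loss, numeric(1), pool = loss)
capitals
```

vapply is used here in place of sapply only so that the result is guaranteed to be a numeric vector with one value per row.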