I am trying to do a function of decumulation with a for loop in R because the financial information provided by the company is accumulated for different concepts (this means that the info of January is only of January, the info of February is the sum of January and February, the one of March is the sum of January, February and March, etc.).
For example, let's say that I have the next dataframe:
Concepts <- c("Concept1", "Concept2", "Concept3")
January <- c(5,10,16)
February <- c(9,14,20)
March <- c(16,20,23)
df <- data.frame(Concepts, January, February, March)
This will give me the next dataframe:
Concepts January February March
Concept1 5 9 16
Concept2 10 14 20
Concept3 16 20 23
What I need to achieve is the next dataframe (Notice that February is the difference between February and January, and March is the difference between February and March):
Concepts January February March
Concept1 5 4 7
Concept2 10 4 6
Concept3 16 4 3
To achieve the second dataframe, I first created an empty dataframe with the same amount of rows of df, then with a for loop cbind the first two rows of the dataframe (because they do not need any manipulation) and with the index add the next ones after calculated the difference. The above in code is as follows:
df <- data.frame(Concepts, January, February, March)
df2 <- data.frame(matrix(nrow=nrow(df),ncol=ncol(df))) #Empty Dataframe with the same number of rows
for(i in 1:ncol(df)) {
if(i == 1){
df2 <- cbind(df2, df[ , i])
} else if (i == 2){
df2 <- cbind(df2, df[, i])
} else {
diference <- df[,i] - df[,i-1]
df2 <- cbind(df2,diference)
}
I get the following error:
error in
[.data.table
(df, , i) : j (the 2nd argument inside [...]) is a single symbol but column name 'i' is not found. Perhaps you intended DT[, ..i]. This difference to data.frame is deliberate and explained in FAQ 1.1.
I would love to receive a correction to my code or some alternative that allows me to calculate the above for a dataframe of many years.
For loops are not as important in R as they are in other languages because R is a functional programming language. This means that it's possible to wrap up for loops in a function, and call that function instead of using the for loop directly.
For loop in R Programming Language is useful to iterate over the elements of a list, dataframe, vector, matrix, or any other object. It means, the for loop can be used to execute a group of statements repeatedly depending upon the number of elements in the object.
First note that if you apply
base function diff
to the months columns, you will get one column less but transposed.
apply(df[-1], 1, diff)
# [,1] [,2] [,3]
#February 4 4 4
#March 7 6 4
So transpose it to get the right orientation.
t(apply(df[-1], 1, diff))
# February March
#[1,] 4 7
#[2,] 4 6
#[3,] 4 4
And cbind
it with the first 2 columns. Since the first argument is a subset of a data.frame, the method called is cbind.data.frame
and the result is also a df.
cbind(df[1:2], t(apply(df[-1], 1, diff)))
# Concepts January February March
#1 Concept1 5 4 7
#2 Concept2 10 4 6
#3 Concept3 15 4 4
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With