Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Insert rows in a dataframe based on a calculation

Tags:

dataframe

r

rep

I am pretty new to R and I am trying to copy a calculation done in Excel to R.
I have a data frame like this:

Component <- c("A", "B", "C")
Report_Time <- c(5781, 5781, 5781)
Interval <- c(700, 600, 800)
End_Time <- c(8281, 8281, 8281)
Start_Time <- c(800, 298, 780)
df <- data.frame(Component, Report_Time, Interval, End_Time, Start_Time)

When Printed it looks like this:

# Component Report_Time Interval    End_Time    Start_Time
#1    A         5781        700         8281        800
#2    B         5781        600         8281        298
#3    C         5781        800         8281        780

For each component, I want to populate a calculated column "Interval_Time", which is the sum of Start Time + Report_Time for First, then if it is less than End_Time the insert a row with the sum of Interval_Time (Last sum) + Interval. Repeat inserting till the sum in Interval time is less than End_Time.

# Component Report_Time Interval    End_Time    Start_Time  Interval_Time
#1   A       5781       700             8281        800         6581
#2   A       5781       700             8281        800         7281
#3   A       5781       700             8281        800         7981
#4   B       5781       1000            8281        298         6079        
#5   B       5781       1000            8281        298         7079
#6   B       5781       1000            8281        298         8079
#7   C       5781       1200            8281        780         6561
#8   C       5781       1200            8281        780         7761

I have been trying to achive this with if inside a for loop.. but haven't been succesfull.

like image 210
Madhu Avatar asked Oct 17 '22 13:10

Madhu


1 Answers

With data.table:

Component <- c("A", "B", "C")
Report_Time <- c(5781, 5781, 5781)
Interval <- c(700, 1000, 1200)
End_Time <- c(8281, 8281, 8281)
Start_Time <- c(800, 298, 780)
df <- data.frame(Component, Report_Time, Interval, End_Time, Start_Time)

library(data.table)
setDT(df)
df<-df[rep(1:.N,ceiling((End_Time-Start_Time-Report_Time)/Interval))]
df[,Interval_Time:=ifelse(.I==1,Start_Time+Report_Time,Start_Time+cumsum(Interval)+Report_Time-Interval),by=.(Component)]

df
Component Report_Time Interval End_Time Start_Time Interval_Time
1:         A        5781      700     8281        800          6581
2:         A        5781      700     8281        800          7281
3:         A        5781      700     8281        800          7981
4:         B        5781     1000     8281        298          6079
5:         B        5781     1000     8281        298          7079
6:         B        5781     1000     8281        298          8079
7:         C        5781     1200     8281        780          6561
8:         C        5781     1200     8281        780          7761
like image 90
Erdem Akkas Avatar answered Oct 20 '22 09:10

Erdem Akkas