I have created the data frame below in R.
My_DF
ID Date Type Remark Price
PRT-11 2020-12-01 10:12:14 SS_RT AT_1_O 1000
PRT-11 2020-12-01 10:12:14 SS_RT AT_1_O 1200
PRT-11 2020-12-01 10:12:14 SS_RT AT_1_O 1600
PRT-11 2020-12-01 10:12:14 SS_RG AT_1_A 1600
PRT-11 2020-12-01 10:12:14 SS_RG AT_1_B 1600
PRT-11 2020-12-01 10:12:14 SS_RG AT_1_C 1000
PRT-11 2020-12-01 10:12:14 SS_RT AT_1_Y 1200
PRT-11 2020-12-07 10:12:14 SS_RT AT_1_U 1600
PRT-11 2020-12-07 10:12:14 SS_RI AT_1_M 1600
PRT-11 2020-12-07 10:12:14 SS_RO AT_1_P 1600
I want to convert the above data frame into the following structure and then convert it to HTML format, which can be used to send an email with the mailR library.

I have applied the following conditions:
If Type is equal to SS_RT, then it is Type - A.
If Type is anything other than SS_RT, then it is Type - B.
If Type is equal to SS_RT and Remark is equal to AT_1_O, then it is Type - A1.
If Type is equal to SS_RT and Remark is other than AT_1_O, then it is Type - A2.
The formula for Type - A1 (Excl) is Type - A1 divided by the sum of Type - A1 and Type - A2.
The formula for Type - A2 (Excl) is Type - A2 divided by the sum of Type - A1 and Type - A2.
All the remaining percentage formulas are straightforward, with the Total in the denominator.
It is possible that there is no entry in the data frame for a particular date. To handle that, we take the min and max of all available dates, and for any date in between that has no entry, we show 0 in both the count and the sum columns.
I have arranged each date over two rows: the first row holds the counts and the second one the sums, grouped by their logical definition.
Here is a data.table solution. I tried to avoid manual calculations and instead thought of a solution based on a long-to-wide transformation.
Here is the full code, with step-by-step detail after:
library(lubridate)
library(data.table)
dt <- setDT(My_DF) ## My_DF is the data frame from the question
dt[,Date := date(Date)]
dt[,type := fifelse(Type == "SS_RT",fifelse(Remark == "AT_1_O","A1","A2"),"B")]
## transform to wide
df2 <- rbind(dcast(data = dt,Date~type ,value.var = "Price",fill = 0)[,linetype := "count"],
dcast(data = dt,Date~type ,value.var = "Price",fill = 0,fun.aggregate = sum)[,linetype := "value"])
## A and tot
df2[,tot := rowSums(.SD),.SDcols = c("A1","A2","B")]
df2[,A := A1+A2]
## create pc
cols <- c("A","A1","A2","B")
df2[,paste0(cols,"_pc") := lapply(.SD,function(x) fifelse(tot == 0, 0, round(x/tot*100))),.SDcols = cols]
cols <- c("A1","A2")
df2[,paste0(cols,"_exc") := lapply(.SD,function(x) fifelse(A1 + A2 == 0, 0, round(x/(A1+A2)*100))),.SDcols = cols]
## add missing dates
df2 <- merge(CJ(Date = seq(min(dt$Date),max(dt$Date),1),linetype = c("count","value")),
df2,all = T,by = c("Date","linetype"))
df2[is.na(df2)] <- 0
df2[,linetype := NULL]
df2
Date A1 A2 B tot A A_pc A1_pc A2_pc B_pc A1_exc A2_exc
1: 2020-12-01 3 1 3 7 4 57 43 14 43 75 25
2: 2020-12-01 3800 1200 4200 9200 5000 54 41 13 46 76 24
3: 2020-12-02 0 0 0 0 0 0 0 0 0 0 0
4: 2020-12-02 0 0 0 0 0 0 0 0 0 0 0
5: 2020-12-03 0 0 0 0 0 0 0 0 0 0 0
6: 2020-12-03 0 0 0 0 0 0 0 0 0 0 0
7: 2020-12-04 0 0 0 0 0 0 0 0 0 0 0
8: 2020-12-04 0 0 0 0 0 0 0 0 0 0 0
9: 2020-12-05 0 0 0 0 0 0 0 0 0 0 0
10: 2020-12-05 0 0 0 0 0 0 0 0 0 0 0
11: 2020-12-06 0 0 0 0 0 0 0 0 0 0 0
12: 2020-12-06 0 0 0 0 0 0 0 0 0 0 0
13: 2020-12-07 0 1 2 3 1 33 0 33 67 0 100
14: 2020-12-07 0 1600 3200 4800 1600 33 0 33 67 0 100
The first step is to create the type variable following your rules:
dt[,Date := date(Date)]
dt[,type := fifelse(Type == "SS_RT",fifelse(Remark == "AT_1_O","A1","A2"),"B")]
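To see the nested fifelse in isolation, here it is applied to a few hypothetical values (the vectors below are made up for illustration):

```r
library(data.table)

## Nested fifelse classification on sample values (same rule as above)
Type   <- c("SS_RT", "SS_RT", "SS_RG")
Remark <- c("AT_1_O", "AT_1_Y", "AT_1_A")
fifelse(Type == "SS_RT",
        fifelse(Remark == "AT_1_O", "A1", "A2"),
        "B")
## "A1" "A2" "B"
```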
We know A is just A1 + A2, so I only need A1, A2 and B as columns. This lets me transform the table to wide format. I do it twice: once to count, once to sum per type:
dcast(data = dt,Date ~ type ,value.var = "Price",fill = 0)
Date A1 A2 B
1: 2020-12-01 3 1 3
2: 2020-12-07 0 1 2
Here I count the number of occurrences of each type, because dcast falls back to the default aggregate function: length.
If I use sum as the aggregate function instead:
dcast(data = dt,Date~type ,value.var = "Price",fill = 0,fun.aggregate = sum)
Date A1 A2 B
1: 2020-12-01 3800 1200 4200
2: 2020-12-07 0 1600 3200
I add the linetype variable, which will help me later to add the missing dates (I use it to keep two lines per date).
Binding the two together, I obtain:
Date A1 A2 B linetype
1: 2020-12-01 3 1 3 count
2: 2020-12-07 0 1 2 count
3: 2020-12-01 3800 1200 4200 value
4: 2020-12-07 0 1600 3200 value
I then calculate A and the total:
df2[,tot := rowSums(.SD),.SDcols = c("A1","A2","B")]
df2[,A := A1+A2]
I then calculate the percentage (_pc) and Excl variables (which I name _exc for brevity), using lapply and a vector of the columns I want to transform. I use fifelse to avoid dividing by 0:
cols <- c("A","A1","A2","B")
df2[,paste0(cols,"_pc") := lapply(.SD,function(x) fifelse(tot == 0, 0, round(x/tot*100))),.SDcols = cols]
cols <- c("A1","A2")
df2[,paste0(cols,"_exc") := lapply(.SD,function(x) fifelse(A1 + A2 == 0, 0, round(x/(A1+A2)*100))),.SDcols = cols]
Date A1 A2 B linetype tot A A_pc A1_pc A2_pc B_pc A1_exc A2_exc
1: 2020-12-01 3 1 3 count 7 4 57 43 14 43 75 25
2: 2020-12-01 3800 1200 4200 value 9200 5000 54 41 13 46 76 24
3: 2020-12-07 0 1 2 count 3 1 33 0 33 67 0 100
4: 2020-12-07 0 1600 3200 value 4800 1600 33 0 33 67 0 100
I then add the missing dates by merging with all combinations of Date and linetype, keeping all rows. The CJ function creates a data.table with all combinations of the two variables:
CJ(Date = seq(min(dt$Date),max(dt$Date),1),linetype = c("count","value"))
Date linetype
1: 2020-12-01 count
2: 2020-12-01 value
3: 2020-12-02 count
4: 2020-12-02 value
5: 2020-12-03 count
6: 2020-12-03 value
7: 2020-12-04 count
8: 2020-12-04 value
9: 2020-12-05 count
10: 2020-12-05 value
11: 2020-12-06 count
12: 2020-12-06 value
13: 2020-12-07 count
14: 2020-12-07 value
Then I replace the missing values with 0 and drop the linetype variable.
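On a toy table (column values invented for illustration), that fill-in step looks like this:

```r
library(data.table)

## Toy illustration of the fill-in step (mirrors df2[is.na(df2)] <- 0 above)
x <- data.table(Date = as.Date(c("2020-12-01", "2020-12-02")),
                A1 = c(3, NA),
                linetype = c("count", "count"))
x[is.na(x)] <- 0        ## missing counts/sums become 0
x[, linetype := NULL]   ## drop the helper column
## A1 is now c(3, 0)
```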
You can then reorder the columns with setcolorder, and use kableExtra to produce your HTML output.
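If you prefer to avoid an extra package, here is a minimal base-R sketch of the HTML step (df_to_html is a made-up helper name; kableExtra gives nicer styling):

```r
## Minimal base-R HTML table builder (sketch, not production-grade HTML)
df_to_html <- function(df) {
  header <- paste0("<tr>", paste0("<th>", names(df), "</th>", collapse = ""), "</tr>")
  rows   <- apply(df, 1, function(r)
    paste0("<tr>", paste0("<td>", r, "</td>", collapse = ""), "</tr>"))
  paste0("<table border='1'>", header, paste(rows, collapse = ""), "</table>")
}

html_body <- df_to_html(data.frame(Date = "2020-12-01", A1 = 3, A2 = 1))
```

The resulting string can be passed as the body argument of mailR::send.mail() together with html = TRUE.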
You can do the same with dplyr: use pivot_wider to transform to wide, mutate_all instead of lapply(.SD,...) for the calculations, and expand.grid instead of CJ to generate the table of missing dates.
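For reference, a rough sketch of the dplyr route for the count part, using a hypothetical two-row stand-in for My_DF (the sum side works the same way with summarise):

```r
library(dplyr)
library(tidyr)

## Two-row stand-in for the question's My_DF, for illustration only
My_DF <- data.frame(Date   = c("2020-12-01 10:12:14", "2020-12-07 10:12:14"),
                    Type   = c("SS_RT", "SS_RG"),
                    Remark = c("AT_1_O", "AT_1_A"),
                    Price  = c(1000, 1600))

## Classify, count per Date/type, then widen
counts <- My_DF %>%
  mutate(Date = as.Date(Date),
         type = if_else(Type == "SS_RT",
                        if_else(Remark == "AT_1_O", "A1", "A2"),
                        "B")) %>%
  count(Date, type) %>%
  pivot_wider(names_from = type, values_from = n, values_fill = 0)
```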