Forgive me I'm very new with R and I'm just looking at options currently in our SQL 2016 environment.
We currently have a requirement to provide cumulative performance returns. An example dataset is below:
FundID Date FundReturn
ABC 1987-10-31 0
ABC 1987-11-30 -9.28669
ABC 1987-12-31 3.08304
ABC 1988-01-31 -3.00125
ABC 1988-02-29 0.61238
ABC 1988-03-31 4.29258
ABC 1988-04-30 0.13697
ABC 1988-05-31 2.57786
ABC 1988-06-30 2.36947
ABC 1988-07-31 0.57114
ABC 1988-08-31 -1.21550
ABC 1988-09-30 7.09027
ABC 1988-10-31 3.45807
ABC 1988-11-30 1.12679
We would need to take this data set and apply a cumulative performance return measure to it so that the datset would look like the following:
FundID Date FundReturn FundReturnCumu100 FundReturnCumu0
ABC 1987-10-31 0 1 0
ABC 1987-11-30 -9.28669 0.9071331 -0.0928669
ABC 1987-12-31 3.08304 0.935100376 -0.064899624
ABC 1988-01-31 -3.00125 0.907035676 -0.092964324
ABC 1988-02-29 0.61238 0.912590181 -0.087409819
ABC 1988-03-31 4.29258 0.951763845 -0.048236155
ABC 1988-04-30 0.13697 0.953067476 -0.046932524
ABC 1988-05-31 2.57786 0.977636221 -0.022363779
ABC 1988-06-30 2.36947 1.000801018 0.000801018
ABC 1988-07-31 0.57114 1.006516993 0.006516993
ABC 1988-08-31 -1.2155 0.994282779 -0.005717221
ABC 1988-09-30 7.09027 1.064780113 0.064780113
ABC 1988-10-31 3.45807 1.101600954 0.101600954
ABC 1988-11-30 1.12679 1.114013684 0.114013684
I can create this in SQL with the following code:
SELECT
FundID
, [Date]
, FundReturn
, ISNULL (
EXP(SUM(LOG(ABS(NULLIF((FundReturn+100)/100, 1))))
OVER(ORDER BY FundID, [Date] ROWS UNBOUNDED PRECEDING))
,1) AS FundReturnCumu100
, ISNULL (
EXP(SUM(LOG(ABS(NULLIF((FundReturn+100)/100, 1))))
OVER(ORDER BY FundID, [Date] ROWS UNBOUNDED PRECEDING))
,1)-1 AS FundReturnCumu0
FROM #Worktable
ORDER BY [Date]
I also want to test if this is possible using an R function embedded into a SQL Stored Procedure to end up with the same result? The maths above is basically the product of the performance return over a timeseries, so is there a product function I could use that could create the same result datset?
Edit: I have knocked up the following so far which returns the basis of the dataset using sp_execute_external_script
:
EXEC sp_execute_external_script
@language = N'R'
, @script = N'OutputDataSet<-InputDataSet'
, @input_data_1 = N' SELECT *
FROM [InMemory].[dbo].[CumulativePerformanceTest]
ORDER BY [FundID],[Date]'
WITH RESULT SETS (
(
[FundID] NVARCHAR(50)
, [Date] DATE
, [FundReturn] NVARCHAR(255)
)
);
GO
What would I have to do to change the above to apply the FundReturnCumu100 and FundReturnCumu0 calculation in R?
Thanks
In SQL server also you can calculate cumulative sum by using sum function. We can use same table as sample table. select dept_no Department_no, count(empno) Employee_Per_Dept, sum(count(*)) over (order by deptno) Cumulative_Total from [DBO].
We can convert it to dplyr
code with
library(dplyr)
df1 %>%
arrange(FundID, Date) %>%
mutate(FundReturnCumu100 = exp(cumsum(log(abs((FundReturn + 100)/100)))),
FundReturnCumu0 = FundReturnCumu100 - 1)
# FundID Date FundReturn FundReturnCumu100 FundReturnCumu0
#1 ABC 1987-10-31 0.00000 1.0000000 0.0000000000
#2 ABC 1987-11-30 -9.28669 0.9071331 -0.0928669000
#3 ABC 1987-12-31 3.08304 0.9351004 -0.0648996237
#4 ABC 1988-01-31 -3.00125 0.9070357 -0.0929643237
#5 ABC 1988-02-29 0.61238 0.9125902 -0.0874098186
#6 ABC 1988-03-31 4.29258 0.9517638 -0.0482361550
#7 ABC 1988-04-30 0.13697 0.9530675 -0.0469325241
#8 ABC 1988-05-31 2.57786 0.9776362 -0.0223637789
#9 ABC 1988-06-30 2.36947 1.0008010 0.0008010181
#10 ABC 1988-07-31 0.57114 1.0065170 0.0065169930
#11 ABC 1988-08-31 -1.21550 0.9942828 -0.0057172210
#12 ABC 1988-09-30 7.09027 1.0647801 0.0647801126
#13 ABC 1988-10-31 3.45807 1.1016010 0.1016009542
#14 ABC 1988-11-30 1.12679 1.1140137 0.1140136836
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With