Reality description: We do have a list of projects. In every project there is a lot of accounts. You can do a lot of actions on every account. I do have the following dimensions and fact table defined (simplified):
Dimensions and attributes:
Project
project_key
project_name
industry
number_of_accounts
Distance
distance_key
distance_in_months
distance_in_quarters
Account
account_key
project_key
account_id
Fact Table and attributes:
Action_Fact_Table
project_key
distance_key
account_key
action_id
Now, I would like to use run-off triangle approach to analyze data (it's maybe not the real run-off triangle, but the approach is the same). The most simple triangle would look like:
Distance in Months
Project name| 1 2 3 4 5 6 7 8 9 10
-------------------------------------------------------------------------
Project1 | 5 10 15 20 25 30 35 40 45 50
Project2 | 7 14 21 28 35 42 49 56 63
Project3 | 2 5 8 11 14 20 25 30
Project4 | 0 2 5 10 18 23 40
Project5 | 5 12 18 20 21 30
There is running sum of number of actions by rows. Distance in months shows the distance between date of action and project start date. You can obviously create similar triangle using distance in quarters (or any other period defined in distance dimension).
You can also create triangel for different level of hierarchy in project dimension, e.g. industry (Project1-Project3 = Industry1, Project4-Project5 = Industry2):
Distance in Months
Project name| 1 2 3 4 5 6 7 8 9 10
-------------------------------------------------------------------------
Industry1 | 14 29 44 59 74 92 109 126 108 50
Industry2 | 5 14 23 30 39 53 40
There is also more advanced run-off triangle where you divide running sum of actions by number of accounts. Assume there is the following number of accounts for our projects:
Project_name number_of_accounts
-----------------------------
Project1 100
Project2 100
Project3 100
Project4 100
Project5 200
Then I would like to get the following triangle:
Distance in Months
Project | 1 2 3 4 5 6 7 8 9 10
------------------------------------------------------------------------
Project1 | .05 .01 .15 .20 .25 .30 .35 .40 .45 .50
Project2 | .7 .14 .21 .28 .35 .42 .49 .56 .63
Project3 | .2 .5 .8 .11 .14 .20 .25 .30
Project4 | .0 .2 .5 .10 .18 .23 .40
Project5 | .05 .06 .09 .10 .105 .15
This is especially usefull when you would like to compare projects and their actions in case where number of accounts in project is not the same for all projects.
The question is whether it is possible to create such calculation in OLAP. I was thinking I can use number_of_accounts in project table, but I can't figure it out. The other option is to aggregate data in account dimension. I was also not able to find anything using google, maybe because I'm asking wrong question.
The solution to this question is widely applicable in many industries, it is crucial especially in insurance and banking. It can be used everywhere where processes have long performance window and can be tracked by well defined, comparable batches of units.
(We are using PostgreSQL, Saiku, cubes are defined in Schema Workbench)
Test data (PostgreSQL syntax, let me know if you need something else)
--drop table if exists project cascade;
create table project (
project_key int primary key,
project_name character varying,
industry character varying,
number_of_accounts int
);
--drop table if exists distance cascade;
create table distance (
distance_key int primary key,
distance_in_months int,
distance_in_quarters int);
--drop table if exists account cascade;
create table account (
account_key int primary key,
project_key int references project (project_key)
);
--drop table if exists action_fact_table cascade;
create table action_fact_table (
project_key int references project (project_key),
distance_key int references distance (distance_key),
account_key int references account (account_key),
action_id int
);
-- project data
insert into project values (1,'Project1','Industry1',100);
insert into project values (2,'Project2','Industry1',100);
insert into project values (3,'Project3','Industry1',100);
insert into project values (4,'Project4','Industry2',100);
insert into project values (5,'Project5','Industry2',200);
-- distance data
insert into distance values(1,1,1);
insert into distance values(2,2,1);
insert into distance values(3,3,1);
insert into distance values(4,4,2);
insert into distance values(5,5,2);
insert into distance values(6,6,2);
insert into distance values(7,7,3);
insert into distance values(8,8,3);
insert into distance values(9,9,3);
insert into distance values(10,10,4);
insert into distance values(11,11,4);
insert into distance values(12,12,4);
-- account data
/* let me know if you need insert statement for every row */
insert into account (
select generate_series (1,100), 1 union all
select generate_series (101,200), 2 union all
select generate_series (201,300), 3 union all
select generate_series (301,400), 4 union all
select generate_series (401,600), 5
);
insert into action_fact_table values(1,1,90,10001);
insert into action_fact_table values(1,1,32,10002);
insert into action_fact_table values(1,1,41,10003);
insert into action_fact_table values(1,1,54,10004);
insert into action_fact_table values(1,1,45,10005);
insert into action_fact_table values(1,2,22,10006);
insert into action_fact_table values(1,2,29,10007);
insert into action_fact_table values(1,2,41,10008);
insert into action_fact_table values(1,2,89,10009);
insert into action_fact_table values(1,2,15,10010);
insert into action_fact_table values(1,3,32,10011);
insert into action_fact_table values(1,3,100,10012);
insert into action_fact_table values(1,3,72,10013);
insert into action_fact_table values(1,3,80,10014);
insert into action_fact_table values(1,3,10,10015);
insert into action_fact_table values(1,4,12,10016);
insert into action_fact_table values(1,4,45,10017);
insert into action_fact_table values(1,4,83,10018);
insert into action_fact_table values(1,4,42,10019);
insert into action_fact_table values(1,4,33,10020);
insert into action_fact_table values(1,5,22,10021);
insert into action_fact_table values(1,5,27,10022);
insert into action_fact_table values(1,5,59,10023);
insert into action_fact_table values(1,5,32,10024);
insert into action_fact_table values(1,5,70,10025);
insert into action_fact_table values(1,6,32,10026);
insert into action_fact_table values(1,6,5,10027);
insert into action_fact_table values(1,6,15,10028);
insert into action_fact_table values(1,6,70,10029);
insert into action_fact_table values(1,6,43,10030);
insert into action_fact_table values(1,7,59,10031);
insert into action_fact_table values(1,7,9,10032);
insert into action_fact_table values(1,7,99,10033);
insert into action_fact_table values(1,7,79,10034);
insert into action_fact_table values(1,7,31,10035);
insert into action_fact_table values(1,8,56,10036);
insert into action_fact_table values(1,8,34,10037);
insert into action_fact_table values(1,8,48,10038);
insert into action_fact_table values(1,8,79,10039);
insert into action_fact_table values(1,8,42,10040);
insert into action_fact_table values(1,9,10,10041);
insert into action_fact_table values(1,9,10,10042);
insert into action_fact_table values(1,9,49,10043);
insert into action_fact_table values(1,9,61,10044);
insert into action_fact_table values(1,9,49,10045);
insert into action_fact_table values(1,10,99,10046);
insert into action_fact_table values(1,10,69,10047);
insert into action_fact_table values(1,10,84,10048);
insert into action_fact_table values(1,10,99,10049);
insert into action_fact_table values(1,10,3,10050);
insert into action_fact_table values(2,1,182,10051);
insert into action_fact_table values(2,1,127,10052);
insert into action_fact_table values(2,1,197,10053);
insert into action_fact_table values(2,1,174,10054);
insert into action_fact_table values(2,1,187,10055);
insert into action_fact_table values(2,1,144,10056);
insert into action_fact_table values(2,1,160,10057);
insert into action_fact_table values(2,2,155,10058);
insert into action_fact_table values(2,2,153,10059);
insert into action_fact_table values(2,2,119,10060);
insert into action_fact_table values(2,2,188,10061);
insert into action_fact_table values(2,2,125,10062);
insert into action_fact_table values(2,2,147,10063);
insert into action_fact_table values(2,2,123,10064);
insert into action_fact_table values(2,3,136,10065);
insert into action_fact_table values(2,3,163,10066);
insert into action_fact_table values(2,3,187,10067);
insert into action_fact_table values(2,3,138,10068);
insert into action_fact_table values(2,3,168,10069);
insert into action_fact_table values(2,3,132,10070);
insert into action_fact_table values(2,3,138,10071);
insert into action_fact_table values(2,4,158,10072);
insert into action_fact_table values(2,4,171,10073);
insert into action_fact_table values(2,4,153,10074);
insert into action_fact_table values(2,4,141,10075);
insert into action_fact_table values(2,4,182,10076);
insert into action_fact_table values(2,4,165,10077);
insert into action_fact_table values(2,4,143,10078);
insert into action_fact_table values(2,5,190,10079);
insert into action_fact_table values(2,5,181,10080);
insert into action_fact_table values(2,5,163,10081);
insert into action_fact_table values(2,5,134,10082);
insert into action_fact_table values(2,5,145,10083);
insert into action_fact_table values(2,5,190,10084);
insert into action_fact_table values(2,5,198,10085);
insert into action_fact_table values(2,6,137,10086);
insert into action_fact_table values(2,6,133,10087);
insert into action_fact_table values(2,6,135,10088);
insert into action_fact_table values(2,6,103,10089);
insert into action_fact_table values(2,6,187,10090);
insert into action_fact_table values(2,6,127,10091);
insert into action_fact_table values(2,6,117,10092);
insert into action_fact_table values(2,7,116,10093);
insert into action_fact_table values(2,7,139,10094);
insert into action_fact_table values(2,7,111,10095);
insert into action_fact_table values(2,7,150,10096);
insert into action_fact_table values(2,7,151,10097);
insert into action_fact_table values(2,7,181,10098);
insert into action_fact_table values(2,7,109,10099);
insert into action_fact_table values(2,8,102,10100);
insert into action_fact_table values(2,8,101,10101);
insert into action_fact_table values(2,8,118,10102);
insert into action_fact_table values(2,8,147,10103);
insert into action_fact_table values(2,8,186,10104);
insert into action_fact_table values(2,8,136,10105);
insert into action_fact_table values(2,8,160,10106);
insert into action_fact_table values(2,9,149,10107);
insert into action_fact_table values(2,9,119,10108);
insert into action_fact_table values(2,9,169,10109);
insert into action_fact_table values(2,9,176,10110);
insert into action_fact_table values(2,9,195,10111);
insert into action_fact_table values(2,9,183,10112);
insert into action_fact_table values(2,9,140,10113);
insert into action_fact_table values(3,1,224,10114);
insert into action_fact_table values(3,1,241,10115);
insert into action_fact_table values(3,2,295,10116);
insert into action_fact_table values(3,2,249,10117);
insert into action_fact_table values(3,2,260,10118);
insert into action_fact_table values(3,3,298,10119);
insert into action_fact_table values(3,3,267,10120);
insert into action_fact_table values(3,3,297,10121);
insert into action_fact_table values(3,4,211,10122);
insert into action_fact_table values(3,4,253,10123);
insert into action_fact_table values(3,4,214,10124);
insert into action_fact_table values(3,5,248,10125);
insert into action_fact_table values(3,5,223,10126);
insert into action_fact_table values(3,5,288,10127);
insert into action_fact_table values(3,6,207,10128);
insert into action_fact_table values(3,6,296,10129);
insert into action_fact_table values(3,6,221,10130);
insert into action_fact_table values(3,6,201,10131);
insert into action_fact_table values(3,6,227,10132);
insert into action_fact_table values(3,6,209,10133);
insert into action_fact_table values(3,7,267,10134);
insert into action_fact_table values(3,7,282,10135);
insert into action_fact_table values(3,7,215,10136);
insert into action_fact_table values(3,7,285,10137);
insert into action_fact_table values(3,7,212,10138);
insert into action_fact_table values(3,8,239,10139);
insert into action_fact_table values(3,8,294,10140);
insert into action_fact_table values(3,8,296,10141);
insert into action_fact_table values(3,8,251,10142);
insert into action_fact_table values(3,8,281,10143);
insert into action_fact_table values(4,2,392,10144);
insert into action_fact_table values(4,2,347,10145);
insert into action_fact_table values(4,3,318,10146);
insert into action_fact_table values(4,3,400,10147);
insert into action_fact_table values(4,3,378,10148);
insert into action_fact_table values(4,4,315,10149);
insert into action_fact_table values(4,4,318,10150);
insert into action_fact_table values(4,4,394,10151);
insert into action_fact_table values(4,4,382,10152);
insert into action_fact_table values(4,4,317,10153);
insert into action_fact_table values(4,5,314,10154);
insert into action_fact_table values(4,5,354,10155);
insert into action_fact_table values(4,5,338,10156);
insert into action_fact_table values(4,5,375,10157);
insert into action_fact_table values(4,5,317,10158);
insert into action_fact_table values(4,5,329,10159);
insert into action_fact_table values(4,5,342,10160);
insert into action_fact_table values(4,5,380,10161);
insert into action_fact_table values(4,6,313,10162);
insert into action_fact_table values(4,6,311,10163);
insert into action_fact_table values(4,6,336,10164);
insert into action_fact_table values(4,6,380,10165);
insert into action_fact_table values(4,6,355,10166);
insert into action_fact_table values(4,7,386,10167);
insert into action_fact_table values(4,7,322,10168);
insert into action_fact_table values(4,7,311,10169);
insert into action_fact_table values(4,7,367,10170);
insert into action_fact_table values(4,7,350,10171);
insert into action_fact_table values(4,7,384,10172);
insert into action_fact_table values(4,7,391,10173);
insert into action_fact_table values(4,7,331,10174);
insert into action_fact_table values(4,7,373,10175);
insert into action_fact_table values(4,7,314,10176);
insert into action_fact_table values(4,7,305,10177);
insert into action_fact_table values(4,7,331,10178);
insert into action_fact_table values(4,7,350,10179);
insert into action_fact_table values(4,7,376,10180);
insert into action_fact_table values(4,7,387,10181);
insert into action_fact_table values(4,7,312,10182);
insert into action_fact_table values(4,7,397,10183);
insert into action_fact_table values(5,1,404,10184);
insert into action_fact_table values(5,1,562,10185);
insert into action_fact_table values(5,1,511,10186);
insert into action_fact_table values(5,1,594,10187);
insert into action_fact_table values(5,1,541,10188);
insert into action_fact_table values(5,2,506,10189);
insert into action_fact_table values(5,2,427,10190);
insert into action_fact_table values(5,2,481,10191);
insert into action_fact_table values(5,2,463,10192);
insert into action_fact_table values(5,2,579,10193);
insert into action_fact_table values(5,2,455,10194);
insert into action_fact_table values(5,2,527,10195);
insert into action_fact_table values(5,3,465,10196);
insert into action_fact_table values(5,3,562,10197);
insert into action_fact_table values(5,3,434,10198);
insert into action_fact_table values(5,3,401,10199);
insert into action_fact_table values(5,3,464,10200);
insert into action_fact_table values(5,3,500,10201);
insert into action_fact_table values(5,4,554,10202);
insert into action_fact_table values(5,4,600,10203);
insert into action_fact_table values(5,5,483,10204);
insert into action_fact_table values(5,6,552,10205);
insert into action_fact_table values(5,6,565,10206);
insert into action_fact_table values(5,6,586,10207);
insert into action_fact_table values(5,6,544,10208);
insert into action_fact_table values(5,6,436,10209);
insert into action_fact_table values(5,6,531,10210);
insert into action_fact_table values(5,6,409,10211);
insert into action_fact_table values(5,6,524,10212);
insert into action_fact_table values(5,6,564,10213);
Sample cube (Mondrian):
<Schema name="RunoffTriangleSchema">
<Cube name="RunoffTriangleCube" visible="true" cache="true" enabled="true">
<Table name="action_fact_table" schema="public">
</Table>
<Dimension type="StandardDimension" visible="true" foreignKey="project_key" name="Project">
<Hierarchy name="Project" visible="true" hasAll="true">
<Table name="project" schema="public" alias="">
</Table>
<Level name="Industry" visible="true" column="industry" uniqueMembers="false">
</Level>
<Level name="Project Name" visible="true" column="project_name" uniqueMembers="false">
</Level>
</Hierarchy>
</Dimension>
<Dimension type="StandardDimension" visible="true" foreignKey="distance_key" name="Distance">
<Hierarchy name="Distance" visible="true" hasAll="true">
<Table name="distance" schema="public" alias="">
</Table>
<Level name="Distance In Quarters" visible="true" column="distance_in_quarters" uniqueMembers="false">
</Level>
<Level name="Distance In Months" visible="true" column="distance_in_months" uniqueMembers="false">
</Level>
</Hierarchy>
</Dimension>
<Dimension type="StandardDimension" visible="true" foreignKey="account_key" name="Account">
<Hierarchy name="Account" visible="true" hasAll="true">
<Table name="account" schema="public">
</Table>
<Level name="Account Key" visible="true" column="account_key" uniqueMembers="false">
</Level>
</Hierarchy>
</Dimension>
<Measure name="CountActions" column="action_id" aggregator="count" visible="true">
</Measure>
</Cube>
</Schema>
Two bounties and no answer, I am suprised. I've found a workaround solution - using SQL and BIRT engine I am now close to what I was looking for. I still hope that someone can solve this for OLAP.
To make this work, I have:
Dynamically return columns
CREATE or replace FUNCTION bizdata.getColumns(_column1 text, _column2 text, _column3 text, _column4 text, _table text, _rqdl text)
RETURNS TABLE(cmf1 text, cmf2 text, cmf3 text, outval numeric, rqdl text) AS $$
BEGIN
RETURN QUERY EXECUTE
'SELECT '
|| case when _column1 = 'None' then quote_literal('None') else quote_ident(_column1) end || '::text as cmf1,'
|| case when _column2 = 'None' then quote_literal('None') else quote_ident(_column2) end || '::text as cmf2,'
|| case when _column3 = 'None' then quote_literal('None') else quote_ident(_column3) end || '::text as cmf3,'
|| quote_ident(_column4) || '::numeric as baseline,'
|| case when _rqdl = 'None' then 0::text else quote_ident(_rqdl)::text end || '::text as rqdl'
' FROM '
|| 'bizdata.' || _table;
END;
$$ LANGUAGE plpgsql;
Thi function takes the following as input variables:
- _column1 - common mapping field number 1
- _column2 - common mapping field number 2
- _column3 - common mapping field number 3
- _column4 - column used for aggregation (sum)
- _table - table used for getting data
- _rqdl - requested distance level
Calculate data
Using bizdata.getColumns() function I can calculate triangle data using the following statement:
with
params as (
select 'cmf1'::varchar as prm_name, 'project_owner_name_short'::varchar as prm_value union all
select 'cmf2'::varchar as prm_name, 'project_source_name_short'::varchar as prm_value union all
select 'cmf3'::varchar as prm_name, 'None'::varchar as prm_value union all
select 'fact'::varchar as prm_name, 'amount'::varchar as prm_value union all
select 'fact_table'::varchar as prm_name, 'dwv_daily_allocation_fact'::varchar as prm_value union all
select 'baseline'::varchar as prm_name, 'tmp_nominal_value'::varchar as prm_value union all
select 'baseline_table'::varchar as prm_name, 'dw_project'::varchar as prm_value union all
select 'rqdl'::varchar as prm_name, 'year_distance'::varchar as prm_value
)
,baseline_data as (
select
cmf1,
cmf2,
cmf3,
sum(coalesce(outval,0)) as baseline
from
bizdata.getColumns(
(select prm_value from params where prm_name = 'cmf1'::text),
(select prm_value from params where prm_name = 'cmf2'::text),
(select prm_value from params where prm_name = 'cmf3'::text),
(select prm_value from params where prm_name = 'baseline'::text),
(select prm_value from params where prm_name = 'baseline_table'::text),
'None'
)
group by
cmf1,
cmf2,
cmf3
)
,fact_data as (
select
cmf1,
cmf2,
cmf3,
rqdl::int as rqdl,
sum(coalesce(outval,0)) as fact
from
bizdata.getColumns(
(select prm_value from params where prm_name = 'cmf1'::text),
(select prm_value from params where prm_name = 'cmf2'::text),
(select prm_value from params where prm_name = 'cmf3'::text),
(select prm_value from params where prm_name = 'fact'::text),
(select prm_value from params where prm_name = 'fact_table'::text),
(select prm_value from params where prm_name = 'rqdl'::text)
)
group by
cmf1,
cmf2,
cmf3,
rqdl
)
select
case when cmf1 = 'None' then null else cmf1 end as cmf1,
case when cmf2 = 'None' then null else cmf1 end as cmf,
case when cmf3 = 'None' then null else cmf1 end as cmf1,
rqdl,
fact,
baseline,
sum(fact) over (partition by cmf1, cmf2, cmf3 order by rqdl) as cfact,
sum(fact) over (partition by cmf1, cmf2, cmf3 order by rqdl) / baseline as cfactpct
from
fact_data
join baseline_data using (cmf1, cmf2, cmf3)
You can see that I can use up to 3 grouping variables (cmf1, cmf2, cmf3) and choose any distance attribute (as long as the attribute is available in dwv_daily_allocation_fact. Grouping variables should be available both in baseline table and fact table (to get common group level)
Report
Last step is to create report in BIRT (2.6.1) where parameters in params part of SQL are replaced by dataset parameters and linked to report parameters. Those who are using BIRT probably understand, other have to find other way.
Parameters selection GUI
Output report
I still have to figure out correct sorting of table (so groups with longest history are first.
Edit: I've figured out sorting in BIRT crosstab, now it looks like real triangle:
Let me know if you need more detailed description on how I did this.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With