
Drools performance for decision tables

Tags: java, drools

I am seeing a potential performance and memory bottleneck when calculating insurance premiums with the Drools engine.

I use Drools in my project to separate business logic from Java code, and I decided to use it for premium calculation too.

  • Am I using Drools the wrong way?
  • How can I meet the requirements in a more performant way?

Details below:


Calculations

I have to calculate the insurance premium for a given contract.

A contract is configured with

  • productCode (code from dictionary)
  • contractCode (code from dictionary)
  • client’s personal data (e.g. age, address)
  • insurance sum (SI)
  • etc.

At the moment, the premium is calculated using this formula:

premium := SI * px * (1 + py) / pz

where:

  • px is a factor parameterized in an Excel file; it depends on 2 properties (client’s age and sex)
  • py is a factor parameterized in an Excel file; it depends on 4 of the contract’s properties
  • pz is parameterized similarly
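
To make the arithmetic concrete, here is a minimal Java sketch of the formula with made-up factor values. It is only an illustration: in the design described in this question the formula lives in the rules, not in Java. The class name PremiumFormulaSketch and the HALF_UP rounding mode are assumptions, since the post does not name a rounding mode.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Hypothetical helper for illustration only; not part of the original project.
final class PremiumFormulaSketch {

    // premium := SI * px * (1 + py) / pz, rounded to 2 decimal places.
    // HALF_UP is an assumption; the post does not state the rounding mode.
    static BigDecimal premium(double si, double px, double py, double pz) {
        double raw = si * px * (1 + py) / pz;
        return BigDecimal.valueOf(raw).setScale(2, RoundingMode.HALF_UP);
    }

    public static void main(String[] args) {
        // Made-up inputs: SI = 100000, px = 0.5, py = 0.25, pz = 2
        // 100000 * 0.5 * 1.25 / 2 = 31250
        System.out.println(premium(100_000, 0.5, 0.25, 2)); // prints 31250.00
    }
}
```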

Requirements

  • R1 - Java code doesn’t know the formula,
  • R2 - Java code knows nothing about the formula’s dependencies, in other words that the premium depends on px, py, pz,
  • R3 - Java code knows nothing about the parameters’ dependencies, i.e. that px depends on the client’s age and sex, and so on.

With R1, R2 and R3 implemented, the Java code is separated from the business logic, and any business analyst (BA) can modify the formula and add new dependencies without a redeploy.


My solution, so far

I have a contract domain model consisting of the classes Contract, Product, Client, Policy and so on. The Contract class is defined as:

public class Contract {

    String code;           // contractCode
    double sumInsured;     // SI
    String clientSex;      // M, F
    int professionCode;    // code from dictionary
    int policyYear;        // 1..5
    int clientAge;         // 
    ...                    // etc.
}

In addition, I introduced a Var class, a container for any parameterized variable:

public class Var {

    public final String name;
    public final ContractPremiumRequest request;

    private double value;       // calculated value
    private boolean ready;      // true if value is calculated

    public Var(String name, ContractPremiumRequest request) {
        this.name = name;
        this.request = request;
    }

    ... 
    public void setReady(boolean ready) {
        this.ready = ready;
        request.check();
    }

    ...
    // getters, setters
}

And finally, the request class:

public class ContractPremiumRequest {

    public static enum State {
        INIT,
        IN_PROGRESS,
        READY
    }

    public final Contract contract;

    private State state = State.INIT;

    // all dependencies (parameterized factors, e.g. px, py, ...)
    private Map<String, Var> varMap = new TreeMap<>();

    // calculated response - premium value 
    private BigDecimal value;

    public ContractPremiumRequest(Contract contract) {
        this.contract = contract;
    }

    // true if *all* vars are ready
    private boolean _isReady() {
        for (Var var : varMap.values()) {
            if (!var.isReady()) {
                return false;
            }
        }
        return true;
    }

    // check if should modify state
    public void check() {
        if (_isReady()) {
            setState(State.READY);
        }
    }

    // read number from var with given [name]
    public double getVar(String name) {
        return varMap.get(name).getValue();
    }

    // adding uncalculated factor to this request – makes request IN_PROGRESS
    public Var addVar(String name) {
        Var var = new Var(name, this);
        varMap.put(name, var);

        setState(State.IN_PROGRESS);
        return var;
    }

    ...
    // getters, setters
}

Now I can use these classes in the following flow:

  1. request = new ContractPremiumRequest(contract)
    • creates request with state == INIT
  2. px = request.addVar( "px" )
    • creates Var("px") with ready == false
    • moves request to state == IN_PROGRESS
  3. py = request.addVar( "py" )
  4. px.setValue( factor ), px.setReady( true )
    • set calculated value on px
    • makes it ready == true
  5. request.check() makes state == READY if ALL vars are ready
  6. now we can use formula, as request has all dependencies calculated
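
The flow above can be traced in plain Java. The sketch below uses trimmed stand-ins for the Var and ContractPremiumRequest classes (the Contract reference and the premium value are left out), so the names FlowSketch and Request are hypothetical and the factor values are made up.

```java
import java.util.Map;
import java.util.TreeMap;

// Trimmed stand-ins for Var and ContractPremiumRequest, for illustration only.
final class FlowSketch {

    enum State { INIT, IN_PROGRESS, READY }

    static final class Request {
        private State state = State.INIT;
        private final Map<String, Var> varMap = new TreeMap<>();

        // Registering an uncalculated factor moves the request to IN_PROGRESS
        Var addVar(String name) {
            Var var = new Var(name, this);
            varMap.put(name, var);
            state = State.IN_PROGRESS;
            return var;
        }

        // Moves to READY only when every registered var has been calculated
        void check() {
            if (varMap.values().stream().allMatch(Var::isReady)) {
                state = State.READY;
            }
        }

        State getState() { return state; }
    }

    static final class Var {
        final String name;
        final Request request;
        private double value;
        private boolean ready;

        Var(String name, Request request) {
            this.name = name;
            this.request = request;
        }

        void setValue(double value) { this.value = value; }
        double getValue() { return value; }
        boolean isReady() { return ready; }

        // Marking a var ready triggers the request's state check
        void setReady(boolean ready) {
            this.ready = ready;
            request.check();
        }
    }

    public static void main(String[] args) {
        Request request = new Request();          // state == INIT
        Var px = request.addVar("px");            // state == IN_PROGRESS
        Var py = request.addVar("py");

        px.setValue(0.5);                         // made-up factor
        px.setReady(true);
        System.out.println(request.getState());   // prints IN_PROGRESS (py pending)

        py.setValue(0.25);                        // made-up factor
        py.setReady(true);
        System.out.println(request.getState());   // prints READY
    }
}
```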

I have created 2 DRL rules and prepared 3 decision tables (px.xls, py.xls, ...) with factors provided by BA.

Rule1 - contract_premium_prepare.drl:

rule "contract premium request - prepare dependencies"
when
  $req : ContractPremiumRequest (state == ContractPremiumRequest.State.INIT)
then
  insert( $req.addVar("px") ); 
  insert( $req.addVar("py") ); 
  insert( $req.addVar("pz") ); 
  $req.setState(ContractPremiumRequest.State.IN_PROGRESS);
end

Rule2 - contract_premium_calculate.drl:

rule "contract premium request - calculate premium"
when 
  $req : ContractPremiumRequest (state == ContractPremiumRequest.State.READY)
then 
  double px = $req.getVar("px"); 
  double py = $req.getVar("py");
  double pz = $req.getVar("pz");
  double si = $req.contract.getSumInsured();  

  // use formula to calculate premium 
  double premium = si * px * (1 + py) / pz; 

  // round to 2 digits 
  $req.setValue(premium);
end

Decision table px.xls:

[image: fragment of the px.xls decision table]

Decision table py.xls:

[image: fragment of the py.xls decision table]

The KieContainer is constructed once on startup:

DecisionTableConfiguration dtconf = KnowledgeBuilderFactory.newDecisionTableConfiguration();
dtconf.setInputType(DecisionTableInputType.XLS);
KieServices ks = KieServices.Factory.get();
KieContainer kc = ks.getKieClasspathContainer();

Now, to calculate the premium for a given contract, we write:

ContractPremiumRequest request = new ContractPremiumRequest(contract);  // state == INIT
kc.newStatelessKieSession("session-rules").execute(request);
BigDecimal premium = request.getValue();

This is what happens:

  • Rule1 fires for ContractPremiumRequest[INIT]
  • this rule creates and adds the px, py and pz dependencies (Var objects)
  • the matching Excel row fires for each of the px, py, pz objects and makes it ready
  • Rule2 fires for ContractPremiumRequest[READY] and applies the formula

Volumes

  • the PX decision table has ~100 rows,
  • the PY decision table has ~8000 rows,
  • the PZ decision table has ~50 rows.

My results

  • The first calculation, which loads and initializes the decision tables, takes ~45 seconds; this might become problematic.

  • Each calculation (after some warmup) takes ~0.8 ms, which is acceptable for our team.

  • Heap consumption is ~150 MB, which is problematic because we expect much larger tables to be used.


Question

  • Am I using Drools the wrong way?
  • How can I meet the requirements in a more performant way?
  • How can I optimize memory usage?

       

========== EDIT (after 2 years) ==========

This is a short summary after 2 years.

Our system has grown considerably, as we expected. We have ended up with more than 500 tables (or matrices) holding insurance pricing, actuarial factors, coverage configs etc. Some tables have more than 1 million rows. We used Drools, but we couldn't resolve the performance problems.

In the end we switched to the Hyperon engine (http://hyperon.io).

This system is a beast: it allows us to run hundreds of rule matches in approx. 10 ms total.

We were even able to trigger a full policy recalculation on every KeyType event on UI fields.

As we have learnt, Hyperon uses fast in-memory indexes for each rule table, and these indexes are somehow compacted so that they have almost no memory footprint.
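
The general idea of an in-memory index over a factor table can be sketched with a plain hash map. This is not Hyperon's implementation, which is not described here; it only illustrates why a keyed lookup stays cheap as the table grows. The class PxIndexSketch, its methods and the factor values are all hypothetical.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.OptionalDouble;

// Hypothetical illustration of a keyed lookup; NOT how Hyperon works internally.
final class PxIndexSketch {

    // Composite key (sex, age) -> px factor
    private final Map<List<Object>, Double> index = new HashMap<>();

    private static List<Object> key(String sex, int age) {
        return Arrays.<Object>asList(sex, age);
    }

    void put(String sex, int age, double px) {
        index.put(key(sex, age), px);
    }

    // Average-case constant-time lookup, regardless of the number of rows
    OptionalDouble lookup(String sex, int age) {
        Double px = index.get(key(sex, age));
        return px == null ? OptionalDouble.empty() : OptionalDouble.of(px);
    }

    public static void main(String[] args) {
        PxIndexSketch px = new PxIndexSketch();
        px.put("M", 30, 0.5);   // made-up factors
        px.put("F", 30, 0.25);

        System.out.println(px.lookup("M", 30).getAsDouble()); // prints 0.5
        System.out.println(px.lookup("M", 99).isPresent());   // prints false
    }
}
```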

We have one more benefit now: all pricing, factor and config tables can be modified online (both values and structure), and this is fully transparent to the Java code. The application just continues to work with the new logic; no development or restart is needed.

However, we needed some time and effort to get to know Hyperon well enough :)

I found a comparison made by our team a year ago. It shows engine initialization (Drools/Hyperon) and 100k simple calculations from the jvisualvm perspective:

[images: policy calculation with Drools; policy calculation with Hyperon]

asked Sep 01 '16 16:09 by przemek hertel




1 Answer

The problem is that you have created a huge amount of code (all the rules resulting from the tables) for what is a relatively small amount of data. I have seen similar cases, and they all benefited from inserting the tables as data. PxRow, PyRow and PzRow should be defined like this:

class PxRow { 
    private String gender;
    private int age;
    private double px;
    // Constructor (3 params) and getters
} 

Data can still be in (simpler) spreadsheets or anything else you fancy for data entry by the BA boffins. You insert all rows as facts PxRow, PyRow, PzRow. Then you need one or two rules:
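
As a rough sketch of "tables as data", the rows could be read from a simple text export and turned into PxRow objects before being handed to the session together with the Contract. The comma-separated layout "gender,age,px", the class PxRowLoaderSketch and the sample values are assumptions made for illustration; the PxRow class mirrors the one shown above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical loader; the "gender,age,px" line format is an assumption.
final class PxRowLoaderSketch {

    // Same shape as the PxRow class in the answer
    static final class PxRow {
        private final String gender;
        private final int age;
        private final double px;

        PxRow(String gender, int age, double px) {
            this.gender = gender;
            this.age = age;
            this.px = px;
        }

        String getGender() { return gender; }
        int getAge() { return age; }
        double getPx() { return px; }
    }

    // Turns each non-blank line into one PxRow fact
    static List<PxRow> parse(List<String> lines) {
        List<PxRow> rows = new ArrayList<>();
        for (String line : lines) {
            if (line.trim().isEmpty()) {
                continue; // skip blank lines
            }
            String[] cells = line.split(",");
            rows.add(new PxRow(
                    cells[0].trim(),
                    Integer.parseInt(cells[1].trim()),
                    Double.parseDouble(cells[2].trim())));
        }
        return rows;
    }

    public static void main(String[] args) {
        // Made-up sample rows
        List<PxRow> rows = parse(Arrays.asList("M,30,0.5", "F,30,0.25"));
        System.out.println(rows.size()); // prints 2
    }
}
```

With this approach the engine holds a few thousand small objects and a handful of rules, rather than one compiled rule per spreadsheet row.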

rule calculate
when 
    $c: Contract( $cs: clientSex, $ca: clientAge,
                  $pc: professionCode, $polYear: policyYear,...
                  ...
                  $si: sumInsured )

    PxRow( gender == $cs, age == $ca, $px: px )
    PyRow( profCode == $pc, polYear == $polYear,... $py: py )
    PzRow( ... $pz: pz )
then
    double premium = $si * $px * (1 + $py) / $pz; 
    // round to 2 digits 
    modify( $c ){ setPremium( premium ) }
end

Forget the flow and all the other decorations. But you may need another rule just in case your Contract doesn't match Px or Py or Pz:

rule "no match"
salience -100
when
    $c: Contract( premium == null ) // or 0.00
then
    // diagnostic
end

answered Oct 06 '22 16:10 by laune