
Inference engine to calculate matching set according to internal rules

I have a set of objects with attributes and a bunch of rules that, when applied to the set of objects, provide a subset of those objects. To make this easier to understand I'll provide a concrete example.

My objects are persons and each has three attributes: country of origin, gender and age group (all attributes are discrete). I have a bunch of rules, like "all males from the US", which correspond with subsets of this larger set of objects.

I'm looking for either an existing Java "inference engine" or something similar, which will be able to map from the rules to a subset of persons, or advice on how to go about creating my own. I have read up on rule engines, but that term seems to be exclusively used for expert systems that externalize the business rules, and usually doesn't include any advanced form of inferencing. Here are some examples of the more complex scenarios I have to deal with:

  1. I need the conjunction of rules. So when presented with both "include all males" and "exclude all US persons in the 10 - 20 age group," I'm only interested in the males outside of the US, and the males within the US that are outside the 10 - 20 age group.

  2. Rules may have different priorities (explicitly defined). So a rule saying "exclude all males" will override a rule saying "include all US males."

  3. Rules may be conflicting. So I could have both an "include all males" and an "exclude all males" in which case the priorities will have to settle the issue.

  4. Rules are symmetric. So "include all males" is equivalent to "exclude all females."

  5. Rules (or rather subsets) may have meta rules (explicitly defined) associated with them. These meta rules will have to be applied whenever the original rule is applied, or if the subset is reached via inferencing. So if a meta rule of "exclude the US" is attached to the rule "include all males", and I provide the engine with the rule "exclude all females," it should be able to infer that the "exclude all females" subset is equivalent to the "include all males" subset and as such apply the "exclude the US" rule additionally.

I can in all likelihood live without item 5, but I do need all the other properties mentioned. Both my rules and objects are stored in a database and may be updated at any stage, so I'd need to instantiate the 'inference engine' when needed and destroy it afterward.

Zecrates asked May 24 '10 13:05


2 Answers

For the case you're describing I think you'll want to use backward chaining rather than forward chaining (RETE systems like Drools are forward-chaining in their default behavior).

Check out tuProlog. Easy to bind with Java, 100% pure Java, and can definitely do the inferencing you want. You'll need to understand enough about Prolog to characterize your rule set.

Prova can also do inferencing and handle complex rule systems.

Ross Judson answered Sep 28 '22 22:09


There are a bunch of embedded Prolog-like SLD solvers for Java; my favourite approach is to use mini-Kanren for Scala, since that is clean and allows you to use Scala to lazily handle the results of queries, but I have not used it in any depth. See "Embedded Prolog Interpreter/Compiler for Java" for other options, as well as Ross' answer.

SLD solvers handle all of your criteria, provided they have some extra features that Prolog has:

  1. Conjunction of rules: Basic SLD goal processing;
  2. Rules may have different priorities: Prolog's cut rule allows representation of negation, provided the queries are decidable;
  3. Rules may be conflicting: Again, with cut you can ensure that lower priority clauses are not applied if higher priority goals are satisfied. There are a few ways to go about doing this.
  4. Rules are symmetric: With cut, this is easily ensured for decidable predicates.
  5. Rules (or rather subsets) may have meta rules (explicitly defined) associated with them: your example seems to suggest this is equivalent to 4, so I'm not sure I get what you are after here.
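
To make the priority idea in items 2 and 3 concrete without a Prolog engine, here is a minimal plain-Java sketch of one possible reading: the matching rule with the highest priority decides, and lower-priority matches are ignored, much like a cut. The names `RuleSketch`, `Person`, `Rule`, `selected` and `select` are hypothetical, and two things are assumptions on my part rather than anything stated in the question: priorities are distinct, and a person matched by no rule is left out. It does not attempt symmetry (item 4) or meta rules (item 5).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

public class RuleSketch {
    public enum Gender { MALE, FEMALE }

    // Hypothetical person type with the three discrete attributes from the question.
    public record Person(String country, Gender gender, String ageGroup) {}

    // Hypothetical rule type: explicit priority, include/exclude flag, and a match predicate.
    public record Rule(int priority, boolean include, Predicate<Person> matches) {}

    // The matching rule with the highest priority decides the outcome.
    // Assumption: priorities are distinct; a person matched by no rule is not selected.
    public static boolean selected(Person p, List<Rule> rules) {
        Rule winner = null;
        for (Rule r : rules) {
            if (r.matches().test(p) && (winner == null || r.priority() > winner.priority())) {
                winner = r;
            }
        }
        return winner != null && winner.include();
    }

    // Applies the rule list to every person and returns the selected subset, in input order.
    public static List<Person> select(List<Person> people, List<Rule> rules) {
        List<Person> out = new ArrayList<>();
        for (Person p : people) {
            if (selected(p, rules)) {
                out.add(p);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // "include all males" at low priority, "exclude US persons aged 10-20" at higher priority.
        List<Rule> rules = List.of(
                new Rule(1, true, p -> p.gender() == Gender.MALE),
                new Rule(2, false, p -> p.country().equals("US") && p.ageGroup().equals("10-20")));

        List<Person> people = List.of(
                new Person("US", Gender.MALE, "10-20"),
                new Person("US", Gender.MALE, "20-30"),
                new Person("UK", Gender.MALE, "10-20"),
                new Person("US", Gender.FEMALE, "10-20"));

        // Prints the selected subset, one person per line.
        select(people, rules).forEach(System.out::println);
    }
}
```

Since the rules and persons live in a database and can change at any time, a stateless evaluation like this can simply be rebuilt from the current rows on each request; whether that scales depends on how many rules and persons you have.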

The advantages and disadvantages of SLD solvers over description logic-based tools are:

  1. Programmatic power, flexibility: you can generally find programming solutions to modelling difficulties, where description logics might require you to rethink your models. But of course absence of duct-tape means that description logic solutions force you to be clean, which might be a good discipline.
  2. Robustness: SLD solvers are a very well understood technology, while description logic tools are often not many steps from their birth in a PhD thesis.
  3. Absence of semantic tools: description logic has nice links with first-order logic and modal logic, and gives you a very rich set of techniques to reason about them. The flexibility of Prolog typically makes this very hard.

If you do not have special expertise in description logic, I'd recommend an SLD solver.

Charles Stewart answered Sep 28 '22 23:09