Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Expert System (?) algorithm

I have an algorithmic problem which can be reduced to this task:

Suppose we have a list of n diseases and m symptoms.
For each disease d and symptom s, we have one of three options:

  • the symptom is positively correlated with the disease: s => d
  • the symptom is negatively correlated with the disease: s => ~d
  • the symptom is uncorrelated with the disease

The goal of the algorithm is to create a list of yes/no questions regarding symptoms (or even better - a binary tree of questions), which can deduce the exact disease according to the symptoms.

Any references to specific algorithms, relevant software tools and even domain-specific jargon would be very appreciated.

like image 201
adamk Avatar asked Nov 04 '10 15:11

adamk


People also ask

What are the 4 components of expert system?

An expert system generally consists of four components: a knowledge base, the search or inference system, a knowledge acquisition system, and the user interface or communication system.

What are the 5 parts of an expert system?

An expert system has five basic components: knowledge base, inference engine, explanation component, user interface, and acquisition component.

What is expert system system?

An expert system is a computer program that uses artificial intelligence (AI) technologies to simulate the judgment and behavior of a human or an organization that has expertise and experience in a particular field. Expert systems are usually intended to complement, not replace, human experts.

What is expert system with example?

Expert systems are designed to solve complex problems by reasoning through bodies of knowledge, represented mainly as if-then rules rather than through conventional procedural code. Expert systems have specific knowledge to one problem domain, e.g., medicine, science, engineering, etc.


3 Answers

You case use a decision tree : http://en.wikipedia.org/wiki/Decision_tree_learning

Basically finding the optimum tree (ie which will minimize the average number of questions asked before you can identify the desease) is NP-hard.

You can can use a greedy algorithm and then try to optimise it (if you need it).

At each step you want to reduce at much as possible the number of deceases that are still "possible".

You are at the top of your tree and so you have three possible options for a given s, count the number of diseases in each option : pc nc uc.

On one side you have pc on the other nc and the uc you can't say anything (you can look at two level of your tree to have some info but for now we don't do that).

So worst case scenario, you have pc / nc + uc or pc + uc / nc, choose the s that minimize worst case (ie : lot of one side, only a few on the other).

You need to minimize abs( pc - (nc + uc)) + abs ( (pc+uc) - nc).

You now have your s for your first question and you can iteratively build your tree.

like image 108
Loïc Février Avatar answered Sep 25 '22 02:09

Loïc Février


Is your domain really this 'binary' or in fact, would you more likely want to use the correlation coefficient for each symptom/disease pair as a numeric value? This would allow strong correlations to influence the result more than weak correlations.

If so then you might want to look at Support Vector Machines that analyze data and recognize patterns.

like image 20
Ian Mercer Avatar answered Sep 21 '22 02:09

Ian Mercer


The problem is very similar to the bacteria / antibiotic question of Mycin, a forerunner to more generalized rule-based expert systems technology of the 1980s. There were other medical diagnostic programs developed with the resulting tools.

http://en.wikipedia.org/wiki/Mycin

like image 38
Roger F. Gay Avatar answered Sep 24 '22 02:09

Roger F. Gay