Scikit-Learn's IsolationForest class has a decision_function method
that returns the anomaly scores of the input samples. However, the documentation does not state the possible range of these scores, saying only that "the lower [the score], the more abnormal."
Edit: after reading jmunsch's comment I looked at the source code again, and here is my updated guess:
If the exponent in the score formula is always negative, then the scores will always be between 0 and 1. Since the method returns 0.5 - scores, the returned range would be [-0.5, 0.5]. But I'm not certain that the exponent is always negative.
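The exponent question can be settled directly from equations 1 and 2 of the Isolation Forest paper. A small sketch (the helper names `c` and `anomaly_score` are my own, not scikit-learn's): the mean path depth E(h(x)) is non-negative and the normalizer c(n) is positive, so the exponent -E(h(x))/c(n) is never positive, pinning the score inside (0, 1].

```python
import math

EULER_GAMMA = 0.5772156649

def c(n):
    """Average path length of an unsuccessful BST search among n samples,
    c(n) = 2*H(n-1) - 2*(n-1)/n, per the Isolation Forest paper."""
    harmonic = math.log(n - 1) + EULER_GAMMA  # H(i) ~ ln(i) + Euler-Mascheroni
    return 2.0 * harmonic - 2.0 * (n - 1) / n

def anomaly_score(mean_depth, n):
    # mean_depth = E(h(x)) >= 0 and c(n) > 0, so the exponent -mean_depth/c(n)
    # is always <= 0, which keeps the score 2**exponent inside (0, 1].
    return 2.0 ** (-mean_depth / c(n))

# A decision_function-style value 0.5 - score is therefore bounded in [-0.5, 0.5)
for depth in (0.0, 1.0, 5.0, 50.0):
    s = anomaly_score(depth, 256)
    print(depth, round(0.5 - s, 4))
```

A depth of 0 gives the maximum score of 1 (most anomalous), and very large depths push the score toward 0, so 0.5 - score never escapes [-0.5, 0.5).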
This score is an aggregation of the depths obtained from each of the iTrees. A label of -1 is assigned to anomalies and 1 to normal points, based on the contamination parameter (the expected percentage of anomalies in the data).
max_samples is the number of random samples drawn from the original data set to build each Isolation Tree. During the test phase, sklearn's IsolationForest computes the path length of the data point under test in every trained Isolation Tree and takes the average path length.
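To illustrate max_samples and the test-phase averaging, here is a small sketch (the dataset is synthetic and the parameter values are illustrative, not recommendations):

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.RandomState(0)
X = rng.normal(size=(500, 2))                       # mostly "normal" points
X_out = rng.uniform(low=4, high=6, size=(10, 2))    # a few clear outliers

# max_samples: how many points each iTree is built from
clf = IsolationForest(n_estimators=100, max_samples=256, random_state=0)
clf.fit(np.vstack([X, X_out]))

# decision_function aggregates the average path length across all 100 trees
scores = clf.decision_function(X_out)
print(scores)
```

The outliers sit far from the training bulk, so their averaged path lengths are short and their scores are lower (more negative) than those of the normal points.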
An anomaly score is created from an anomaly resource (anomaly/id) and the new instance (input_data) for which you wish to create an anomaly score. When you create a new anomaly score, BigML.io automatically computes a score between 0 and 1; the closer the score is to 1, the more anomalous the scored instance is.
The Isolation Forest algorithm is a fast tree-based algorithm for anomaly detection. It uses the concept of path lengths in binary search trees to assign anomaly scores to each point in a dataset.
In Scikit-Learn's IsolationForest, decision_function returns values in the range [-0.5, 0.5], where -0.5 is the most anomalous.
Or so I believe; I have never seen evidence otherwise. The documentation for Scikit-Learn's IsolationForest references the paper Isolation-based Anomaly Detection by Liu et al., where equation 2 defines the anomaly score. In the paper the anomaly score ranges between 0 and 1, where 1 is most anomalous. In the scores function you reference, on line 267 the expression depths.mean(axis=1) corresponds to E(h(x)) and _average_path_length(self.max_samples_) corresponds to c(psi) in the paper. Thus on line 272, when the function returns 0.5 minus the score, we get the bounds [-0.5, 0.5].
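The bound is easy to check empirically, assuming a reasonably recent scikit-learn (internal line numbers differ across versions, but the returned range does not):

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.RandomState(42)
X = rng.normal(size=(1000, 3))

clf = IsolationForest(random_state=42).fit(X)
d = clf.decision_function(X)

# 0.5 - s with s in (0, 1] stays inside [-0.5, 0.5)
print(d.min(), d.max())
```

On any dataset I have tried, the minimum and maximum stay inside [-0.5, 0.5].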
Edit/Bonus:
The predict method of IsolationForest is effectively just comparing the decision_function values to a threshold stored in model.threshold_. So after calling the model's predict method on some data, the anomalous items are exactly those that meet the criterion model.decision_function(data) < model.threshold_.
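Note that model.threshold_ was removed in later scikit-learn releases; with the default contamination="auto" the offset is folded into decision_function itself, so the modern equivalent of the same criterion is decision_function(data) < 0. A quick sketch:

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.RandomState(7)
X = rng.normal(size=(300, 2))

clf = IsolationForest(random_state=7).fit(X)
pred = clf.predict(X)                   # -1 = anomaly, 1 = normal
flagged = clf.decision_function(X) < 0  # the same criterion, stated directly

print(int(flagged.sum()), "points flagged as anomalous")
```

The two views agree point for point: predict returns -1 exactly where decision_function is negative.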