What does get_fscore() of an xgboost ML model do? [duplicate]

1 Answer

This is a metric that simply sums up how many times each feature is split on. It is analogous to the Frequency metric in the R version: https://cran.r-project.org/web/packages/xgboost/xgboost.pdf

It is about as basic a feature importance metric as you can get.

That is: how many times was this variable split on?

The code for this method shows that it simply counts how often a given feature appears as a split across all the trees.

Source: https://github.com/dmlc/xgboost/blob/master/python-package/xgboost/core.py#L953

def get_fscore(self, fmap=''):
    """Get feature importance of each feature.
    Parameters
    ----------
    fmap: str (optional)
        The name of feature map file
    """
    trees = self.get_dump(fmap)           # dump all the trees to text
    fmap = {}                             # reused as the feature -> count dict
    for tree in trees:                    # loop through the trees
        for line in tree.split('\n'):     # one node per line
            arr = line.split('[')
            if len(arr) == 1:             # no '[' means a leaf or blank line: skip
                continue
            fid = arr[1].split(']')[0]    # text between the brackets, e.g. 'f0<0.5'
            fid = fid.split('<')[0]       # drop the threshold, keep the feature name

            if fid not in fmap:           # if the feature id hasn't been seen yet
                fmap[fid] = 1             # add it
            else:
                fmap[fid] += 1            # else increment it
    return fmap                           # counts of how many times each feature was split on
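
To see the counting in isolation, here is a minimal sketch of the same parsing logic as a standalone function. It does not need xgboost installed. The name count_splits and the two sample dump strings are hypothetical, written by hand to imitate the text format that get_dump() produces; they are not output from a real model.

```python
from collections import Counter


def count_splits(tree_dumps):
    """Count how many split nodes use each feature across text-dumped trees.

    Hypothetical helper that mirrors the parsing in get_fscore() shown above.
    """
    counts = Counter()
    for tree in tree_dumps:
        for line in tree.split("\n"):
            parts = line.split("[")
            if len(parts) == 1:
                # Leaf or blank line: there is no "[feature<threshold]" part.
                continue
            condition = parts[1].split("]")[0]  # e.g. "f0<0.5"
            feature = condition.split("<")[0]   # e.g. "f0"
            counts[feature] += 1
    return dict(counts)


# Hand-written sample dumps (an assumption, not real model output).
sample_dumps = [
    "0:[f0<0.5] yes=1,no=2,missing=1\n"
    "\t1:leaf=-0.2\n"
    "\t2:[f1<1.5] yes=3,no=4,missing=3\n"
    "\t\t3:leaf=0.1\n"
    "\t\t4:leaf=0.3\n",
    "0:[f0<0.7] yes=1,no=2,missing=1\n"
    "\t1:leaf=-0.1\n"
    "\t2:leaf=0.2\n",
]

split_counts = count_splits(sample_dumps)
print(split_counts)  # {'f0': 2, 'f1': 1}
```

Here f0 is split on once in each tree and f1 once in the first tree, so the counts are 2 and 1. Features that never appear in a split are simply absent from the result. As far as I know, more recent xgboost versions implement get_fscore() by delegating to get_score() with importance_type='weight', which gives the same counts.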

113

answered Nov 15 '22 15:11

T. Scharf


Tags:

python

feature-selection

xgboost

Peter Lenaers
