 

Use both sample_weight and class_weight simultaneously

My dataset already has weighted examples. And in this binary classification I also have far more of the first class compared to the second.

Can I use both sample_weight and further re-weight it with class_weight in the model.fit() function?

Or do I first make a new array of new_weights and pass it to the fit function as sample_weight?

Edit:

To further clarify: I already have individual weights for each sample in my dataset, and to add to the complexity, the total sum of sample weights of the first class is far greater than the total sample weights of the second class.

For example I currently have:

y = [0,0,0,0,1,1]

sample_weights = [0.01,0.03,0.05,0.02, 0.01,0.02]

so the sum of weights for class '0' is 0.11 and for class '1' is 0.03. So I should have:

class_weight = {0 : 1. , 1: 0.11/0.03}
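
A small NumPy sketch of how that ratio comes about:

import numpy as np

y = np.array([0, 0, 0, 0, 1, 1])
sample_weights = np.array([0.01, 0.03, 0.05, 0.02, 0.01, 0.02])

# total sample weight per class: {0: 0.11, 1: 0.03}
per_class = {c: sample_weights[y == c].sum() for c in np.unique(y)}

# scale class 1 up so both classes contribute equally overall
class_weight = {0: 1., 1: per_class[0] / per_class[1]}   # {0: 1.0, 1: ~3.67}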

I need to use both sample_weight AND class_weight features. If one overrides the other then I will have to create new sample_weights and then use fit() or train_on_batch().

So my question is, can I use both, or does one override the other?

asked Jan 09 '18 by user7867665


People also ask

What does Class_weight balanced do?

Balanced class weights can be calculated automatically: setting class_weight = 'balanced' adjusts the weights inversely proportional to the class frequencies in the input data.
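
For instance, a minimal scikit-learn sketch (the helper shown is sklearn's compute_class_weight; the toy arrays are just for illustration):

from sklearn.utils.class_weight import compute_class_weight
import numpy as np

y = np.array([0, 0, 0, 0, 1, 1])
# 'balanced' -> n_samples / (n_classes * count(class)), here [0.75, 1.5]
weights = compute_class_weight(class_weight='balanced', classes=np.unique(y), y=y)
class_weight = dict(zip(np.unique(y), weights))   # {0: 0.75, 1: 1.5}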

How do you use class weight in random forest?

Random forest with bootstrap class weighting: it might be interesting to change the class weighting based on the class distribution in each bootstrap sample, instead of the entire training dataset. This can be achieved by setting the class_weight argument to the value 'balanced_subsample'.
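
A minimal sketch of that in scikit-learn, on a synthetic imbalanced dataset:

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# toy 90/10 imbalanced dataset
X, y = make_classification(n_samples=1000, weights=[0.9, 0.1], random_state=0)

# 'balanced_subsample' recomputes balanced class weights from each tree's bootstrap sample
clf = RandomForestClassifier(n_estimators=100, class_weight='balanced_subsample', random_state=0)
clf.fit(X, y)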

What is class weight in machine learning?

Using class weights is a common method for addressing class imbalance in machine learning models. Class imbalance occurs when there is a discrepancy in the number of observations between classes, often resulting in one class being over-represented relative to the other.

What is Sample_weight in Sklearn?

sample_weight augments the probability estimates in the probability array ... which augments the impurity measure ... which augments how nodes are split ... which augments how the tree is built ... which augments how feature space is diced up for classification.
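
A minimal sketch of passing sample_weight to a scikit-learn tree (toy data for illustration):

from sklearn.tree import DecisionTreeClassifier
import numpy as np

X = [[0], [1], [2], [3]]
y = [0, 0, 1, 1]
# the last sample counts five times as much in the impurity computation and splits
sw = np.array([1., 1., 1., 5.])
clf = DecisionTreeClassifier().fit(X, y, sample_weight=sw)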


2 Answers

You can certainly do both if you want; the question is whether that is what you need. According to the Keras docs:

  • class_weight: Optional dictionary mapping class indices (integers) to a weight (float) value, used for weighting the loss function (during training only). This can be useful to tell the model to "pay more attention" to samples from an under-represented class.

  • sample_weight: Optional Numpy array of weights for the training samples, used for weighting the loss function (during training only). You can either pass a flat (1D) Numpy array with the same length as the input samples (1:1 mapping between weights and samples), or in the case of temporal data [...].

So given that you mention that you "have far more of the first class compared to the second", I think you should go for the class_weight parameter. There you can indicate the ratio your dataset presents, so you can compensate for the imbalanced classes. sample_weight is more for when you want to define a weight or importance for each individual data element.

For example if you pass:

class_weight = {0 : 1. , 1: 50.}

you will be saying that every sample from class 1 counts as 50 samples from class 0, therefore giving more "importance" to your elements from class 1 (as you surely have fewer of those samples). You can customize this to fit your own needs. More info on imbalanced datasets in this great question.
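
For instance, passing it to fit (just a sketch; model, x_train and y_train stand in for your own model and data):

class_weight = {0: 1., 1: 50.}
model.fit(x_train, y_train,
          epochs=10,
          batch_size=32,
          class_weight=class_weight)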

Note: To further compare both parameters, keep in mind that passing class_weight as {0:1., 1:50.} would be equivalent to passing sample_weight as [1.,1.,1.,...,50.,50.,...], given you had samples whose classes were [0,0,0,...,1,1,...].
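
Spelled out in NumPy, that expansion would look like:

import numpy as np

class_weight = {0: 1., 1: 50.}
y = np.array([0, 0, 0, 1, 1])   # toy labels
sample_weight = np.array([class_weight[c] for c in y])
# -> array([ 1.,  1.,  1., 50., 50.])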

As we can see, it is more practical to use class_weight in this case, while sample_weight is useful in more specific cases where you actually want to give an "importance" to each sample individually. Using both can also be done if the case requires it, but one has to keep their cumulative effect in mind.

Edit: As per your new question, digging into the Keras source code it seems that indeed sample_weight overrides class_weight; here is the piece of code that does it in the _standardize_weights method (line 499):

if sample_weight is not None:
    #...Does some error handling...
    return sample_weight #simply returns the weights you passed

elif isinstance(class_weight, dict):
    #...Some error handling and computations...
    #Then creates an array repeating class weight to match your target classes
    weights = np.asarray([class_weight[cls] for cls in y_classes
                          if cls in class_weight])

    #...more error handling...
    return weights

This means that you can only use one or the other, but not both. Therefore you will indeed need to multiply your sample_weights by the ratio you need to compensate for the imbalance.
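
For example, with the numbers from your question (sketch only; model and x stand in for your model and inputs):

import numpy as np

y = np.array([0, 0, 0, 0, 1, 1])
sample_weights = np.array([0.01, 0.03, 0.05, 0.02, 0.01, 0.02])
class_weight = {0: 1., 1: 0.11 / 0.03}

# fold the class ratio into the per-sample weights and pass only sample_weight
combined_weights = sample_weights * np.array([class_weight[c] for c in y])
model.fit(x, y, sample_weight=combined_weights)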


Update: As of the moment of this edit (March 27, 2020), looking at the source code of training_utils.standardize_weights() we can see that it now supports both class_weights and sample_weights:

Everything gets normalized to a single sample-wise (or timestep-wise) weight array. If both sample_weights and class_weights are provided, the weights are multiplied together.
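
So on a Keras version with this behavior, both can be passed to fit directly and are multiplied per sample (sketch only; model, x, y and the weight variables are as above):

model.fit(x, y,
          sample_weight=sample_weights,
          class_weight=class_weight,
          epochs=10)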

answered Sep 17 '22 by DarkCygnus


To add a little to DarkCygnus' answer, for those who actually need to use class weights and sample weights simultaneously:
Here is the code I use for generating sample weights when classifying multiclass temporal data in sequences
(targets is an array of shape [#temporal, #categories] with values in set(#classes); class_weights is an array of shape [#categories, #classes]).
The generated weight sequence has the same length as the targets array; the common use case in batching is to pad the targets with zeros, and the sample weights as well, up to the same size, thus making the network ignore the padded data.

import numpy as np

def multiclass_temporal_class_weights(targets, class_weights):
    # one weight per timestep: the sum of the class weights of that
    # timestep's target class in every category
    s_weights = np.ones((targets.shape[0],))
    # if we are still counting the classes, the weights do not exist yet!
    if class_weights is not None:
        for i in range(len(s_weights)):
            weight = 0.0
            for itarget, target in enumerate(targets[i]):
                weight += class_weights[itarget][int(round(target))]
            s_weights[i] = weight
    return s_weights
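
A hypothetical usage sketch (toy numbers only: 4 timesteps, 2 categories, 3 classes per category):

targets = np.array([[0, 2],
                    [1, 1],
                    [2, 0],
                    [0, 0]])
class_weights = np.array([[1., 2., 4.],    # class weights for category 0
                          [1., 3., 5.]])   # class weights for category 1

s_weights = multiclass_temporal_class_weights(targets, class_weights)
# -> array([6., 5., 5., 2.]), one weight per timestep, which can then be
#    passed to fit()/train_on_batch() as sample_weight (with
#    sample_weight_mode='temporal' when batching padded sequences)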

answered Sep 19 '22 by Holi