I have a dataset with various columns as below:
discount   tax   total   subtotal  productid
 3.98      1.06   21.06      20      3232 
 3.98      1.06   21.06      20      3232 
 3.98       6     106        100     3498 
 3.98       6     106        100     3743 
 3.98       6     106        100     3350 
 3.98       6     106        100     3370 
 46.49     3.36   66.84      63       695
Now I need to add a new column, Class, and assign it the value 0 or 1 on the basis of the following conditions:
if:
    discount > 20%
    no tax
    total > 100
then Class will be 1
otherwise it should be 0
I have done it with a single condition, but I don't know how to accomplish it with multiple conditions.
Here's what I have tried:
df_full['Class'] = df_full['amount'].map(lambda x: 1 if x > 100 else 0)
I have taken a look at all the other similar questions but couldn't find a solution for my problem. I have tried the approaches from all of the above-mentioned posts but am stuck on this error:
TypeError: '>' not supported between instances of 'str' and 'int'
In the case of the first posted answer, I tried it as:
df_full['class'] = np.where( ( (df_full['discount'] > 20) & (df_full['tax'] == 0 ) & (df_full['total'] > 100) & df_full['productdiscount'] ) , 1, 0)
Using the apply() method: if you need to apply a function over an existing column in order to compute values that will be added as a new column in the existing DataFrame, the pandas.DataFrame.apply() method should do the trick.
Method 1: Using DataFrame.loc[]. With this method, we can access a group of rows or columns with a condition or a boolean array, and anything we can access we can also modify. With the DataFrame.loc[] function in pandas we can select a column and change its values based on a condition.
Using loc to filter with multiple conditions: the loc function in pandas can be used to access groups of rows or columns by label. Wrap each condition you want included in the filtered result in parentheses and combine them with the & operator. The result is a pandas DataFrame containing the filtered rows.
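As a minimal sketch of filtering with several conditions through loc, the snippet below builds a small frame modelled on the table in the question. The last row is an illustrative assumption added so that at least one row satisfies every condition; it is not part of the original data.

```python
import pandas as pd

# Sample rows modelled on the question's table; the last row is hypothetical.
df = pd.DataFrame({
    "discount": [3.98, 3.98, 46.49, 25.0],
    "tax": [1.06, 6.0, 3.36, 0.0],
    "total": [21.06, 106.0, 66.84, 150.0],
})

# Each comparison is parenthesized because & binds more tightly than > or ==.
filtered = df.loc[(df["discount"] > 20) & (df["tax"] == 0) & (df["total"] > 100)]
print(filtered)
```

Only the rows for which all three comparisons are true are kept in filtered.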
You can apply an arbitrary function across a dataframe row using DataFrame.apply.
In your case, you could define a function like:
def conditions(s):
    if (s['discount'] > 20) and (s['tax'] == 0) and (s['total'] > 100):
        return 1
    else:
        return 0
And use it to add a new column to your data:
df_full['Class'] = df_full.apply(conditions, axis=1)
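Row-wise apply can be slow on large frames. A vectorized alternative, sketched here on the assumption that all three conditions must hold together and that the columns are already numeric, builds one boolean mask and casts it to integers. The sample values are illustrative, not taken from the real dataset.

```python
import pandas as pd

# Illustrative data only; the third row is made up so one row qualifies.
df_full = pd.DataFrame({
    "discount": [3.98, 46.49, 25.0],
    "tax": [1.06, 3.36, 0.0],
    "total": [21.06, 66.84, 150.0],
})

# Combine the three element-wise comparisons into a single boolean Series.
mask = (df_full["discount"] > 20) & (df_full["tax"] == 0) & (df_full["total"] > 100)

# True/False becomes 1/0 when cast to int.
df_full["Class"] = mask.astype(int)
print(df_full)
```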
Judging by the image of your data, it is rather unclear what you mean by a discount of 20%.
However, you can likely do something like this.
df['class'] = 0 # add a class column with 0 as default value
# find all rows that fulfill your conditions and set class to 1
df.loc[(df['discount'] / df['total'] > .2) & # if discount is more than .2 of total 
       (df['tax'] == 0) & # if tax is 0
       (df['total'] > 100), # if total is > 100 
       'class'] = 1 # then set class to 1
Note that & means "and" here; if you want "or" instead, use |.
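The TypeError quoted in the question ('>' not supported between instances of 'str' and 'int') suggests that at least one of the compared columns holds strings rather than numbers. A possible fix, assuming the values are numeric text, is to convert the columns with pd.to_numeric before comparing. The frame below is a hypothetical stand-in for the real data.

```python
import numpy as np
import pandas as pd

# Hypothetical frame whose numeric columns were read in as strings.
df = pd.DataFrame({
    "discount": ["3.98", "46.49", "25"],
    "tax": ["1.06", "3.36", "0"],
    "total": ["21.06", "66.84", "150"],
})

cols = ["discount", "tax", "total"]
# errors="coerce" turns unparseable entries into NaN instead of raising.
df[cols] = df[cols].apply(pd.to_numeric, errors="coerce")

# With numeric dtypes in place, the comparisons no longer raise TypeError.
df["class"] = np.where(
    (df["discount"] > 20) & (df["tax"] == 0) & (df["total"] > 100), 1, 0
)
print(df)
```

Checking df.dtypes before and after the conversion is a quick way to confirm which columns were stored as text.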