
How to use a conditional statement based on DataFrame boolean value in pandas

Tags: python, pandas

I now know how to check a dataframe for specific values across multiple columns. However, I can't seem to work out how to carry out an if statement based on the boolean result.

For example:

Walk directories using os.walk and read in a specific file into a dataframe.

import fnmatch
import os
import pandas as pd

for root, dirs, files in os.walk(main):
    filters = '*specificfile.csv'
    for filename in fnmatch.filter(files, filters):
        df = pd.read_csv(os.path.join(root, filename), on_bad_lines='skip')  # pandas >= 1.3; formerly error_bad_lines=False

Next I check that dataframe across multiple columns. The first value is the column name (column1), and the next is the specific value I am looking for in that column (banana). I then check another column (colour) for a specific value (green). If both of these are true, I want to carry out a specific task; if not, I want to do something else.

So something like:

if (df['column1']=='banana') & (df['colour']=='green'):
    do something
else: 
    do something
iNoob asked Sep 22 '15 09:09


1 Answer

If you want to check whether any row of the DataFrame meets your conditions, you can use .any() along with your condition:

if ((df['column1']=='banana') & (df['colour']=='green')).any():
    do something

Example with a small DataFrame -

In [16]: df
Out[16]:
   A  B
0  1  2
1  3  4
2  5  6

In [17]: ((df['A']==1) & (df['B'] == 2)).any()
Out[17]: True

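.any() is needed because your condition - ((df['column1']=='banana') & (df['colour']=='green')) - returns a Series of True/False values rather than a single boolean. A plain if cannot decide the truth value of a whole Series, so pandas raises a ValueError.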
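A minimal sketch of what happens when such a Series is passed straight to `if`. The sample rows are made up for illustration; only the column names `column1` and `colour` come from the question.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "column1": ["banana", "apple", "banana"],
    "colour": ["green", "green", "yellow"],
})

# One True/False value per row.
mask = (df["column1"] == "banana") & (df["colour"] == "green")

try:
    if mask:  # a Series with several values has no single truth value
        print("matched")
    outcome = "no error"
except ValueError as exc:
    outcome = "ValueError"
    print(f"Plain `if` failed: {exc}")

# Reducing the Series to one boolean makes the `if` well defined.
any_match = bool(mask.any())
print(any_match)
```

The `try`/`except` is only there to show the error without stopping the script; in real code you would call `.any()` or `.all()` directly.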

In pandas, when you compare a Series against a scalar value, each row of that Series is compared against the scalar, and the result is a Series of True/False values, one per row. Example -

In [19]: (df['A']==1)
Out[19]:
0     True
1    False
2    False
Name: A, dtype: bool

In [20]: (df['B'] == 2)
Out[20]:
0     True
1    False
2    False
Name: B, dtype: bool

The & operator then performs a row-wise logical AND of the two Series. Example -

In [18]: ((df['A']==1) & (df['B'] == 2))
Out[18]:
0     True
1    False
2    False
dtype: bool

Now, to check whether any of the values in this Series is True, you can use .any(); to check whether all of the values are True, you can use .all().
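As a rough sketch of how this fits the if/else from the question, the snippet below branches once on `.any()` and once on `.all()`. The sample rows and the result strings are assumptions made for illustration, not part of the original post.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "column1": ["banana", "apple", "banana"],
    "colour": ["green", "green", "yellow"],
})

mask = (df["column1"] == "banana") & (df["colour"] == "green")

# True if at least one row satisfies both conditions.
if mask.any():
    result_any = "at least one green banana"
else:
    result_any = "no green bananas"

# True only if every row satisfies both conditions.
if mask.all():
    result_all = "every row is a green banana"
else:
    result_all = "some rows are not green bananas"

print(result_any)
print(result_all)
```

Inside the os.walk loop from the question, the same check could be run on each df right after it is read.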

Anand S Kumar answered Oct 19 '22 03:10