Handling NaN Values in Pandas with Conditional Statement

Tags:

python

pandas

I'm working with some data where the customer postcode data is invalid. As a result, I can't map those postcodes to a CountryISOCode, which leaves a NaN in that column. However, I have noticed that for every row where CountryISOCode is NaN, the CurrencyCode gives me enough information to fix the problem for now.

I've read various Stack Overflow posts but I cannot find the solution to my problem. I've tried...

def func(row):
    if row['CountryISOCode'] == np.nan & row['Currency'] == 'EUR':
        return 'IRE'
    elif row['CountryISOCode'] == np.nan & row['Currency'] == 'GBP':
        return 'GBR'
    else:
        return row['CountryISOCode']

df['CountryISOCode'] = df.apply(func, axis=1)

and some other methods but to no avail...

Below I have provided a replication of the data I'm working with

import pandas as pd
import numpy as np

data = [
    ['Steve', 'Invalid Postcode', 'GBP', np.nan ],
    ['Robyn', 'Invalid Postcode', 'EUR', np.nan],
    ['James', 'Valid Postcode', 'GBP', 'GBR'],
    ['Halo', 'Invalid Postcode', 'EUR', np.nan],
    ['Jesus', 'Valid Postcode', 'GBP', 'GBR']
    ]

df = pd.DataFrame(columns=["Name", "PostCode", "CurrencyCode", "CountryISOCode"], data=data)

Essentially if I was working with SQL my code would be as follows.

IF countryISOCode IS NULL
    AND currency = 'GBP'
THEN CountryISOCode = 'GBR'
ELSE
IF countryISOCode IS NULL
    AND currency = 'EUR'
THEN CountryISOCode = 'IRE'
ELSE countryISOCode
END

Any ideas?

Ryan Davies asked Jan 29 '19 15:01



3 Answers

You can use fillna with a dictionary specifying mappings for when currency code is helpful:

cmap = {'GBP': 'GBR', 'EUR': 'IRE'}
df['CountryISOCode'] = df['CountryISOCode'].fillna(df['CurrencyCode'].map(cmap))

print(df)

    Name          PostCode CurrencyCode CountryISOCode
0  Steve  Invalid Postcode          GBP            GBR
1  Robyn  Invalid Postcode          EUR            IRE
2  James    Valid Postcode          GBP            GBR
3   Halo  Invalid Postcode          EUR            IRE
4  Jesus    Valid Postcode          GBP            GBR
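
To see why this works, it can help to look at the intermediate Series that `map` produces. The sketch below rebuilds part of the sample frame from the question and adds one hypothetical row (`'Ana'` with currency `'USD'`, which is not in the original data) to illustrate an assumption worth checking: a currency that is missing from `cmap` maps to NaN, so `fillna` leaves that row unfilled.

```python
import numpy as np
import pandas as pd

data = [
    ['Steve', 'Invalid Postcode', 'GBP', np.nan],
    ['Robyn', 'Invalid Postcode', 'EUR', np.nan],
    ['James', 'Valid Postcode', 'GBP', 'GBR'],
    ['Ana', 'Invalid Postcode', 'USD', np.nan],  # hypothetical row, not in the question
]
df = pd.DataFrame(data, columns=["Name", "PostCode", "CurrencyCode", "CountryISOCode"])

cmap = {'GBP': 'GBR', 'EUR': 'IRE'}

# map() looks up every currency in cmap; currencies without an entry become NaN
fallback = df['CurrencyCode'].map(cmap)

# fillna() only touches rows where CountryISOCode is missing, aligned by index
filled = df['CountryISOCode'].fillna(fallback)

print(fallback)
print(filled)
```

Rows that already have a country code are never overwritten, because `fillna` only consults `fallback` where the original value is missing.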
jpp answered Oct 11 '22 18:10


You could use np.select for this, which allows you to choose from a list depending on the result of a list of conditions:

m1 = df.CountryISOCode.isna()
m2 = df.CurrencyCode.eq('GBP')
m3 = df.CurrencyCode.eq('EUR')
df.loc[:,'CountryISOCode'] = np.select([m1&m2, m1&m3], ['GBR','IRE'], 
                                       default=df.CountryISOCode)

    Name          PostCode CurrencyCode CountryISOCode
0  Steve  Invalid Postcode          GBP            GBR
1  Robyn  Invalid Postcode          EUR            IRE
2  James    Valid Postcode          GBP            GBR
3   Halo  Invalid Postcode          EUR            IRE
4  Jesus    Valid Postcode          GBP            GBR
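
The same conditional assignment can also be written with boolean masks and `.loc`, which some readers find closer to the SQL in the question. This is an alternative sketch rather than part of the answer above; the variable name `missing` is my own, and the data is the sample frame from the question.

```python
import numpy as np
import pandas as pd

data = [
    ['Steve', 'Invalid Postcode', 'GBP', np.nan],
    ['Robyn', 'Invalid Postcode', 'EUR', np.nan],
    ['James', 'Valid Postcode', 'GBP', 'GBR'],
    ['Halo', 'Invalid Postcode', 'EUR', np.nan],
    ['Jesus', 'Valid Postcode', 'GBP', 'GBR'],
]
df = pd.DataFrame(data, columns=["Name", "PostCode", "CurrencyCode", "CountryISOCode"])

# Compute the "is missing" mask once, before any assignment changes the column
missing = df['CountryISOCode'].isna()

# Assign only where the country is missing AND the currency matches
df.loc[missing & df['CurrencyCode'].eq('GBP'), 'CountryISOCode'] = 'GBR'
df.loc[missing & df['CurrencyCode'].eq('EUR'), 'CountryISOCode'] = 'IRE'

print(df)
```

Rows that match neither mask keep their existing value, which plays the role of the `default` argument in `np.select`.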
yatu answered Oct 11 '22 18:10


I am adding this answer because it explains why the original code fails. The comparisons in the question never match because np.nan == np.nan evaluates to False: NaN is not equal to anything, including itself. You can check for the identity of the NaN element (with is) but not for equality. See "in operator, float("NaN") and np.nan" for more detail. Note also that & binds more tightly than ==, so the conditions need and (or parentheses) to be read as intended. With that said, this is how you can transform the original code to make it work as expected.

import pandas as pd
import numpy as np

raw_data = [
    ['Steve', 'Invalid Postcode', 'GBP', np.nan ],
    ['Robyn', 'Invalid Postcode', 'EUR', np.nan],
    ['James', 'Valid Postcode', 'GBP', 'GBR'],
    ['Halo', 'Invalid Postcode', 'EUR', np.nan],
    ['Jesus', 'Valid Postcode', 'GBP', 'GBR']
    ]

df = pd.DataFrame(columns=["Name", "PostCode", "Currency", "CountryISOCode"], data=raw_data)

def func(row):
    if row['CountryISOCode'] is np.nan and row['Currency'] == 'EUR':
        return 'IRE'
    elif row['CountryISOCode'] is np.nan and row['Currency'] == 'GBP':
        return 'GBR'
    else:
        return row['CountryISOCode']

df['CountryISOCode'] = df.apply(func, axis=1)

print(df)

However, the other answers are great also.
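
One caveat on the identity check: `is np.nan` only matches the exact `np.nan` object, so a NaN that arrives from elsewhere (for example `float('nan')`) would slip through. A minimal sketch of a more defensive variant, assuming the same `Currency` column name as above, uses `pd.isna` instead; the function name `fill_country` is my own.

```python
import numpy as np
import pandas as pd

# NaN never compares equal, and a NaN created elsewhere is a different object
equal_to_itself = (np.nan == np.nan)
other_nan = float('nan')
same_object = other_nan is np.nan
detected = pd.isna(other_nan)
print(equal_to_itself, same_object, detected)

raw_data = [
    ['Steve', 'Invalid Postcode', 'GBP', np.nan],
    ['Robyn', 'Invalid Postcode', 'EUR', float('nan')],  # a different NaN object
    ['James', 'Valid Postcode', 'GBP', 'GBR'],
    ['Halo', 'Invalid Postcode', 'EUR', np.nan],
    ['Jesus', 'Valid Postcode', 'GBP', 'GBR'],
]
df = pd.DataFrame(raw_data, columns=["Name", "PostCode", "Currency", "CountryISOCode"])

def fill_country(row):
    # pd.isna() is True for any missing value, regardless of which object it is
    if pd.isna(row['CountryISOCode']) and row['Currency'] == 'EUR':
        return 'IRE'
    elif pd.isna(row['CountryISOCode']) and row['Currency'] == 'GBP':
        return 'GBR'
    else:
        return row['CountryISOCode']

result = df.apply(fill_country, axis=1)
print(result)
```

Row-wise `apply` is easy to read but runs a Python function per row, so on large frames a vectorised approach is likely to be faster.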

Rachel answered Oct 11 '22 18:10