I'm trying to create a column in my data set such that any null values can be set to 0, and non-null values are set to 1. For starters, my column of data called '9Age', roughly speaking, looks like this:
NaN
6
5
NaN
2
NaN
3
5
4
Setting null values to 0 can be as easy as doing this:
Age0 = df['9Age'].fillna(0)
Here's the rest of my attempt. I checked whether each value is null like this:
Age1 = df['9Age'].notnull()
This gives Age1 as:
False
True
True
False
True
False
True
True
True
That is, it returns True if the observation is not null and False if it is null. Following this logic, my next step was:
AgeExist = Age1.map({'False':0, 'True': 1})
However, to my dismay, AgeExist yields
NaN
NaN
NaN
NaN
NaN
NaN
NaN
NaN
NaN
Or, a bunch of null values. Where did I go wrong, and what would be a better way to approach all of this?
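Unless I'm wildly mistaken, it's a simple matter of True is not 'True'. Your Series holds booleans, but the dictionary keys are strings, so no key matches and every value maps to NaN.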
AgeExist = Age1.map({False:0, True: 1})
Should work for you.
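To see the difference side by side, here is a minimal sketch that rebuilds the '9Age' column from the question and runs both mappings. The name `wrong` is just an illustrative variable for the string-keyed attempt, not something from the original post.

```python
import numpy as np
import pandas as pd

# Sample data mirroring the '9Age' column from the question.
df = pd.DataFrame({'9Age': [np.nan, 6, 5, np.nan, 2, np.nan, 3, 5, 4]})
Age1 = df['9Age'].notnull()  # boolean Series

# String keys never match boolean values, so every entry maps to NaN.
wrong = Age1.map({'False': 0, 'True': 1})
print(wrong.isna().all())  # True

# Boolean keys match: 0 for null, 1 for non-null.
AgeExist = Age1.map({False: 0, True: 1})
print(AgeExist.tolist())  # [0, 1, 1, 0, 1, 0, 1, 1, 1]
```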
You can convert a Series of True/False values to their integer representations using .astype:
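import pandas as pd
import numpy as np

df = pd.DataFrame()
df['col'] = [np.nan, 6, 5, np.nan]  # use np.nan; the np.NaN alias was removed in NumPy 2.0
col = df['col'].notnull()  # boolean Series: True where the value is not null
col.astype(int)  # False -> 0, True -> 1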
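Applied to the data in the question, the same idea can be written as one chained expression. This is only a sketch: the column name 'AgeExist' is borrowed from the variable name in the question, and storing the result back on the DataFrame is an assumption about what is wanted.

```python
import numpy as np
import pandas as pd

# Sample data mirroring the '9Age' column from the question.
df = pd.DataFrame({'9Age': [np.nan, 6, 5, np.nan, 2, np.nan, 3, 5, 4]})

# notnull() gives booleans; astype(int) turns False/True into 0/1.
df['AgeExist'] = df['9Age'].notnull().astype(int)
print(df['AgeExist'].tolist())  # [0, 1, 1, 0, 1, 0, 1, 1, 1]
```

This skips the intermediate dictionary entirely, so there are no keys to mistype.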