I have a df where I need to flag a column as 1 if the row passed to my .apply() function matches an item in my dictionary. However, if my dictionary is empty, or doesn't include the 'Key' of the row that .apply() is processing at that moment, the script fails. How can I get past this hiccup?
import pandas as pd

df = pd.DataFrame({'Key': ['10003', '10003', '10003', '10003', '10003', '10003', '10034'],
                   'Num1': [12, 13, 30, 12, 13, 13, 16],
                   'Num2': [121, 122, 122, 124, 125, 126, 127],
                   'admit': [20150109, 20150124, 20150206, 20150211, 20150220, 20150407, 20150211],
                   'discharge': [20150123, 20150202, 20150211, 20150220, 20150304, 20150410, 20150211]})
df['admit'] = pd.to_datetime(df['admit'], format='%Y%m%d')
df['discharge'] = pd.to_datetime(df['discharge'], format='%Y%m%d')
#df=df.head(5)
script:
# For each Key, collect the discharge dates of rows whose Num1 is in range(30, 40)
d2 = df[df['Num1'].isin(range(30,40))].groupby('Key')['discharge'].apply(set).to_dict()
def find(x):
    match2 = x['admit'] in d2[x['Key']]
    return match2
df['flag'] = df.apply(find, axis=1).astype(int)
In particular, I need to flag a row when its admit date equals the discharge date of another row AND the row with the matching discharge date has a Num1 value between 30 and 40. The script works as expected if you reduce the df to its first 5 rows (df=df.head(5)). But when a row's 'Key' is not in the dictionary, the script raises an error. Would adding every 'Key' to the dictionary with blank dates make this work?
KeyError: ('10034', 'occurred at index 6')
I want to use dictionaries for this task because the rest of my function has conditions similar to this one (those were simpler). The code above works as it should on a small sample, but I have little experience with dictionaries and this is stumping me. Sorry if this is a simple and stupid question.
final df:
     Key  Num1  Num2      admit  discharge  flag
0  10003    12   121 2015-01-09 2015-01-23     0
1  10003    13   122 2015-01-24 2015-02-02     0
2  10003    30   122 2015-02-06 2015-02-11     0
3  10003    12   124 2015-02-11 2015-02-20     1
4  10003    13   125 2015-02-20 2015-03-04     0
5  10003    13   126 2015-04-07 2015-04-10     0
6  10034    16   127 2015-02-11 2015-02-11     0
You can use dict.get and return an empty list as the default when the key is missing.
Example:
def find(x):
    match2 = x['admit'] in d2.get(x['Key'], [])
    return match2
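To see the dict.get approach end to end, here is a minimal sketch on the sample data from the question. It assumes the dictionary is built from the rows whose Num1 falls in range(30, 40), as the question describes, and it writes the dates as strings, which is only a simplification for this example.

```python
import pandas as pd

# Sample data from the question (Num2 omitted; dates written as strings here).
df = pd.DataFrame({
    'Key': ['10003', '10003', '10003', '10003', '10003', '10003', '10034'],
    'Num1': [12, 13, 30, 12, 13, 13, 16],
    'admit': ['20150109', '20150124', '20150206', '20150211',
              '20150220', '20150407', '20150211'],
    'discharge': ['20150123', '20150202', '20150211', '20150220',
                  '20150304', '20150410', '20150211'],
})
df['admit'] = pd.to_datetime(df['admit'], format='%Y%m%d')
df['discharge'] = pd.to_datetime(df['discharge'], format='%Y%m%d')

# Assumption: qualifying rows are those with Num1 in range(30, 40).
# Only Key '10003' has such a row, so '10034' never appears in d2.
d2 = (df[df['Num1'].isin(range(30, 40))]
      .groupby('Key')['discharge'].apply(set).to_dict())

def find(x):
    # A missing Key falls back to an empty list, so the test is simply False.
    return x['admit'] in d2.get(x['Key'], [])

df['flag'] = df.apply(find, axis=1).astype(int)
print(df['flag'].tolist())
```

Only the row at index 3 is flagged, and the '10034' row is handled without raising a KeyError.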
Use a try / except clause to catch KeyError and specify what to return in this situation:
def find(x):
    try:
        return x['admit'] in d2[x['Key']]
    except KeyError:
        return False
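As for the idea raised in the question of adding every 'Key' to the dictionary with blank dates: that should also work. A minimal sketch, assuming d2 already holds the matching discharge dates per Key; the small df and the name d2_full are made up for illustration.

```python
import pandas as pd

# Hypothetical reduced data: one Key present in d2, one Key missing from it.
df = pd.DataFrame({
    'Key': ['10003', '10003', '10034'],
    'admit': pd.to_datetime(['2015-02-06', '2015-02-11', '2015-02-11']),
})
d2 = {'10003': {pd.Timestamp('2015-02-11')}}

# Give every Key an empty set first, then overlay the real entries.
d2_full = {key: set() for key in df['Key'].unique()}
d2_full.update(d2)

def find(x):
    # Plain indexing is safe now because every Key in df exists in d2_full.
    return x['admit'] in d2_full[x['Key']]

df['flag'] = df.apply(find, axis=1).astype(int)
print(df['flag'].tolist())
```

The trade-off is an extra pass over the keys up front, whereas dict.get or try/except deals with a missing key only at the moment it is looked up.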