I am brand new to Python and stacks exchange. I have been trying to replace invalid values ( x<-3 and x>12) with np.nan in specific columns.
I don't know how many columns I will have to deal with and thus will have to create a general code that takes this into account. I do however know, that the first two columns are ids and names respectively. I have searched google and stacks exchange for a solution but haven't been able to find a solution that solves my specific objective.
My question is; How would one replace values found in the third column and onwards?
My dataframe looks like this;
Data
I tried this line:
Data[Data > 12.0] = np.nan.
this replaced the first two columns with nan
1st attempt
I tried this line:
Data[(Data.iloc[(range(2,Columns))] >=12) & (Data.iloc[(range(2,Columns))]<=-3)] = np.nan
where,
Columns = len(Data.columns)
This is clearly wrong replacing all values in rows 2 to 6 (Columns = 7).
2nd attempt
Any thoughts would be greatly appreciated.
Python 3.6.1 64bits, Qt 5.6.2, PyQt5 5.6 on Darwin
You're looking for the applymap() method.
import pandas as pd
import numpy as np
# get the columns after the second one
cols = Data.columns[2:]
# apply mask to those columns
new_df = Data[cols].applymap(lambda x: np.nan if x > 12 or x <= -3 else x)
Documentation: https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.applymap.html
This approach assumes your columns after the second contain float or int values.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With