Replacing values in specific columns in a Pandas DataFrame when the number of columns is unknown

Question

I am brand new to Python and Stack Exchange. I have been trying to replace invalid values (x < -3 or x > 12) with np.nan in specific columns.

I don't know how many columns I will have to deal with, so I need general code that takes this into account. I do know, however, that the first two columns are ids and names respectively. I have searched Google and Stack Exchange but haven't found a solution that fits this specific objective.

My question is: how would one replace values found in the third column and onwards?

My dataframe looks like this:

[Image: Data]

I tried this line:

Data[Data > 12.0] = np.nan

This replaced the first two columns with NaN.

[Image: 1st attempt]

I then tried this line:

Data[(Data.iloc[(range(2,Columns))] >=12) & (Data.iloc[(range(2,Columns))]<=-3)] = np.nan

where,

Columns = len(Data.columns)

This is clearly wrong: it replaces all values in rows 2 to 6 (with Columns = 7).

[Image: 2nd attempt]

Any thoughts would be greatly appreciated.

Python 3.6.1 64bits, Qt 5.6.2, PyQt5 5.6 on Darwin

economy · Accepted Answer

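You're looking for the applymap() method, which applies a function to every element of a DataFrame. (In pandas 2.1 and later, applymap() is deprecated in favour of DataFrame.map(), which is used the same way.)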

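import numpy as np
import pandas as pd

# Select every column after the second one, however many there are
cols = Data.columns[2:]

# Replace values outside the valid range [-3, 12] with NaN,
# writing the result back so the id and name columns are kept
Data[cols] = Data[cols].applymap(lambda x: np.nan if x > 12 or x < -3 else x)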

Documentation: https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.applymap.html

This approach assumes your columns after the second contain float or int values.
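As a possible alternative to the element-wise lambda, the same replacement can be sketched with a vectorized boolean mask via DataFrame.mask(). The sample frame and its column names (id, name, m1, m2) are made up for illustration and are not taken from the question; the sketch also assumes the column labels are unique and that every column after the second is numeric.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: two identifier columns followed by numeric columns.
data = pd.DataFrame({
    "id": [1, 2, 3],
    "name": ["a", "b", "c"],
    "m1": [5.0, 13.0, -4.0],
    "m2": [12.0, -3.0, 0.5],
})

# Labels of every column from the third onwards, however many there are.
cols = data.columns[2:]

# Boolean frame that is True where a value lies outside the valid range [-3, 12].
invalid = (data[cols] < -3) | (data[cols] > 12)

# mask() replaces the entries where the condition is True (NaN by default);
# assigning back leaves the id and name columns untouched.
data[cols] = data[cols].mask(invalid)

print(data)
```

Because the comparison runs on whole columns at once, this tends to be faster than calling a Python function per element on large frames, and the boundary values -3 and 12 are kept as valid.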
