Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Replace certain value in pandas Dataframe without knowing neither column nor row

I would like to replace a value in my Pandas dataframe in Python. (replace float with string). I know the value itself, but not the column nor the row and want to run it afterwards with different inputs. I have the following dataframe:

     P1899       P3486      P4074      P3352       P3500      P3447
Time                                                                
1997  100.0   89.745739  85.198939  87.377584  114.755270  81.131599
1998  100.0  101.597557  83.468442  86.369083  106.031629  95.263796
1999  100.0   97.234551  91.262551  88.759609  104.539337  95.859980
2000  100.0  100.759918  74.236098  88.295711  103.739557  90.272329
2001  100.0   96.873469  86.075067  87.530995  106.371072  91.807542
2002  100.0   95.000000  90.313561  82.699342  109.279845  94.444444

now I want to replace values larger than 110 with 'OVER' and smaller than 90 with 'UNDER'. I used the following, since I couldn't get any results with a for loop. I used lambda:

annual_rainfall_perc = annual_rainfall_perc.apply(lambda x: np.where(x > 110, 2000, x))
annual_rainfall_perc = annual_rainfall_perc.apply(lambda x: np.where(x < 90, 'UNDER', round(x, 2)))

Here I replaced all bigger values with 2000, because otherwise the second lambda won't be able to check a dataframe containing floats and strings... My dataframe now looks like the following:

     P1899   P3486  P4074  P3352   P3500  P3447
Time                                            
1997  100.0   Under  Under  Under  2000.0  Under
1998  100.0   101.6  Under  Under  106.03  95.26
1999  100.0   97.23  91.26  Under  104.54  95.86
2000  100.0  100.76  Under  Under  103.74  90.27
2001  100.0   96.87  Under  Under  106.37  91.81
2002  100.0    95.0  90.31  Under  109.28  94.44

So now I was planning to replace all values equal to 2000 with 'OVER'. How do I do that?

I tried:

for x in annual_rainfall_perc:
    for i in x:
        if i == 2000:
            annual_rainfall_perc[x][i]= 'Over'
        else:
            annual_rainfall_perc=annual_rainfall_perc
print(annual_rainfall_perc)

but nothing in the dataframe changed. Is there another way to this?

like image 913
ma2020 Avatar asked Jun 22 '20 13:06

ma2020


People also ask

How can I replace multiple values with one value in pandas?

To replace multiple values in a DataFrame we can apply the method DataFrame. replace(). In Pandas DataFrame replace method is used to replace values within a dataframe object.


2 Answers

Very simple using mask:

df.mask(df>110,'OVER').mask(df<90,'UNDER')

Result:

      P1899    P3486    P4074  P3352    P3500    P3447
Time                                                  
1997    100    UNDER    UNDER  UNDER     OVER    UNDER
1998    100  101.598    UNDER  UNDER  106.032  95.2638
1999    100  97.2346  91.2626  UNDER  104.539    95.86
2000    100   100.76    UNDER  UNDER   103.74  90.2723
2001    100  96.8735    UNDER  UNDER  106.371  91.8075
2002    100       95  90.3136  UNDER   109.28  94.4444
like image 199
Stef Avatar answered Sep 30 '22 18:09

Stef


Here's a way to do it in a vectorized manner. Do all the strings operations in a separate data frame, and then assign the relevant values in one go:

new_df = df.copy()

new_df.loc[:, :] = " "
new_df[df > 110] = "over"
new_df[df < 90] = "under"

df[(df < 90) | (df > 110)] = new_df

The result:

      P1899    P3486    P4074  P3352    P3500    P3447
Time                                                  
1997  100.0    under    under  under     over    under
1998  100.0  101.598    under  under  106.032  95.2638
1999  100.0  97.2346  91.2626  under  104.539    95.86
2000  100.0   100.76    under  under   103.74  90.2723
2001  100.0  96.8735    under  under  106.371  91.8075
2002  100.0       95  90.3136  under   109.28  94.4444
like image 25
Roy2012 Avatar answered Sep 30 '22 17:09

Roy2012