Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Pandas replace multiple values at once

Tags:

python

pandas

I'm attempting to clean up some of the Data that I have from an excel file. The file contains 7400 rows and 18 columns, which includes a list of customers with their respective addresses and other data. The problem that I'm encountering is that some of the cities are misspelled which distorts the information and makes it difficult for further processing.

  SURNAME   | ADDRESS          | CITY
0 Jenson    | 252 Des Chênes   | D.DO
1 Jean      | 236 Gouin        | DOLLARD
2 Denis     | 993 Boul. Gouin  | DOLLARD-DES-ORMEAUX
3 Bradford  | 1690 Dollard #7  | DDO
4 Alisson   | 115 Du Buisson   | IL PERROT
5 Abdul     | 9877 Boul. Gouin | Pierrefonds
6 O'Neil    | 5 Du College     | Ile Bizard
7 Bundy     | 7345 Sherbrooke  | ILLE Perot
8 Darcy     | 8671 Anthony #2  | ILE Perrot
9 Adams     | 845 Georges      | Pierrefonds

In the above example D.DO, DOLLARD, DDO should be spelled DOLLARD-DES-ORMEAUX and IL PERROT, ILLE PEROT, ILE PERROT should be spelled ILE-PERROT.

I've been able to replace the values using:

df["CITY"].replace(to_replace={"D.DO", "DOLLARD", "DDO"}, value="DOLLARD-DES-ORMEAUX", regex=True) 
df["CITY"].replace(to_replace={"IL PERROT", "ILLE PEROT", "ILE PERROT"}, value="ILE-PERROT", regex=True) 

Is there some way of combining the above operations into one? I've tried:

df["CITY"].replace({to_replace={"D.DO", "DOLLARD", "DDO"}, value="DOLLARD-DES-ORMEAUX", to_replace={"IL PERROT", "ILLE PEROT", "ILE PERROT"}, value="ILE-PERROT"}, regex=True) 

but I've had no luck

like image 506
Lukasz Avatar asked Mar 17 '16 22:03

Lukasz


People also ask

How do you replace multiple values in a column in Python?

Pandas replace multiple values in column replace. By using DataFrame. replace() method we will replace multiple values with multiple new strings or text for an individual DataFrame column. This method searches the entire Pandas DataFrame and replaces every specified value.

How do you replace multiple items in python?

Method 3: Replace multiple characters using re.subn() is similar to sub() in all ways, except in its way of providing output. It returns a tuple with a count of the total of replacement and the new string rather than just the string.

How replace values in column based on multiple conditions in pandas?

You can replace values of all or selected columns based on the condition of pandas DataFrame by using DataFrame. loc[ ] property. The loc[] is used to access a group of rows and columns by label(s) or a boolean array. It can access and can also manipulate the values of pandas DataFrame.


2 Answers

try .replace({}, regex=True) method:

replacements = {
   'CITY': {
      r'(D.*DO|DOLLARD.*)': 'DOLLARD-DES-ORMEAUX',
      r'I[lL]*[eE]*.*': 'ILLE Perot'}
}

df.replace(replacements, regex=True, inplace=True)

print(df)

Output:

    SURNAME           ADDRESS                 CITY
0    Jenson    252 Des Chênes  DOLLARD-DES-ORMEAUX
1      Jean         236 Gouin  DOLLARD-DES-ORMEAUX
2     Denis   993 Boul. Gouin  DOLLARD-DES-ORMEAUX
3  Bradford   1690 Dollard #7  DOLLARD-DES-ORMEAUX
4   Alisson    115 Du Buisson           ILLE Perot
5     Abdul  9877 Boul. Gouin          Pierrefonds
6    O'Neil      5 Du College           ILLE Perot
7     Bundy   7345 Sherbrooke           ILLE Perot
8     Darcy   8671 Anthony #2           ILLE Perot
9     Adams       845 Georges          Pierrefonds
like image 172
MaxU - stop WAR against UA Avatar answered Oct 24 '22 10:10

MaxU - stop WAR against UA


You can create a dictionary of replacements and then iterate through them, using 'loc' for replacement.

target_for_values = {
    'DOLLARD-DES-ORMEAUX': ['D.DO', 'DOLLARD', 'DDO'], 
    'ILE-PERROT': ['IL PERROT', 'ILLE PEROT', 'ILE PERROT']}

for k, v in target_for_values.iteritems():
    df.loc[df.CITY.str.upper().isin(v), 'CITY'] = k

>>> df.CITY
                  CITY
0                 C.DO
1  DOLLARD-DES-ORMEAUX
2  DOLLARD-DES-ORMEAUX
3  DOLLARD-DES-ORMEAUX
4           ILE-PERROT
5          Pierrefonds
6           Ile Bizard
7           ILE-PERROT
8           ILE-PERROT
9          Pierrefonds
like image 31
Alexander Avatar answered Oct 24 '22 09:10

Alexander