
Replace column values using regex in pandas data frame

I have a column in a pandas DataFrame, shown below. The column name is ABC.

ABC
Fuel
FUEL
Fuel_12_ab
Fuel_1
Lube
Lube_1
Lube_12_a
cat_Lube

Now I want to replace the values in this column using a regex so that the result looks like this:

ABC
Fuel
FUEL
Fuel
Fuel
Lube
Lube
Lube
cat_Lube

How can I do this type of string matching in a pandas DataFrame?

User12345 asked Oct 30 '17 21:10


2 Answers

In [63]: df.ABC.str.replace(r'_\d+.*', '', regex=True)
Out[63]:
0        Fuel
1        FUEL
2        Fuel
3        Fuel
4        Lube
5        Lube
6        Lube
7    cat_Lube
Name: ABC, dtype: object
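
For anyone who wants to run this end to end, here is a minimal sketch. The DataFrame construction is an assumption based on the sample values in the question, and `cleaned` and `result` are illustrative names. The pattern `_\d+.*` matches an underscore followed by one or more digits, plus everything after it, so values without `_<digits>` (such as `cat_Lube`) are left alone.

```python
import pandas as pd

# Sample data copied from the question (construction is assumed).
df = pd.DataFrame({"ABC": ["Fuel", "FUEL", "Fuel_12_ab", "Fuel_1",
                           "Lube", "Lube_1", "Lube_12_a", "cat_Lube"]})

# _\d+ : an underscore followed by one or more digits
# .*   : everything after that, up to the end of the string
# regex=True is passed explicitly because pandas 2.0+ treats the
# pattern as a literal string by default.
cleaned = df["ABC"].str.replace(r"_\d+.*", "", regex=True)

result = cleaned.tolist()
print(result)
```

To keep the change, assign the result back, for example `df["ABC"] = cleaned`.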
MaxU - stop WAR against UA answered Sep 20 '22 10:09


Use a positive lookbehind for lube or fuel, ignoring case, then split on the underscore that follows and keep the first part.

import re
import pandas as pd

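# Match an underscore only when it directly follows "lube" or "fuel" (any case).
pat = re.compile(r'(?<=lube|fuel)_', re.IGNORECASE)

# Split once at that underscore and keep everything before it.
df.assign(ABC=[pat.split(x, maxsplit=1)[0] for x in df.ABC])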

        ABC
0      Fuel
1      FUEL
2      Fuel
3      Fuel
4      Lube
5      Lube
6      Lube
7  cat_Lube
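
The same lookbehind idea can probably be written with `Series.str.replace` instead of a list comprehension. The sketch below is one possible variant, not the answer's own code: the DataFrame construction is assumed from the question, and `Fuel_ab` is a hypothetical value, not in the question, added only to show where the lookbehind pattern and the digit-based pattern `_\d+.*` disagree.

```python
import re
import pandas as pd

# Sample data copied from the question (construction is assumed).
values = ["Fuel", "FUEL", "Fuel_12_ab", "Fuel_1",
          "Lube", "Lube_1", "Lube_12_a", "cat_Lube"]
df = pd.DataFrame({"ABC": values})

# Remove the underscore that directly follows "lube"/"fuel" (any case)
# together with everything after it.
prefix_pat = r"(?<=lube|fuel)_.*"
lookbehind_result = df["ABC"].str.replace(
    prefix_pat, "", flags=re.IGNORECASE, regex=True
).tolist()

# Hypothetical extra value (not in the question): a suffix without digits.
extra = pd.Series(["Fuel_ab"])
by_digits = extra.str.replace(r"_\d+.*", "", regex=True).tolist()
by_prefix = extra.str.replace(
    prefix_pat, "", flags=re.IGNORECASE, regex=True
).tolist()

print(lookbehind_result)
print(by_digits, by_prefix)
```

On the question's data both patterns give the same column. They differ on suffixes that do not start with digits: the digit-based pattern leaves `Fuel_ab` unchanged, while the lookbehind pattern strips it to `Fuel`. Which behavior is wanted depends on the real data.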
piRSquared answered Sep 18 '22 10:09