
remove prefix in all column names

Tags:

python

pandas

I would like to remove the prefix from all column names in a dataframe.

I tried writing a helper function and calling it in a for loop:

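def remove_prefix(name, prefix):
    # Return name without the leading prefix, if it has one
    if name.startswith(prefix):
        return name[len(prefix):]
    return name

for x in df.columns:
    x.remove_prefix()  # fails: str objects have no remove_prefix method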
r_me asked Apr 24 '19 12:04



1 Answer

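Use Series.str.replace (it is also available on df.columns) with the regex anchor ^ to match the start of each name. Pass regex=True explicitly, because pandas 2.0 and later treat the pattern as a literal string by default:

import pandas as pd

df = pd.DataFrame(columns=['pre_A', 'pre_B', 'pre_predmet'])
# ^ anchors the match to the start of each column name
df.columns = df.columns.str.replace(r'^pre_', '', regex=True)
print(df)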
Empty DataFrame
Columns: [A, B, predmet]
Index: []
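
To see why the ^ anchor matters, here is a minimal sketch that applies the same pattern with and without the anchor. It uses Python's re module on a plain list so that it runs without pandas; the name 'x_pre_y' is a made-up example that is not in the original DataFrame.

```python
import re

# 'x_pre_y' is a hypothetical column name added for illustration
names = ['pre_A', 'pre_B', 'pre_predmet', 'x_pre_y']

# Anchored: only a 'pre_' at the very start of the name is removed
anchored = [re.sub(r'^pre_', '', n) for n in names]

# Unanchored: every 'pre_' is removed, including one in the middle
unanchored = [re.sub(r'pre_', '', n) for n in names]

print(anchored)    # ['A', 'B', 'predmet', 'x_pre_y']
print(unanchored)  # ['A', 'B', 'predmet', 'x_y']
```

Without the anchor, a name that merely contains the prefix text somewhere in the middle would be altered too, which is usually not what you want.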

Another solution is to use a list comprehension with re.sub:

import re

df.columns = [re.sub('^pre_',"", x) for x in df.columns]
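
If you would rather avoid regular expressions, a helper along the lines of the one in the question also works, as long as it is called as a function on each name rather than as a string method. This is only a sketch: the list below stands in for df.columns, and 'other' is a made-up name that shows unprefixed names are left alone.

```python
def remove_prefix(name, prefix):
    """Return name without the leading prefix, or unchanged if it is absent."""
    if name.startswith(prefix):
        return name[len(prefix):]
    return name

# Stand-in for df.columns; 'other' is a hypothetical unprefixed name
columns = ['pre_A', 'pre_B', 'pre_predmet', 'other']

new_columns = [remove_prefix(c, 'pre_') for c in columns]
print(new_columns)  # ['A', 'B', 'predmet', 'other']
```

With a real DataFrame, the same list comprehension can be assigned back to df.columns. On Python 3.9 or later, the built-in str.removeprefix method does the same job as this helper.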
jezrael answered Sep 19 '22 20:09