I have a pandas DataFrame whose column names follow the pattern "string_string". I'm trying to rename them by removing the "_" and everything after it. For example, I want to change "12527_AC9E5" to "12527". I've tried various replace options, and I can replace a specific part of the string (e.g., I can replace all the "_"), but when I introduce wildcards I don't get the desired result.
Below are some of the things I thought would work but don't. If I remove the wildcards they work (i.e., they replace the "_").
df = df.rename(columns=lambda x: x.sub('_.+', ''))
df.columns = df.columns.str.replace('_.+','')
Any help is appreciated.
Just split each column name on '_' and take the first element. You can do this for all columns at once with a dictionary comprehension:
df = df.rename(columns={col: col.split('_')[0] for col in df.columns})
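For completeness, here is a minimal sketch comparing the split approach with regex-based alternatives. The sample DataFrame and the variable names are made up for illustration. The first attempt in the question fails because Python strings have no .sub method (that function lives in the re module). The second most likely fails because recent pandas versions treat the pattern passed to str.replace as a literal string unless regex=True is given, so it is safest to pass that argument explicitly.

```python
import re

import pandas as pd

# Hypothetical sample data; only the column names matter here.
df = pd.DataFrame(
    [[1, 2, 3]],
    columns=["12527_AC9E5", "99881_B7", "plain"],
)

# Option 1: split on the first underscore and keep the part before it.
# Names without an underscore are returned unchanged.
split_cols = [col.split("_", 1)[0] for col in df.columns]

# Option 2: re.sub from the standard library; str objects have no .sub method.
regex_cols = [re.sub(r"_.*$", "", col) for col in df.columns]

# Option 3: vectorized replace on the column Index, with regex=True stated
# explicitly so the pattern is not interpreted as a literal string.
vectorized_cols = df.columns.str.replace(r"_.*$", "", regex=True)

df.columns = split_cols
print(list(df.columns))
```

All three options give the same names for this sample. The split version is probably the easiest to read. The regex versions are more useful if the rule for what to strip gets more complicated than a single separator.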