I have created a sparse-matrix DataFrame that takes the values in a list and uses them as column headers. A number of the column headers start with "000 ", for example "000 bank". I want to remove the "000 " so that the header is just "bank".
000 bank 000 claim 000 confirmed 000 debit 000 delete 000 frequent 000 hashed ...
0 0.000000 0.0 0.0 0.0 0.0 0.0 0.00000 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
1 0.052024 0.0 0.0 0.0 0.0 0.0 0.00000 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 kddi
2 0.000000 0.0 0.0 0.0 0.0 0.0 0.00000 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 e
3 0.000000 0.0 0.0 0.0 0.0 0.0 0.00000 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 2
Index(['000', '000 000', '000 3rd', '000 bank', '000 claim', '000 confirmed',
'000 debit', '000 delete', '000 frequent', '000 hashed',
...
'years multiple', 'yet', 'yet confirm', 'yet evidence', 'yet expired',
'yet many', 'yet published', 'zarefarid', 'zarefarid wrote', 'Keyword'],
dtype='object', length=3831)
How can I get rid of the '000 '? Not all column headers contain it, as you can see in the index above.
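Use Index.str.replace with ^ to anchor the pattern at the start of the string. Pass regex=True explicitly, because since pandas 2.0 the pattern is treated as a literal string by default:
df.columns = df.columns.str.replace('^000 ', '', regex=True)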
Sample:
import pandas as pd

df = pd.DataFrame(columns=['000', '000 000', '000 3rd', '000 bank',
                           '000 claim', '000 confirmed'])
print (df)
Empty DataFrame
Columns: [000, 000 000, 000 3rd, 000 bank, 000 claim, 000 confirmed]
Index: []
df.columns = df.columns.str.replace('^000 ', '', regex=True)
print (df)
Empty DataFrame
Columns: [000, 000, 3rd, bank, claim, confirmed]
Index: []
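Note that the output above now contains the header 000 twice, because '000 000' collapses to '000'. The following is a minimal sketch of how one might apply the same prefix removal and check for such collisions. The sample frame and the variable names (cleaned, renamed, duplicates) are illustrative assumptions, not part of the original question; the rename variant with re.sub is simply an alternative that does not depend on the regex default of str.replace.

```python
import re

import pandas as pd

# Hypothetical sample frame reusing a few headers from the question.
df = pd.DataFrame(columns=['000', '000 000', '000 3rd', '000 bank', 'yet', 'Keyword'])

# Strip a leading '000 ' only; '^' anchors the match at the start of each label.
# regex=True is passed explicitly so the pattern is not taken literally.
cleaned = df.columns.str.replace(r'^000 ', '', regex=True)

# Alternative: rename with re.sub, which always treats the pattern as a regex.
renamed = df.rename(columns=lambda name: re.sub(r'^000 ', '', name))

# Removing the prefix can produce duplicate labels ('000 000' becomes '000').
duplicates = cleaned[cleaned.duplicated()].tolist()

print(list(cleaned))
print(duplicates)
```

If duplicates is not empty, the affected columns may need to be merged or renamed before selecting them by label, since df['000'] would then return more than one column.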