
Remove Multiple Blanks In DataFrame

Tags: python, pandas

How do I remove multiple spaces between two words in a string in Python?

For example, I want to turn

"Bertug        Mete"

into

"Bertug Mete"

The input is read from an Excel file. I have tried using split(), but it doesn't work as expected.

import re
import string

import pandas as pd

dataFrame = pd.read_excel("C:\\Users\\Bertug\\Desktop\\example.xlsx")

# names1 = ''.join(dataFrame.Name.to_string().split())
print(type(dataFrame.Name))
# print(dataFrame.Name.str.split())

Let me know what I'm doing wrong.

asked Mar 28 '17 13:03 by Bertug


People also ask

How do you remove extra spaces from a Dataframe in Python?

Use Series.str.strip() to remove leading and trailing whitespace from each string in a Series.

How do you remove multiple spaces from text in Python?

You can collapse multiple spaces in a string into a single space with re.sub(), which comes from Python's re module.
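As a minimal sketch of the re.sub() approach on a plain Python string (the sample value is taken from the question; the variable names are only illustrative):

```python
import re

text = "Bertug        Mete"

# \s+ matches one or more whitespace characters (spaces, tabs, newlines);
# each such run is replaced by a single space.
collapsed = re.sub(r"\s+", " ", text)
print(collapsed)  # Bertug Mete
```

Note that this does not trim leading or trailing whitespace; a leading run would be reduced to one space rather than removed.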

How do you get rid of white space in pandas?

str.lstrip() removes whitespace from the left side of a string, str.rstrip() removes it from the right side, and str.strip() removes it from both sides. These pandas methods share their names with Python's built-in string methods.
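A short sketch of the three strip variants applied to a pandas Series. The sample data is made up for illustration, and none of these methods touch whitespace in the middle of a string:

```python
import pandas as pd

# Hypothetical sample values with padding on different sides.
names = pd.Series(["  Bertug Mete  ", "Joe Black   ", "   a"])

left = names.str.lstrip()   # trims the left side only
right = names.str.rstrip()  # trims the right side only
both = names.str.strip()    # trims both sides

print(both.tolist())  # ['Bertug Mete', 'Joe Black', 'a']
```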


1 Answer

I think you can use replace with a regular expression:

df.Name = df.Name.replace(r'\s+', ' ', regex=True)

Sample:

df = pd.DataFrame({'Name':['Bertug     Mete','a','Joe    Black']})
print (df)
              Name
0  Bertug     Mete
1                a
2     Joe    Black

df.Name = df.Name.replace(r'\s+', ' ', regex=True)
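# similar solution (recent pandas versions need regex=True here,
# because str.replace treats the pattern as a literal string by default)
#df.Name = df.Name.str.replace(r'\s+', ' ', regex=True)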
print (df)
          Name
0  Bertug Mete
1            a
2    Joe Black
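
For comparison, here is a hedged sketch of a split-and-join variant, which is not part of the original answer. The commented-out attempt in the question called to_string() on the whole column and joined with an empty string, which merges every row into one string with no spaces at all. Applying split and join per element through the .str accessor avoids that. The sample frame below, including the missing value, is an assumption made for illustration:

```python
import pandas as pd

# Hypothetical data: extra inner spaces, padding, and one missing value.
df = pd.DataFrame({"Name": ["  Bertug     Mete", "a", "Joe    Black  ", None]})

# str.split() with no pattern splits each value on runs of whitespace and
# drops leading/trailing whitespace; str.join(" ") rebuilds each value
# with single spaces. Missing values stay missing.
df["Name"] = df["Name"].str.split().str.join(" ")

print(df["Name"].tolist())
```

Unlike the regex replacement, this variant also trims leading and trailing whitespace, which may or may not be what you want.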
answered Oct 12 '22 09:10 by jezrael