
python split a string with at least 2 whitespaces

I would like to split a string only where there are two or more consecutive whitespace characters.

For example:

str = '10DEUTSCH        GGS Neue Heide 25-27     Wahn-Heide   -1      -1'
print(str.split())

Results:

['10DEUTSCH', 'GGS', 'Neue', 'Heide', '25-27', 'Wahn-Heide', '-1', '-1'] 

I would like it to look like this:

['10DEUTSCH', 'GGS Neue Heide 25-27', 'Wahn-Heide', '-1', '-1'] 
Eagle asked Oct 12 '12 20:10



2 Answers

In [4]: import re
In [5]: text = '10DEUTSCH        GGS Neue Heide 25-27     Wahn-Heide   -1      -1'
In [7]: re.split(r'\s{2,}', text)
Out[7]: ['10DEUTSCH', 'GGS Neue Heide 25-27', 'Wahn-Heide', '-1', '-1']
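For use outside an interactive session, the same pattern can be wrapped in a small function. This is only a sketch, assuming Python 3; the name split_on_gaps is hypothetical and not part of the original answer. It strips the input first, since re.split would otherwise return empty strings for leading or trailing whitespace.

```python
import re

# Hypothetical helper name, chosen for illustration only.
def split_on_gaps(text):
    """Split text wherever two or more whitespace characters occur in a row."""
    # strip() first so leading/trailing whitespace does not produce empty strings
    return re.split(r'\s{2,}', text.strip())

text = '10DEUTSCH        GGS Neue Heide 25-27     Wahn-Heide   -1      -1'
print(split_on_gaps(text))
# ['10DEUTSCH', 'GGS Neue Heide 25-27', 'Wahn-Heide', '-1', '-1']
```

Note that \s also matches tabs and newlines; if only literal spaces should count as a gap, the pattern r' {2,}' is a possible alternative.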

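Update (2021+): in pandas, Series.str.split accepts a regular expression as the separator. (Python's built-in str.split does not; it only takes a literal separator string.)

See the pandas documentation for Series.str.split for details.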

import pandas as pd

row = '10DEUTSCH        GGS Neue Heide 25-27     Wahn-Heide   -1      -1'
df = pd.DataFrame({'string': row}, index=[0])

print(df)
                                              string
0  10DEUTSCH        GGS Neue Heide 25-27     Wahn...

df1 = df['string'].str.split(r'\s{2,}', expand=True)
print(df1)
           0                     1           2   3   4
0  10DEUTSCH  GGS Neue Heide 25-27  Wahn-Heide  -1  -1
unutbu answered Oct 10 '22 06:10


As has been pointed out, str is not a good name for your string because it shadows the built-in str type, so using words instead:

output = [s.strip() for s in words.split('  ') if s] 

The .split('  ') call, with two spaces as the separator, returns a list that includes empty strings and items with leading or trailing whitespace. The list comprehension iterates through that list, keeps the non-empty items (if s), and .strip() removes any leading or trailing whitespace.
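To make the two steps visible, here is a short sketch that prints the intermediate list before filtering. It assumes Python 3 and reuses the sample string from the question; the variable name pieces is introduced here for illustration.

```python
words = '10DEUTSCH        GGS Neue Heide 25-27     Wahn-Heide   -1      -1'

# Step 1: split on exactly two spaces. Longer runs of spaces leave
# empty strings and items padded with a leftover single space.
pieces = words.split('  ')
print(pieces)

# Step 2: drop the empty strings, then strip the leftover padding.
output = [s.strip() for s in pieces if s]
print(output)
# ['10DEUTSCH', 'GGS Neue Heide 25-27', 'Wahn-Heide', '-1', '-1']
```

One caveat: if the string ends in an odd run of three or more spaces, the last piece is a single space, which passes the if s check and becomes an empty string after stripping. Filtering on s.strip() instead of s would avoid that.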

toxotes answered Oct 10 '22 05:10