 

How to split strings inside a list by whitespace characters

Reading from stdin gives me a list of strings, one list element per line of text. How do I split all of them into single words, in one flat list?

mylist = ['this is a string of text \n', 'this is a different string of text \n', 'and for good measure here is another one \n']

wanted output:

newlist = ['this', 'is', 'a', 'string', 'of', 'text', 'this', 'is', 'a', 'different', 'string', 'of', 'text', 'and', 'for', 'good', 'measure', 'here', 'is', 'another', 'one']
iFunction asked May 20 '17 12:05




4 Answers

You can use a simple list comprehension:

newlist = [word for line in mylist for word in line.split()]

This generates:

>>> [word for line in mylist for word in line.split()]
['this', 'is', 'a', 'string', 'of', 'text', 'this', 'is', 'a', 'different', 'string', 'of', 'text', 'and', 'for', 'good', 'measure', 'here', 'is', 'another', 'one']
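
A side note on why no extra cleanup is needed: str.split() called with no arguments splits on any run of whitespace (spaces, tabs, newlines) and discards empty strings, so the trailing ' \n' on each line disappears. A small sketch contrasting that with split(' '), using a made-up sample line that is not from the question:

```python
# Hypothetical sample line with a double space and a trailing newline
line = 'this is  a string of text \n'

# No argument: split on runs of any whitespace and drop empty strings
default_split = line.split()

# Explicit single-space separator: empty strings and the newline are kept
space_split = line.split(' ')

print(default_split)
print(space_split)
```

This is why the comprehension calls line.split() rather than line.split(' ').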
Willem Van Onsem answered Nov 29 '22 03:11


You could just do:

words = " ".join(mylist).split()

This joins the list into one string and then splits it on whitespace. There is no need to remove the \n characters separately: split() with no arguments treats newlines as whitespace, so they are dropped along with the spaces.

Note that str(mylist).split() would not work here, because str() of a list gives its printed representation, brackets and quotes included.

This works the same in Python 2 and Python 3.

SollyBunny answered Nov 29 '22 01:11


Alternatively, you can map the str.split method over every string in the list and then chain the resulting lists together with itertools.chain.from_iterable:

from itertools import chain

mylist = ['this is a string of text \n', 'this is a different string of text \n', 'and for good measure here is another one \n']
result = list(chain.from_iterable(map(str.split, mylist)))
print(result)
# ['this', 'is', 'a', 'string', 'of', 'text', 'this', 'is', 'a', 'different', 'string', 'of', 'text', 'and', 'for', 'good', 'measure', 'here', 'is', 'another', 'one']
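
Since the question mentions stdin, the same idea can be applied lazily to any iterable of lines, such as a file object. A minimal sketch, where iter_words is a hypothetical helper name and io.StringIO stands in for sys.stdin so the example is self-contained:

```python
import io
from itertools import chain


def iter_words(lines):
    """Return a lazy iterator over the words in an iterable of lines.

    Hypothetical helper, not part of the standard library.
    """
    return chain.from_iterable(map(str.split, lines))


# Assumption: io.StringIO simulates sys.stdin for demonstration purposes
fake_stdin = io.StringIO(
    'this is a string of text \n'
    'this is a different string of text \n'
)

words = list(iter_words(fake_stdin))
print(words)
```

In a real script you would presumably pass sys.stdin instead of fake_stdin; nothing is read until the iterator is consumed.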
Georgy answered Nov 29 '22 01:11


Besides the list comprehension answer above, which I vouch for, you could also do it in a for loop:

# Define newlist as an empty list
newlist = []
# Iterate over the items of mylist
for item in mylist:
    # Split the string into a list of words
    item_words = item.split()
    # Extend newlist with all the words from this item
    newlist.extend(item_words)
print(newlist)

At the end, newlist contains every word from every element of mylist.

The list comprehension looks much nicer, though, and you can do a lot more with it. See the Python tutorial for more:

https://docs.python.org/3/tutorial/datastructures.html#list-comprehensions
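
To see that the loop and the comprehension are interchangeable, here is a small check using the sample list from the question. The names loop_result and comp_result are only for illustration:

```python
mylist = ['this is a string of text \n',
          'this is a different string of text \n',
          'and for good measure here is another one \n']

# Explicit loop version
loop_result = []
for item in mylist:
    loop_result.extend(item.split())

# List comprehension version
comp_result = [word for line in mylist for word in line.split()]

# Both approaches should give the same flat list of words
print(loop_result == comp_result)
print(len(loop_result))
```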

Ouss answered Nov 29 '22 01:11