Reading from stdin gives me a list of strings, where each line of text is one list element. How do I split them all into single words?
mylist = ['this is a string of text \n', 'this is a different string of text \n', 'and for good measure here is another one \n']
Wanted output:
newlist = ['this', 'is', 'a', 'string', 'of', 'text', 'this', 'is', 'a', 'different', 'string', 'of', 'text', 'and', 'for', 'good', 'measure', 'here', 'is', 'another', 'one']
Python's split() method splits a string into a list. You can specify the separator; the default separator is any run of whitespace. Note: when maxsplit is specified, the list will contain at most the specified number of elements plus one.
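As a quick illustration of the default separator versus an explicit one, and of maxsplit, here is a minimal sketch using one of the strings from the question (the variable names are just for this example):

```python
line = 'this is a string of text \n'

# No arguments: split on runs of whitespace; leading/trailing whitespace
# (including the newline) is discarded.
default_words = line.split()
print(default_words)  # ['this', 'is', 'a', 'string', 'of', 'text']

# Explicit ' ' separator: only spaces split, so the newline survives as an element.
by_space = line.split(' ')
print(by_space)  # ['this', 'is', 'a', 'string', 'of', 'text', '\n']

# maxsplit=2: at most two splits, so at most three elements;
# the remainder of the string is kept untouched in the last element.
limited = line.split(None, 2)
print(limited)  # ['this', 'is', 'a string of text \n']
```

For the question above, the no-argument form is the one you want, since it drops the trailing newline for free.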
You can use a simple list comprehension:
newlist = [word for line in mylist for word in line.split()]
This generates:
>>> [word for line in mylist for word in line.split()]
['this', 'is', 'a', 'string', 'of', 'text', 'this', 'is', 'a', 'different', 'string', 'of', 'text', 'and', 'for', 'good', 'measure', 'here', 'is', 'another', 'one']
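Since the question mentions stdin, the same comprehension can be applied to the input stream directly, because iterating over a file-like object yields one line at a time. A small sketch, where split_into_words is a hypothetical helper name and io.StringIO stands in for sys.stdin so the example runs without any piped input:

```python
import io


def split_into_words(lines):
    """Return every whitespace-separated word from an iterable of lines."""
    return [word for line in lines for word in line.split()]


# io.StringIO is a stand-in for sys.stdin (an assumption made for the demo);
# in a real script you would call split_into_words(sys.stdin) instead.
fake_stdin = io.StringIO(
    'this is a string of text \n'
    'this is a different string of text \n'
)

words = split_into_words(fake_stdin)
print(words)
```

The helper accepts any iterable of strings, so it works equally well on mylist from the question.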
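You could also just do:
words = ' '.join(mylist).split()
This joins the list into one string and then splits it on whitespace. Called with no arguments, split() treats the trailing '\n' characters as whitespace too, so there is nothing left to remove afterwards.
Don't use str(mylist).split() for this: it splits the printed representation of the list, so the words come back with brackets, quotes, commas and a literal backslash-n attached. Also note that a list has no replace() method, and that the newline escape is "\n", not "/n".
This works in both Python 2 and Python 3.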
Alternatively, you can map the str.split method over every string in the list and then chain the elements of the resulting lists together with itertools.chain.from_iterable:
from itertools import chain
mylist = ['this is a string of text \n', 'this is a different string of text \n', 'and for good measure here is another one \n']
result = list(chain.from_iterable(map(str.split, mylist)))
print(result)
# ['this', 'is', 'a', 'string', 'of', 'text', 'this', 'is', 'a', 'different', 'string', 'of', 'text', 'and', 'for', 'good', 'measure', 'here', 'is', 'another', 'one']
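One reason to prefer this form: in Python 3 both map and chain.from_iterable are lazy, so you can consume words one at a time without building the full list first. A minimal sketch, assuming Python 3 (the use of islice here is only for illustration):

```python
from itertools import chain, islice

mylist = ['this is a string of text \n',
          'this is a different string of text \n',
          'and for good measure here is another one \n']

# Nothing is split yet; word_iter produces words on demand.
word_iter = chain.from_iterable(map(str.split, mylist))

# Take just the first three words.
first_three = list(islice(word_iter, 3))
print(first_three)  # ['this', 'is', 'a']

# The iterator remembers its position, so the rest can be consumed later.
remaining = list(word_iter)
print(len(remaining))  # 18
```

If you need the whole list anyway, wrapping the chain in list() as shown above is the simpler choice.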
Besides the list comprehension answer above, which I vouch for, you could also do it with a for loop:
# Define newlist as an empty list
newlist = []
# Iterate over the items of mylist
for item in mylist:
    # Split the string into a list of words
    item_words = item.split()
    # Extend newlist with all of those words
    newlist.extend(item_words)
print(newlist)
After the loop, newlist contains every word from every element of mylist.
That said, the list comprehension is more concise, and comprehensions can do a lot more. See the Python tutorial for details:
https://docs.python.org/3/tutorial/datastructures.html#list-comprehensions