I have a test.txt file containing several lines, for example:
"h3llo, @my name is, bob! (how are you?)"
"i am fine@@@@@"
I want to split the text on every non-alphabetic character (including the newline) and collect the words in an ArrayList, so the output would be
output = ["h", "llo", "my", "name", "is", "bob", "how", "are", "you", "i", "am", "fine"]
So far, I have tried splitting my text with
output.split("\\P{Alpha}+")
But for some reason this puts an empty string in the first spot of the ArrayList, and leaves another empty string where the newline was:
output = ["", "h", "llo", "my", "name", "is", "bob", "how", "are", "you", "", "i", "am", "fine"]
Is there another way to fix this? Thank you!
--
EDIT: How can I make sure it ignores the new line?
In MATLAB, you can split a string at a newline character as follows. Create a string in which two lines of text are separated by \n. Because the literal \n only represents a newline character, convert it to an actual newline using the compose function. Then use splitlines to split the string at the newline character.
split("\\s+") splits the string into an array of strings, using one or more whitespace characters as the separator. \s+ is a regular expression that matches one or more whitespace characters (spaces, tabs, newlines).
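As a minimal sketch of the whitespace split described above (the class and method names here are made up for illustration), the following trims the input first so that leading whitespace does not produce an empty first element:

```java
import java.util.Arrays;

public class WhitespaceSplitDemo {
    // Hypothetical helper: splits on runs of whitespace (spaces, tabs, newlines).
    static String[] splitOnWhitespace(String text) {
        // trim() removes leading/trailing whitespace, so the result
        // does not start with an empty string.
        return text.trim().split("\\s+");
    }

    public static void main(String[] args) {
        String[] parts = splitOnWhitespace("  i  am\tfine\nthanks ");
        System.out.println(Arrays.toString(parts)); // prints [i, am, fine, thanks]
    }
}
```

Note that this only handles whitespace; punctuation such as "@" or "," stays attached to the words.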
Java's String.split() behavior is pretty confusing. A much better splitting utility is Guava's Splitter. Their documentation goes into more detail about the problems with String.split():
The built in Java utilities for splitting strings can have some quirky behaviors. For example, String.split silently discards trailing separators, and StringTokenizer respects exactly five whitespace characters and nothing else.
Quiz: ",a,,b,".split(",") returns...
- "", "a", "", "b", ""
- null, "a", null, "b", null
- "a", null, "b"
- "a", "b"
- None of the above
The correct answer is none of the above: "", "a", "", "b". Only trailing empty strings are skipped. What is this I don't even.
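The quiz result is easy to check for yourself. Here is a small sketch (class and method names are invented for this example) that also shows the two-argument split with a negative limit, which keeps the trailing empty strings:

```java
import java.util.Arrays;

public class SplitQuizDemo {
    // Default behavior: trailing empty strings are dropped, leading ones are kept.
    static String[] defaultSplit(String s) {
        return s.split(",");
    }

    // A negative limit tells split() to keep trailing empty strings as well.
    static String[] keepTrailing(String s) {
        return s.split(",", -1);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(defaultSplit(",a,,b,"))); // prints [, a, , b]
        System.out.println(Arrays.toString(keepTrailing(",a,,b,"))); // prints [, a, , b, ]
    }
}
```

This is the same effect as in the question: the input starts with a non-letter character, so the first element of the result is an empty string.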
In your case this should work:
Splitter.onPattern("\\P{Alpha}+").omitEmptyStrings().splitToList(output);
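If you would rather not add Guava, one alternative (my own sketch, not part of the Guava answer above) is to match the runs of letters directly instead of splitting on everything else. The class and method names are hypothetical. Since nothing is split, no empty strings can appear, and the newline is skipped like any other non-letter:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AlphaTokens {
    // \p{Alpha} matches ASCII letters by default, so digits, punctuation,
    // quotes and newlines all act as separators.
    private static final Pattern LETTERS = Pattern.compile("\\p{Alpha}+");

    // Hypothetical helper: collects every run of letters in the text.
    static List<String> alphaTokens(String text) {
        List<String> tokens = new ArrayList<>();
        Matcher m = LETTERS.matcher(text);
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        String text = "\"h3llo, @my name is, bob! (how are you?)\"\n\"i am fine@@@@@\"";
        System.out.println(alphaTokens(text));
        // prints [h, llo, my, name, is, bob, how, are, you, i, am, fine]
    }
}
```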