I am trying to read the contents of a text file, remove everything other than letters, and then convert the result into an array of Strings so that I can process the words individually.
I read and clean the text file like this:
String temp1 = IOUtils.toString(FIS, "UTF-8");
String temp2 = temp1.replaceAll("[,.!;:\\r\\n]", " ");
And then to tokenize the string, I do this:
String[] tempStringArray = temp2.split(" ");
The problem is that the resulting array contains empty Strings at various indices. They appear wherever the text file had a line break, more than one consecutive whitespace character, a replaced punctuation mark, and so on.
I want these empty Strings removed from the array, or better, prevented from entering it in the first place.
How can this be done?
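Split on runs of whitespace instead of on a single space: String[] tempStringArray = temp2.trim().split("\\s+"); The trim() call matters because split("\\s+") still returns an empty first element when the string starts with whitespace.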
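To see the whole thing end to end, here is a minimal sketch. The class name Tokenizer, the helper tokenize, and the sample string are made up for illustration and are not from the question. It also assumes you really do want to drop everything that is not a letter, so it uses the character class [^\\p{L}\\s] rather than the shorter punctuation list in the question.

```java
import java.util.Arrays;

public class Tokenizer {
    // Hypothetical helper: returns the words of `text` with no empty entries.
    static String[] tokenize(String text) {
        // Turn every character that is neither a letter nor whitespace into a space,
        // then trim so the string cannot start or end with whitespace.
        String cleaned = text.replaceAll("[^\\p{L}\\s]", " ").trim();
        // "".split(...) would return one empty string, so handle that case explicitly.
        if (cleaned.isEmpty()) {
            return new String[0];
        }
        // Split on runs of one or more whitespace characters.
        return cleaned.split("\\s+");
    }

    public static void main(String[] args) {
        String sample = "  Hello, world!\r\nThis   is; a test.  ";
        // prints [Hello, world, This, is, a, test]
        System.out.println(Arrays.toString(tokenize(sample)));
    }
}
```

The isEmpty check is there because a file containing only punctuation or whitespace would otherwise give you an array holding a single empty String.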