I have an input file that I need to process, discarding all whitespace, including the non-breaking space U+00A0 (aka &nbsp;) and any other form of whitespace. (You can produce U+00A0 in Notepad by holding Alt and typing 0 1 6 0 on the keyboard's numeric pad.) I have tried String.trim(), but it doesn't trim U+00A0.
Do I need to explicitly check for U+00A0 and then trim(), or is there an easy way to trim all kinds of whitespace in Java?
The trim() method normally trims chars in the range 0x00-0x20, so a regex that does the same job only needs one additional character, U+00A0, added to its character class. You could also make a (probably) faster version by taking the source code for the trim() method and modifying it to trim U+00A0 as well as the usual range.
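As a rough sketch of both ideas, the class below shows a regex whose character class covers 0x00-0x20 plus U+00A0, and a hand-written loop modeled on the behavior of trim(). The names NbspTrim, trimRegex, trimLoop and isTrimmable are made up for illustration, and the loop is not the JDK's actual source.

```java
class NbspTrim {
    // Regex variant: remove a leading or trailing run of chars that are
    // either in the range 0x00-0x20 or the non-breaking space U+00A0.
    static String trimRegex(String s) {
        return s.replaceAll("^[\\x00-\\x20\\u00A0]+|[\\x00-\\x20\\u00A0]+$", "");
    }

    // Loop variant: same rule as above, without the regex engine.
    static String trimLoop(String s) {
        int start = 0;
        int end = s.length();
        while (start < end && isTrimmable(s.charAt(start))) {
            start++;
        }
        while (end > start && isTrimmable(s.charAt(end - 1))) {
            end--;
        }
        return s.substring(start, end);
    }

    // Assumption: "trimmable" means anything trim() would remove, plus U+00A0.
    private static boolean isTrimmable(char c) {
        return c <= ' ' || c == '\u00A0';
    }

    public static void main(String[] args) {
        String raw = "\u00A0\t hello\u00A0world \u00A0\n";
        System.out.println("[" + trimRegex(raw) + "]");
        System.out.println("[" + trimLoop(raw) + "]");
    }
}
```

Both variants leave a U+00A0 in the middle of the string alone and only touch the two ends.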
In Excel, a non-breaking space has character code 160 (U+00A0, which lies outside the 7-bit ASCII range of 0-127), so you can produce it with the CHAR(160) formula. The SUBSTITUTE function is used to turn non-breaking spaces into regular spaces. Finally, you embed the SUBSTITUTE call in the TRIM function to remove the converted spaces.
To remove leading and trailing spaces in Java, use the trim() method. This method returns a copy of this string with leading and trailing white space removed, or this string if it has no leading or trailing white space.
According to the documentation for Character.isWhitespace, a character is a Java whitespace character if and only if it satisfies one of several criteria, including: It is a Unicode space character (SPACE_SEPARATOR, LINE_SEPARATOR, or PARAGRAPH_SEPARATOR) but is not also a non-breaking space ('\u00A0', '\u2007', '\u202F'). It is '\t', U+0009 HORIZONTAL TABULATION.
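Because the definition above deliberately excludes the non-breaking spaces, one possible workaround is to combine Character.isWhitespace with Character.isSpaceChar, which does accept them. The sketch below assumes that is the behavior you want. WhitespaceCheck, isAnySpace and trimAll are hypothetical names, not JDK methods.

```java
class WhitespaceCheck {
    // True for Java whitespace (tab, newline, ...) and for any Unicode
    // space separator, including the non-breaking ones.
    static boolean isAnySpace(int codePoint) {
        return Character.isWhitespace(codePoint) || Character.isSpaceChar(codePoint);
    }

    // Trim both ends using isAnySpace; interior characters are untouched.
    static String trimAll(String s) {
        int start = 0;
        int end = s.length();
        while (start < end) {
            int cp = s.codePointAt(start);
            if (!isAnySpace(cp)) {
                break;
            }
            start += Character.charCount(cp);
        }
        while (end > start) {
            int cp = s.codePointBefore(end);
            if (!isAnySpace(cp)) {
                break;
            }
            end -= Character.charCount(cp);
        }
        return s.substring(start, end);
    }

    public static void main(String[] args) {
        System.out.println(Character.isWhitespace('\u00A0')); // false
        System.out.println(Character.isSpaceChar('\u00A0'));  // true
        System.out.println("[" + trimAll("\u00A0\u2007 a\u00A0b \u202F\t") + "]");
    }
}
```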
While &nbsp; is a non-breaking space (a space that is not meant to be treated as whitespace), you can trim a string while preserving every &nbsp; within the string with a simple regex:
string.replaceAll("(^\\h*)|(\\h*$)","")
\h is a horizontal whitespace character: [ \t\xA0\u1680\u180e\u2000-\u200a\u202f\u205f\u3000]
If you are using a pre-JDK8 version, you need to explicitly use the list of chars instead of \h.
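For illustration, here is the regex above wrapped in a small helper, next to a fallback that spells out the character list for older JDKs. The class name HorizontalTrim and the method names are invented for this sketch.

```java
class HorizontalTrim {
    // Java 8+: \h matches a horizontal whitespace character, U+00A0 included.
    static String trimH(String s) {
        return s.replaceAll("(^\\h*)|(\\h*$)", "");
    }

    // The explicit character list given above for \h, for pre-JDK8 use.
    static final String H =
        "[ \\t\\xA0\\u1680\\u180e\\u2000-\\u200a\\u202f\\u205f\\u3000]";

    // Same trim as trimH, built from the explicit list instead of \h.
    static String trimExplicit(String s) {
        return s.replaceAll("(^" + H + "*)|(" + H + "*$)", "");
    }

    public static void main(String[] args) {
        String raw = "\u00A0\t x\u00A0y \u3000";
        System.out.println("[" + trimH(raw) + "]");
        System.out.println("[" + trimExplicit(raw) + "]");
    }
}
```

Note that \h covers horizontal whitespace only, so leading or trailing line breaks are left in place by either method.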