The JDK's String.trim() method is pretty naive: it only removes leading and trailing characters at or below U+0020, i.e. the ASCII space and the ASCII control characters.
Apache Commons' StringUtils.strip() is slightly better, but uses the JDK's Character.isWhitespace(), which doesn't recognize non-breaking space as whitespace.
So what would be the most complete, Unicode-compatible, safe and proper way to trim a string in Java?
And incidentally, is there a better library than commons-lang that I should be using for this sort of stuff?
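To remove all whitespace from a string (not just at the ends), use the String.replaceAll() method with a regular expression, e.g. str.replaceAll("\\s", ""). It replaces every match with an empty string. Note that str.replace(/\s/g, '') is JavaScript syntax and does not compile in Java. Also note that in java.util.regex, \s matches only ASCII whitespace by default; with the (?U) flag (Pattern.UNICODE_CHARACTER_CLASS) it matches the Unicode White_Space property, which includes the non-breaking space.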
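In Java, the regex-based removal could look like the following. This is a minimal sketch using only the JDK; the class name WhitespaceRemover and its method names are made up for illustration.

```java
import java.util.regex.Pattern;

final class WhitespaceRemover {
    // With UNICODE_CHARACTER_CLASS, \s follows the Unicode White_Space
    // property instead of the default ASCII-only set.
    private static final Pattern UNICODE_WS =
            Pattern.compile("\\s+", Pattern.UNICODE_CHARACTER_CLASS);

    /** Removes ASCII whitespace only (space, tab, newline, etc.). */
    static String removeAsciiWhitespace(String s) {
        return s.replaceAll("\\s+", "");
    }

    /** Removes every character with the Unicode White_Space property. */
    static String removeUnicodeWhitespace(String s) {
        return UNICODE_WS.matcher(s).replaceAll("");
    }

    public static void main(String[] args) {
        String input = " a\tb\u00A0c ";
        // The non-breaking space survives the ASCII-only version.
        System.out.println(removeAsciiWhitespace(input).length());   // 4
        System.out.println(removeUnicodeWhitespace(input).length()); // 3
    }
}
```

Keep in mind that this removes whitespace everywhere in the string, which is a different operation from trimming the ends.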
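The trim() method of java.lang.String is a built-in method that removes leading and trailing whitespace. Here "whitespace" means any character whose code point is less than or equal to '\u0020' (the space character), so ASCII control characters such as tab and newline are removed as well, but characters above that value, such as the non-breaking space '\u00A0', are not. The method scans inward from both ends of the string and returns the substring between the first and last characters above '\u0020'.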
To remove leading and trailing spaces in Java, use the trim() method. This method returns a copy of this string with leading and trailing white space removed, or this string if it has no leading or trailing white space.
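The limits of trim() are easy to check directly. The sketch below uses only the JDK; TrimDemo and trimStrips are hypothetical names chosen for the example.

```java
final class TrimDemo {
    /** True if String.trim() strips the given padding from both ends of "text". */
    static boolean trimStrips(String padding) {
        String padded = padding + "text" + padding;
        return padded.trim().equals("text");
    }

    public static void main(String[] args) {
        // Space, tab and newline are all at or below U+0020, so trim() removes them.
        System.out.println(trimStrips(" \t\n"));                // true
        // The non-breaking space (U+00A0) is above U+0020 and is left in place.
        System.out.println(trimStrips("\u00A0"));               // false
        // Character.isWhitespace() explicitly excludes the non-breaking space,
        // while Character.isSpaceChar() accepts it.
        System.out.println(Character.isWhitespace('\u00A0'));   // false
        System.out.println(Character.isSpaceChar('\u00A0'));    // true
    }
}
```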
Google has made guava-libraries available recently. It may have what you are looking for:
CharMatcher.inRange('\0', ' ').trimFrom(str)
is equivalent to String.trim(), but you can customize what to trim; refer to the Javadoc for details.
For instance, it has its own definition of WHITESPACE which differs from the JDK and is defined according to the latest Unicode standard, so what you need can be written as:
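CharMatcher.whitespace().trimFrom(str)
(Older Guava releases spell this as the constant CharMatcher.WHITESPACE; newer releases expose it as the static method CharMatcher.whitespace().)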
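If adding Guava is not an option, a comparable trim can be sketched with the JDK alone. The version below is an approximation, not Guava's implementation: it assumes that "whitespace" means any code point accepted by either Character.isWhitespace() or Character.isSpaceChar(), a set that is close to but not guaranteed identical to Guava's. The names UnicodeTrim, isBlank and trim are hypothetical.

```java
final class UnicodeTrim {
    // Assumption: treat a code point as trimmable if either JDK predicate
    // accepts it. isSpaceChar() covers the non-breaking spaces that
    // isWhitespace() deliberately excludes.
    static boolean isBlank(int codePoint) {
        return Character.isWhitespace(codePoint) || Character.isSpaceChar(codePoint);
    }

    /** Returns s without leading and trailing trimmable code points. */
    static String trim(String s) {
        int start = 0;
        int end = s.length();
        // Advance past leading whitespace, one code point at a time.
        while (start < end) {
            int cp = s.codePointAt(start);
            if (!isBlank(cp)) break;
            start += Character.charCount(cp);
        }
        // Walk back over trailing whitespace.
        while (end > start) {
            int cp = s.codePointBefore(end);
            if (!isBlank(cp)) break;
            end -= Character.charCount(cp);
        }
        return s.substring(start, end);
    }

    public static void main(String[] args) {
        String input = "\u00A0\u2003 hello world\t\u3000";
        System.out.println("[" + trim(input) + "]"); // [hello world]
    }
}
```

Whitespace inside the string is left untouched, so trim(" a\u00A0b ") keeps the inner non-breaking space.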