I am trying to check that a Java String is not null, not empty, and not whitespace.
In my mind, this code should have been up to the job.
public static boolean isEmpty(String s) {
    if ((s != null) && (s.trim().length() > 0))
        return false;
    else
        return true;
}
As per the documentation, String.trim() should work thus:
Returns a copy of the string, with leading and trailing whitespace omitted. If this String object represents an empty character sequence, or the first and last characters of character sequence represented by this String object both have codes greater than '\u0020' (the space character), then a reference to this String object is returned.
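To make the quoted contract concrete, here is a small sketch of what trim() strips and when it hands back the same object. The class and method names (TrimDemo, returnsSameReference) are made up for illustration and are not part of any library.

```java
class TrimDemo {
    // True when trim() returns the very same String object it was called on.
    static boolean returnsSameReference(String s) {
        return s.trim() == s;
    }

    public static void main(String[] args) {
        // Tab, newline, space and carriage return are all at or below U+0020,
        // so trim() strips them from both ends.
        System.out.println("[" + "\t\n abc \r".trim() + "]");

        // Nothing to strip here: per the documentation, the same reference comes back.
        System.out.println(returnsSameReference("abc"));

        // U+00A0 (no-break space) has a code above U+0020, so trim() leaves it alone.
        System.out.println("\u00A0".trim().length());
    }
}
```

The point to notice is that trim() decides purely by code value (at or below U+0020), not by any Unicode notion of whitespace.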
However, apache/commons/lang/StringUtils.java does it a little differently.
public static boolean isBlank(String str) {
    int strLen;
    if (str == null || (strLen = str.length()) == 0) {
        return true;
    }
    for (int i = 0; i < strLen; i++) {
        if ((Character.isWhitespace(str.charAt(i)) == false)) {
            return false;
        }
    }
    return true;
}
As per the documentation, Character.isWhitespace():
Determines if the specified character is white space according to Java. A character is a Java whitespace character if and only if it satisfies one of the following criteria:
- It is a Unicode space character (SPACE_SEPARATOR, LINE_SEPARATOR, or PARAGRAPH_SEPARATOR) but is not also a non-breaking space ('\u00A0', '\u2007', '\u202F').
- It is '\t', U+0009 HORIZONTAL TABULATION.
- It is '\n', U+000A LINE FEED.
- It is '\u000B', U+000B VERTICAL TABULATION.
- It is '\f', U+000C FORM FEED.
- It is '\r', U+000D CARRIAGE RETURN.
- It is '\u001C', U+001C FILE SEPARATOR.
- It is '\u001D', U+001D GROUP SEPARATOR.
- It is '\u001E', U+001E RECORD SEPARATOR.
- It is '\u001F', U+001F UNIT SEPARATOR.
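The criteria above can be probed directly. This is only a sketch: the class WhitespaceDemo and its describe helper are hypothetical names, and the sample characters are picked to hit a few of the listed cases plus two that fall outside them.

```java
class WhitespaceDemo {
    // Formats a character's code point together with what Character.isWhitespace says about it.
    static String describe(char c) {
        return String.format("U+%04X whitespace=%b", (int) c, Character.isWhitespace(c));
    }

    public static void main(String[] args) {
        char[] samples = {
            ' ',       // SPACE, a Unicode space separator
            '\t',      // HORIZONTAL TABULATION, listed explicitly
            '\u001C',  // FILE SEPARATOR, listed explicitly
            '\u2008',  // PUNCTUATION SPACE, a space separator that is not non-breaking
            '\u00A0',  // NO-BREAK SPACE, excluded by the non-breaking rule
            '\u0002'   // START OF TEXT, a control character that is not on the list
        };
        for (char c : samples) {
            System.out.println(describe(c));
        }
    }
}
```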
If I am not mistaken (or maybe I am just not reading it correctly), String.trim() should take away any of the characters that are being checked by Character.isWhitespace(). All of them seem to be above '\u0020'.
In this case, the simpler isEmpty function seems to cover all the scenarios that the lengthier isBlank covers.
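- Is there a string that will make isEmpty and isBlank behave differently in a test case?
- Assuming there are none, is there any other consideration because of which I should choose isBlank and not use isEmpty?
For those interested in actually running a test, here are the methods and unit tests.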
public class StringUtil {

    public static boolean isEmpty(String s) {
        if ((s != null) && (s.trim().length() > 0))
            return false;
        else
            return true;
    }

    public static boolean isBlank(String str) {
        int strLen;
        if (str == null || (strLen = str.length()) == 0) {
            return true;
        }
        for (int i = 0; i < strLen; i++) {
            if ((Character.isWhitespace(str.charAt(i)) == false)) {
                return false;
            }
        }
        return true;
    }
}
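And the unit tests:
// Assumes JUnit 4; the wrapping class name is illustrative.
import static org.junit.Assert.assertFalse;
import static org.junit.Assert.assertTrue;

import org.junit.Test;

public class StringUtilTest {

    @Test
    public void test() {
        // null counts as empty and blank
        String s = null;
        assertTrue(StringUtil.isEmpty(s));
        assertTrue(StringUtil.isBlank(s));

        // zero-length string
        s = "";
        assertTrue(StringUtil.isEmpty(s));
        assertTrue(StringUtil.isBlank(s));

        // whitespace-only strings
        s = " ";
        assertTrue(StringUtil.isEmpty(s));
        assertTrue(StringUtil.isBlank(s));

        s = " ";
        assertTrue(StringUtil.isEmpty(s));
        assertTrue(StringUtil.isBlank(s));

        // visible content surrounded by spaces
        s = " a ";
        assertFalse(StringUtil.isEmpty(s));
        assertFalse(StringUtil.isBlank(s));
    }
}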
Update: It was a really interesting discussion, and this is why I love Stack Overflow and the folks here. By the way, coming back to the question, we got:
[...] isBlank(). Thanks @devconsole.
Java String isEmpty() Method
The isEmpty() method checks whether a string is empty or not. This method returns true if the string is empty (length() is 0), and false if not.
The Java programming language distinguishes between null and empty strings. An empty string is a string instance of zero length, whereas a null string has no value at all. An empty string is represented as "". It is a character sequence of zero characters.
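A minimal sketch of that distinction, assuming a hypothetical helper named classify: the null check has to come first, because calling the instance method isEmpty() on a null reference throws a NullPointerException.

```java
class NullVersusEmpty {
    // Sorts a reference into one of three buckets; the null check must come first.
    static String classify(String s) {
        if (s == null) {
            return "null";
        }
        if (s.isEmpty()) {
            return "empty";
        }
        return "non-empty";
    }

    public static void main(String[] args) {
        System.out.println(classify(null));
        System.out.println(classify(""));
        // A single space has length 1, so String.isEmpty() reports false for it.
        System.out.println(classify(" "));
    }
}
```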
isEmpty(<string>) Checks if the <string> value is an empty string containing no characters or whitespace. Returns true if the string is null or empty.
Is there a string that will make isEmpty and isBlank behave differently in a test case?
Note that Character.isWhitespace can recognize Unicode characters and return true for Unicode whitespace characters.
Determines if the specified character is white space according to Java. A character is a Java whitespace character if and only if it satisfies one of the following criteria:
It is a Unicode space character (SPACE_SEPARATOR, LINE_SEPARATOR, or PARAGRAPH_SEPARATOR) but is not also a non-breaking space ('\u00A0', '\u2007', '\u202F').
[...]
On the other hand, the trim() method trims all control characters whose code points are below U+0020, as well as the space character (U+0020).
Therefore, the two methods behave differently in the presence of a Unicode whitespace character, for example "\u2008", or when the string contains control characters that are not considered whitespace by the Character.isWhitespace method, for example "\002".
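Here is a sketch that runs both example strings through the two methods from the question. The class name BlankVersusTrim is made up, and the method bodies are condensed versions of the question's code with the same behavior.

```java
class BlankVersusTrim {
    // The question's trim()-based check, condensed to a single expression.
    static boolean isEmpty(String s) {
        return s == null || s.trim().length() == 0;
    }

    // The Character.isWhitespace-based check from the question.
    static boolean isBlank(String str) {
        if (str == null) {
            return true;
        }
        for (int i = 0; i < str.length(); i++) {
            if (!Character.isWhitespace(str.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // U+2008 PUNCTUATION SPACE: Unicode whitespace, but above U+0020, so trim() keeps it.
        String punctuationSpace = "\u2008";
        System.out.println("U+2008: isEmpty=" + isEmpty(punctuationSpace)
                + ", isBlank=" + isBlank(punctuationSpace));

        // U+0002 (octal escape 002): trimmed by trim(), but not whitespace for isWhitespace.
        String startOfText = "\002";
        System.out.println("U+0002: isEmpty=" + isEmpty(startOfText)
                + ", isBlank=" + isBlank(startOfText));
    }
}
```

For "\u2008" only isBlank returns true, and for "\002" only isEmpty does, so either string works as the test case asked for.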
If you were to write a regular expression to do this (which is slower than looping through the string and checking):
- isEmpty() would be equivalent to .matches("[\\x00-\\x20]*")
- isBlank() would be equivalent to .matches("\\p{javaWhitespace}*")
(The isEmpty() and isBlank() methods both allow for a null String reference, so they are not exactly equivalent to the regex solution, but putting that aside, they are equivalent.)
Note that \p{javaWhitespace}, as its name implies, is Java-specific syntax to access the character class defined by the Character.isWhitespace method.
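The two patterns can be tried on the same example strings. This is a sketch only: RegexEquivalents and its method names are hypothetical, and, as noted, neither method handles a null argument.

```java
class RegexEquivalents {
    // Matches strings made only of characters that trim() would strip (U+0000 to U+0020).
    static boolean matchesTrimEmpty(String s) {
        return s.matches("[\\x00-\\x20]*");
    }

    // Matches strings made only of characters accepted by Character.isWhitespace.
    static boolean matchesWhitespaceOnly(String s) {
        return s.matches("\\p{javaWhitespace}*");
    }

    public static void main(String[] args) {
        String[] samples = {"", " \t", "\u2008", "\002", " a "};
        for (String s : samples) {
            System.out.println("length " + s.length()
                    + ": trim-empty=" + matchesTrimEmpty(s)
                    + ", whitespace-only=" + matchesWhitespaceOnly(s));
        }
    }
}
```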
Assuming there are none, is there any other consideration because of which I should choose isBlank and not use isEmpty?
It depends. However, I think the explanation above should be sufficient for you to decide. To sum up the difference:
- isEmpty() will consider the string empty if it contains only control characters [1] below U+0020 and the space character (U+0020).
- isBlank will consider the string empty if it contains only whitespace characters as defined by the Character.isWhitespace method, which includes Unicode whitespace characters.
[1] There is also the control character at U+007F DELETE, which is not trimmed by the trim() method.
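The footnote about U+007F can be checked with a short sketch. DeleteCharDemo and survivesTrim are illustrative names; Character.isISOControl is the standard JDK method for testing ISO control characters.

```java
class DeleteCharDemo {
    // True when a one-character string built from c is left untouched by trim().
    static boolean survivesTrim(char c) {
        return String.valueOf(c).trim().length() == 1;
    }

    public static void main(String[] args) {
        char delete = '\u007F';
        // DELETE is an ISO control character...
        System.out.println("isISOControl: " + Character.isISOControl(delete));
        // ...but its code is above U+0020, so trim() does not remove it...
        System.out.println("survivesTrim: " + survivesTrim(delete));
        // ...and Character.isWhitespace does not count it as whitespace either.
        System.out.println("isWhitespace: " + Character.isWhitespace(delete));
    }
}
```

So a string holding only U+007F is treated as non-empty by both methods from the question.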
The purpose of the two standard methods is to distinguish between these two cases:
- org.apache.commons.lang.StringUtils.isBlank(" ") will return true.
- org.apache.commons.lang.StringUtils.isEmpty(" ") will return false.
Your custom implementation of isEmpty() will return true.
UPDATE:
org.apache.commons.lang.StringUtils.isEmpty() is used to find out whether the String is null or has length 0.
org.apache.commons.lang.StringUtils.isBlank() takes it a step further. It not only checks whether the String is null or has length 0, but also checks whether it consists only of whitespace.
In your case, you're trimming the String in your isEmpty method. The only difference that could occur (the case where you give it " ") can't occur now, because you're trimming it (removing the leading and trailing whitespace, which in this case is like removing all spaces).
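To see the three behaviors side by side without pulling in the library, here is a sketch with stand-in methods. They mirror the behavior described above and are not the Commons Lang source; all names in the class are made up for illustration.

```java
class ThreeWayComparison {
    // Stand-in for the described StringUtils.isEmpty: null or zero length only.
    static boolean lengthOnlyIsEmpty(String s) {
        return s == null || s.length() == 0;
    }

    // Stand-in for the described StringUtils.isBlank: null, empty, or whitespace only.
    static boolean whitespaceAwareIsBlank(String s) {
        if (s == null) {
            return true;
        }
        for (int i = 0; i < s.length(); i++) {
            if (!Character.isWhitespace(s.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    // The question's custom isEmpty, which trims before measuring the length.
    static boolean trimmingIsEmpty(String s) {
        return s == null || s.trim().length() == 0;
    }

    public static void main(String[] args) {
        String singleSpace = " ";
        System.out.println("length-only isEmpty:  " + lengthOnlyIsEmpty(singleSpace));
        System.out.println("whitespace isBlank:   " + whitespaceAwareIsBlank(singleSpace));
        System.out.println("trimming isEmpty:     " + trimmingIsEmpty(singleSpace));
    }
}
```

For a single space, only the length-only check reports false, which is the distinction the library's two methods are meant to draw.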