
Checking for a not null, not blank String in Java

Tags: java, string

I am trying to check if a Java String is not null, not empty and not whitespace.

In my mind, this code should have been up to the job.

    public static boolean isEmpty(String s) {
        if ((s != null) && (s.trim().length() > 0))
            return false;
        else
            return true;
    }

As per documentation, String.trim() should work thus:

Returns a copy of the string, with leading and trailing whitespace omitted.

If this String object represents an empty character sequence, or the first and last characters of character sequence represented by this String object both have codes greater than '\u0020' (the space character), then a reference to this String object is returned.

However, apache/commons/lang/StringUtils.java does it a little differently.

    public static boolean isBlank(String str) {
        int strLen;
        if (str == null || (strLen = str.length()) == 0) {
            return true;
        }
        for (int i = 0; i < strLen; i++) {
            if ((Character.isWhitespace(str.charAt(i)) == false)) {
                return false;
            }
        }
        return true;
    }

As per documentation, Character.isWhitespace():

Determines if the specified character is white space according to Java. A character is a Java whitespace character if and only if it satisfies one of the following criteria:

  • It is a Unicode space character (SPACE_SEPARATOR, LINE_SEPARATOR, or PARAGRAPH_SEPARATOR) but is not also a non-breaking space ('\u00A0', '\u2007', '\u202F').
  • It is '\t', U+0009 HORIZONTAL TABULATION.
  • It is '\n', U+000A LINE FEED.
  • It is '\u000B', U+000B VERTICAL TABULATION.
  • It is '\f', U+000C FORM FEED.
  • It is '\r', U+000D CARRIAGE RETURN.
  • It is '\u001C', U+001C FILE SEPARATOR.
  • It is '\u001D', U+001D GROUP SEPARATOR.
  • It is '\u001E', U+001E RECORD SEPARATOR.
  • It is '\u001F', U+001F UNIT SEPARATOR.

If I am not mistaken (or maybe I am just not reading it correctly), String.trim() should remove any of the characters that Character.isWhitespace() checks for. All of the explicitly listed ones seem to be at or below '\u0020'.

In that case, the simpler isEmpty function seems to cover all the scenarios that the lengthier isBlank covers.

  1. Is there a string that will make the isEmpty and isBlank behave differently in a test case?
  2. Assuming there are none, is there any other consideration because of which I should choose isBlank and not use isEmpty?

For those interested in actually running a test, here are the methods and unit tests.

    public class StringUtil {

        public static boolean isEmpty(String s) {
            if ((s != null) && (s.trim().length() > 0))
                return false;
            else
                return true;
        }

        public static boolean isBlank(String str) {
            int strLen;
            if (str == null || (strLen = str.length()) == 0) {
                return true;
            }
            for (int i = 0; i < strLen; i++) {
                if ((Character.isWhitespace(str.charAt(i)) == false)) {
                    return false;
                }
            }
            return true;
        }
    }

And unit tests

    @Test
    public void test() {

        String s = null;
        assertTrue(StringUtil.isEmpty(s));
        assertTrue(StringUtil.isBlank(s));

        s = "";
        assertTrue(StringUtil.isEmpty(s));
        assertTrue(StringUtil.isBlank(s));

        s = " ";
        assertTrue(StringUtil.isEmpty(s));
        assertTrue(StringUtil.isBlank(s));

        s = "   ";
        assertTrue(StringUtil.isEmpty(s));
        assertTrue(StringUtil.isBlank(s));

        s = "   a     ";
        assertTrue(StringUtil.isEmpty(s) == false);
        assertTrue(StringUtil.isBlank(s) == false);
    }

Update: It was a really interesting discussion - and this is why I love Stack Overflow and the folks here. By the way, coming back to the question, we got:

  • A program showing which characters make the two methods behave differently. The code is at https://ideone.com/ELY5Wv. Thanks @Dukeling.
  • A performance related reason for choosing the standard isBlank(). Thanks @devconsole.
  • A comprehensive explanation by @nhahtdh. Thanks mate.
Asked by partha, May 06 '13 08:05



2 Answers

Is there a string that will make the isEmpty and isBlank behave differently in a test case?

Note that Character.isWhitespace is Unicode-aware and returns true for Unicode whitespace characters:

Determines if the specified character is white space according to Java. A character is a Java whitespace character if and only if it satisfies one of the following criteria:

  • It is a Unicode space character (SPACE_SEPARATOR, LINE_SEPARATOR, or PARAGRAPH_SEPARATOR) but is not also a non-breaking space ('\u00A0', '\u2007', '\u202F').

  • [...]

On the other hand, trim() removes leading and trailing characters whose code points are at or below U+0020, that is, the control characters below U+0020 and the space character itself (U+0020).

Therefore, the two methods behave differently in the presence of a Unicode whitespace character above U+0020, for example "\u2008", or when the string contains control characters that Character.isWhitespace does not consider whitespace, for example "\002".
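
As a quick check of the two examples above, here is a minimal sketch. The class name TrimVersusIsWhitespace is made up for illustration, and the two methods are condensed rewrites of the ones in the question, not the library source.

```java
class TrimVersusIsWhitespace {

    // The trim()-based check from the question (condensed).
    static boolean isEmpty(String s) {
        return s == null || s.trim().length() == 0;
    }

    // The loop-based check quoted in the question (condensed).
    static boolean isBlank(String s) {
        if (s == null) {
            return true;
        }
        for (int i = 0; i < s.length(); i++) {
            if (!Character.isWhitespace(s.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String punctuationSpace = "\u2008"; // Unicode space separator above U+0020
        String startOfText = "\u0002";      // control character below U+0020

        // trim() leaves U+2008 alone, but isWhitespace accepts it.
        System.out.println(isEmpty(punctuationSpace) + " " + isBlank(punctuationSpace)); // false true

        // trim() strips U+0002, but isWhitespace rejects it.
        System.out.println(isEmpty(startOfText) + " " + isBlank(startOfText)); // true false
    }
}
```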

If you were to write a regular expression to do this (which is slower than looping through the string and checking each character):

  • isEmpty() would be equivalent to .matches("[\\x00-\\x20]*")
  • isBlank() would be equivalent to .matches("\\p{javaWhitespace}*")

(The isEmpty() and isBlank() methods both accept a null String reference, so they are not exactly equivalent to the regex solution, but apart from that the behavior is the same.)

Note that \p{javaWhitespace}, as its name implies, is Java-specific syntax for the character class defined by the Character.isWhitespace method.
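
To see the two regular expressions side by side with the methods they mirror, here is a small sketch for non-null input. The class and method names (RegexEquivalents, emptyByRegex, blankByRegex) are invented for this example.

```java
import java.util.regex.Pattern;

class RegexEquivalents {

    // Matches strings made only of characters that trim() strips (U+0000 to U+0020).
    static final Pattern TRIM_REMOVABLE = Pattern.compile("[\\x00-\\x20]*");

    // Matches strings made only of characters accepted by Character.isWhitespace.
    static final Pattern JAVA_WHITESPACE = Pattern.compile("\\p{javaWhitespace}*");

    static boolean emptyByRegex(String s) {
        return TRIM_REMOVABLE.matcher(s).matches();
    }

    static boolean blankByRegex(String s) {
        return JAVA_WHITESPACE.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] samples = { "", " \t\n", "\u2008", "\u0002", " a " };
        for (String s : samples) {
            // Compare each regex with the direct check it is meant to mirror.
            boolean trimAgrees = emptyByRegex(s) == s.trim().isEmpty();
            boolean loopAgrees = blankByRegex(s) == s.chars().allMatch(Character::isWhitespace);
            System.out.println("length " + s.length() + ": " + trimAgrees + " " + loopAgrees);
        }
    }
}
```

Compiling the patterns once avoids recompiling them on every call, which String.matches would do.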

Assuming there are none, is there any other consideration because of which I should choose isBlank and not use isEmpty?

It depends. However, I think the explanation above should be sufficient for you to decide. To sum up the difference:

  • isEmpty() treats the string as empty if it contains only control characters [1] below U+0020 and the space character (U+0020).

  • isBlank() treats the string as empty if it contains only whitespace characters as defined by the Character.isWhitespace method, which includes Unicode whitespace characters.

[1] There is also the control character at U+007F DELETE, which is not removed by the trim() method.
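
The summary can be checked exhaustively for single-character strings. The following is a sketch in the spirit of the program linked in the question, not a copy of it; the class name DifferenceScan and its method are hypothetical. The exact list it prints depends on the Unicode version of the Java runtime, so no output is shown.

```java
import java.util.ArrayList;
import java.util.List;

class DifferenceScan {

    // Collects every char value for which the trim()-based check and
    // Character.isWhitespace disagree on a one-character string.
    static List<Integer> differingChars() {
        List<Integer> result = new ArrayList<>();
        // An int counter avoids overflow when the loop reaches Character.MAX_VALUE.
        for (int c = Character.MIN_VALUE; c <= Character.MAX_VALUE; c++) {
            String s = String.valueOf((char) c);
            boolean emptyAfterTrim = s.trim().isEmpty();
            boolean whitespace = Character.isWhitespace((char) c);
            if (emptyAfterTrim != whitespace) {
                result.add(c);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        for (int c : differingChars()) {
            System.out.printf("U+%04X%n", c);
        }
    }
}
```

U+007F DELETE does not appear in the result: trim() keeps it and Character.isWhitespace rejects it, so both checks agree that the string is not empty.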

Answered by nhahtdh, Sep 29 '22 13:09


The purpose of the two standard methods is to distinguish between these two cases:

org.apache.commons.lang.StringUtils.isBlank(" ") (will return true).

org.apache.commons.lang.StringUtils.isEmpty(" ") (will return false).

Your custom implementation of isEmpty() will return true.


UPDATE:

  • org.apache.commons.lang.StringUtils.isEmpty() checks whether the String is null or has length 0.

  • org.apache.commons.lang.StringUtils.isBlank() goes a step further. It not only checks whether the String is null or has length 0, but also whether it consists only of whitespace.

In your case, you trim the String in your isEmpty method, so the one case where the two standard methods differ (a string such as " ") can no longer occur: trimming a whitespace-only string removes every character and leaves a length of 0.
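
The three behaviors can be compared without adding the library to the classpath. In this sketch, isEmpty and isBlank are hand-written stand-ins that assume the semantics described above (they are not the Commons Lang source), and customIsEmpty is the method from the question; the class name CommonsStyleChecks is made up.

```java
class CommonsStyleChecks {

    // Stand-in for the library's isEmpty as described above: null or length 0 only.
    static boolean isEmpty(String s) {
        return s == null || s.length() == 0;
    }

    // Stand-in for the library's isBlank: null, length 0, or whitespace only.
    static boolean isBlank(String s) {
        if (s == null) {
            return true;
        }
        for (int i = 0; i < s.length(); i++) {
            if (!Character.isWhitespace(s.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    // The question's custom method, which trims before checking the length.
    static boolean customIsEmpty(String s) {
        return s == null || s.trim().length() == 0;
    }

    public static void main(String[] args) {
        String space = " ";
        System.out.println(isEmpty(space));       // false
        System.out.println(isBlank(space));       // true
        System.out.println(customIsEmpty(space)); // true
    }
}
```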

Answered by Maroun, Sep 29 '22 14:09