
Comparing a string with the empty string (Java)

I have a question about comparing a string with the empty string in Java. Is there a difference between comparing a string with the empty string using == and using equals? For example:

String s1 = "hi";
if (s1 == "")

or

if (s1.equals(""))  

I know that one should compare strings (and objects in general) with equals, and not ==, but I am wondering whether it matters for the empty string.

user42155 asked Feb 10 '09 10:02



2 Answers

s1 == "" 

is not reliable, as it tests reference equality rather than object equality (and String isn't strictly canonical).

s1.equals("") 

is better, but throws a NullPointerException if s1 is null. Better yet is:

"".equals(s1) 

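No null pointer exceptions: the method is called on the literal, which is never null, and it simply returns false when s1 is null.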
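The three comparisons can be put side by side in a small sketch. The class name EmptyStringChecks and the helper isEmpty are made up for illustration and are not part of the original answer.

```java
final class EmptyStringChecks {
    // Hypothetical helper: true only for a non-null string with no characters.
    static boolean isEmpty(String s) {
        // Calling equals on the literal means a null argument cannot cause an exception.
        return "".equals(s);
    }

    public static void main(String[] args) {
        String literal = "";
        String built = new String("");   // a distinct object with the same contents
        String missing = null;

        System.out.println(built == "");          // false: different references
        System.out.println(built.equals(""));     // true: same contents
        System.out.println("".equals(missing));   // false, and no exception
        System.out.println(isEmpty(literal));     // true
    }
}
```

Calling missing.equals("") instead would throw a NullPointerException, which is the case the reversed form avoids.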

EDIT: OK, a question was raised about canonical form. This article defines it as:

Suppose we have some set S of objects, with an equivalence relation. A canonical form is given by designating some objects of S to be "in canonical form", such that every object under consideration is equivalent to exactly one object in canonical form.

To give you a practical example: take the set of rational numbers (or "fractions", as they're commonly called). A rational number consists of a numerator and a denominator (divisor), both of which are integers. These rational numbers are equivalent:

3/2, 6/4, 24/16

Rational numbers are typically written such that the gcd (greatest common divisor) of the numerator and denominator is 1. So all of them will be simplified to 3/2. 3/2 can be viewed as the canonical form of this set of rational numbers.
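As a rough sketch of that idea in code, a class can reduce every fraction to lowest terms when it is constructed, so that equivalent inputs end up with the same stored form. The Rational class below is hypothetical and only meant to illustrate the reduction.

```java
final class Rational {
    final long num;
    final long den;

    // Hypothetical class: stores every fraction in lowest terms with a positive denominator.
    Rational(long num, long den) {
        if (den == 0) {
            throw new IllegalArgumentException("denominator must not be zero");
        }
        long g = gcd(Math.abs(num), Math.abs(den));
        long sign = den < 0 ? -1 : 1;
        this.num = sign * num / g;
        this.den = sign * den / g;
    }

    // Euclid's algorithm for the greatest common divisor of two non-negative numbers.
    static long gcd(long a, long b) {
        return b == 0 ? a : gcd(b, a % b);
    }

    @Override
    public String toString() {
        return num + "/" + den;
    }

    public static void main(String[] args) {
        System.out.println(new Rational(6, 4));    // 3/2
        System.out.println(new Rational(24, 16));  // 3/2
    }
}
```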

So what does it mean in programming when the term "canonical form" is used? It can mean a couple of things. Take for example this imaginary class:

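public class MyInt {
  private final int number;

  public MyInt(int number) { this.number = number; }

  // equals is defined by value; the relation below assumes this.
  @Override
  public boolean equals(Object o) {
    return o instanceof MyInt && ((MyInt) o).number == number;
  }

  @Override
  public int hashCode() { return number; }
}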

The hash code of the class MyInt is a canonical form of that class because for the set of all instances of MyInt, you can take any two elements m1 and m2 and they will obey the following relation:

m1.equals(m2) == (m1.hashCode() == m2.hashCode()) 

That relation is the essence of canonical form. A more common way this crops up is when you use factory methods on classes such as:

public class MyClass {
  private MyClass() { }

  // static, so that callers can obtain an instance without already having one
  public static MyClass getInstance(...) { ... }
}

Instances cannot be directly instantiated because the constructor is private. This is just a factory method. What a factory method allows you to do is things like:

  • Always return the same instance (abstracted singleton);
  • Just create a new instance with every call;
  • Return objects in canonical form (more on this in a second); or
  • whatever you like.

Basically, the factory method abstracts object creation. Personally, I think it would be an interesting language feature to force all constructors to be private to enforce the use of this pattern, but I digress.

What you can do with this factory method is cache your instances that you create such that for any two instances s1 and s2 they obey the following test:

(s1 == s2) == s1.equals(s2) 
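A minimal sketch of such a caching factory follows. The Label class and its of method are invented for this example; it assumes the cache key (here a String) fully determines the instance.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class Label {
    // One cached instance per distinct text (hypothetical example class).
    private static final Map<String, Label> CACHE = new ConcurrentHashMap<>();

    private final String text;

    private Label(String text) {
        this.text = text;
    }

    // Factory method: returns the cached instance, creating it on first use.
    static Label of(String text) {
        return CACHE.computeIfAbsent(text, Label::new);
    }

    String text() {
        return text;
    }

    public static void main(String[] args) {
        Label a = Label.of("blah");
        Label b = Label.of(new String("blah"));  // equal key, different String object
        System.out.println(a == b);              // true: same cached instance
        System.out.println(a.equals(b));         // true
    }
}
```

Because the constructor is private and every instance comes from the cache, the inherited Object.equals (which compares references) already agrees with value equality here.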

So when I say String isn't strictly canonical it means that:

String s1 = "blah";
String s2 = "blah";
System.out.println(s1 == s2); // true

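But as others have pointed out, you can break this by using:

String s3 = new String("blah"); // s3 == s1 is false

and restore it again by interning (intern() is an instance method, not a static one):

String s4 = s3.intern(); // s4 == s1 is true

So you can't rely on reference equality completely, and therefore you shouldn't rely on it at all.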

As a caveat to the above pattern, I should point out that controlling object creation with private constructors and factory methods doesn't guarantee that reference equality means object equality, because of serialization. Serialization bypasses the normal object creation mechanism. Josh Bloch covers this topic in Effective Java (originally in the first edition, when he talked about the typesafe enum pattern, which later became a language feature in Java 5), and you can get around it by implementing the (private) readResolve() method. But it's tricky. Class loaders will affect the issue too.
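The readResolve() approach might look roughly like this. Registry and roundTrip are hypothetical names; the sketch only covers the single-instance case and ignores the class loader issue.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

final class Registry implements Serializable {
    private static final long serialVersionUID = 1L;

    static final Registry INSTANCE = new Registry();

    private Registry() { }

    // Invoked by the serialization machinery after deserialization;
    // the freshly created copy is discarded in favor of the canonical instance.
    private Object readResolve() {
        return INSTANCE;
    }

    // Hypothetical helper: serializes an object to memory and reads it back.
    static Object roundTrip(Object o) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(o);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(INSTANCE) == INSTANCE); // true
    }
}
```

Without readResolve(), the round trip would produce a second Registry object and the comparison would print false.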

Anyway, that's canonical form.

cletus answered Sep 30 '22 15:09


It depends on whether the string is a literal or not. If you create the string with

new String("") 

then it will never match "" with the == operator, as shown below:

    String one = "";
    String two = new String("");
    System.out.println("one == \"\": " + (one == ""));
    System.out.println("one.equals(\"\"): " + one.equals(""));
    System.out.println("two == \"\": " + (two == ""));
    System.out.println("two.equals(\"\"): " + two.equals(""));

Output:

one == "": true
one.equals(""): true
two == "": false
two.equals(""): true

Basically, you always want to use equals().

tddmonkey answered Sep 30 '22 15:09