Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

new to java - trying to understand: checker |= (1 << val)

Tags:

java

the following code will check to see if you have any duplicate characters in the string, but i don't understand the if clause:

public static boolean isUniqueChars(String str) {
        int checker = 0;
        for (int i = 0; i < str.length(); ++i) {
            int val = str.charAt(i) - 'a';
            if ((checker & (1 << val)) > 0) 
                return false;
            checker |= (1 << val);
        }
        return true;
    }

I tried to look up some references, I am new to bit shifting, all i understand is that << shifts the binary number left or right. Can you explain to me how checker |= (1 << val) works ? and that 'if' statement as well.

like image 221
shamed fred Avatar asked Sep 15 '14 11:09

shamed fred


3 Answers

I was also going through this book Cracking the Code Interview and ended up googling for a clear explanations. Finally I understood the concept.

Here is the approach.

Note :

  1. We will assume, in the below code, that the string is only lower case ‘a’ through ‘z’. This will allow us to use just a single int.

  2. Java integer is of size 32

  3. Number of lower case alphabets is 26

So we can clearly set 0/1 (true or false) value inside one integer in decimal notation.

  1. It is similar to bool visited[32] . bool uses 1 byte. Hence you need 32 bytes for storing bool visited[32].

  2. Bit masking is a space optimization to this.

Lets start :

  1. You are looping through all the characters in the string.
  2. Suppose on i'th iteration you found character 'b' . You calculate its 0 based index.

int val = str.charAt(i) - 'a';

For 'b' it is 1. ie 98-97 .

  1. Now using left shift operator, we find the value of 2^1 => 2.

(1 << val) // 1<<1 => 10(binary)

Now let us see how bitwise & works

0 & 0 -> 0
0 & 1 -> 0
1 & 0 -> 0
1 & 1 -> 1

So by the below code :

(checker & (1 << val))

We check if the checker[val] == 0 . Suppose we had already encountered 'b'.

check = 0000 0000 0000 0000 0000 1000 1000 0010   &  
'b'   = 0000 0000 0000 0000 0000 0000 0000 0010 
 ----------------------------------------------
result= 0000 0000 0000 0000 0000 0000 0000 0010

ie decimal value = 2 which is >0

So you finally we understood this part.

 if ((checker & (1 << val)) > 0) 
                return false;
  1. Now if 'b' was not encountered, then we set the second bit of checker using bitwise OR.

( This part is called as bit masking. )

OR's Truth table

0 | 0 -> 0
0 | 1 -> 1
1 | 0 -> 1
1 | 1 -> 1

So

check = 0000 0000 0000 0000 0000 1000 1000 0000   |  
'b'   = 0000 0000 0000 0000 0000 0000 0000 0010 
 ----------------------------------------------
result= 0000 0000 0000 0000 0000 1000 1000 0010

So that simplifies this part:

checker |= (1 << val);  // checker = checker |  (1 << val);

I hope this helped someone !

like image 197
Arjun Sunil Kumar Avatar answered Oct 31 '22 22:10

Arjun Sunil Kumar


Seems like I am late to the party, but let me take a stab at the explanation.

First of all the AND i.e & operation:

0 & 0 = 0
1 & 0 = 0
0 & 1 = 0
1 & 1 = 1

So basically, if you are given a bit, and you want to find out if its 1 or 0, you just & it with a 1. If the result is 1 then you had a 1, else you had 0. We will use this property of the & below.

The OR i.e | operation

0 | 0 = 0
1 | 0 = 1
0 | 1 = 1
1 | 1 = 1  

So basically, if you are given a bit, and you want to do something to it so that the output is always 1, then you do an | 1 with it.

Now, In Java the type int is 4 bytes i.e. 32 bits. Thus we can use an int itself as a data-structure to store 32 states or booleans in simpler terms, since a bit can either be 0 or 1 i.e false or true. Since we assume that our string is composed of only lower case characters, we have enough space inside our int to store a boolean for each of the 26 chars!

So first we initialize our data-structure that we call checker to 0 which is nothing but 32 zeros: 0000000000000000000000.

So far so good?

Now we go through our string, for each character, first we get an integer representation of the character.

int val = str.charAt(i) - 'a';

We subtract a from it because we want our integer to be 0 based. So if vals:

a = 0 i.e. `0000000000000000000000`
b = 1 i.e. `0000000000000000000001`
c = 2 i.e. `0000000000000000000010`
d = 4 i.e. `0000000000000000000100`

Now as seen above, a is 32 zeros, but rest of the characters have a single 1 and 31 zeros. So when we use these characters, we left shift each of them by 1, i.e. (1 << val), so each of them have a single 1 bit, and 31 zero bits:

a = 1 i.e. `0000000000000000000001`
b = 2 i.e. `0000000000000000000010`
c = 4 i.e. `0000000000000000000100`
d = 8 i.e. `0000000000000000001000`

We are done with the setup. Now we do 2 things:

  1. First assume all characters are different. For every char we encounter, we want our datastructure i.e. checker to have 1 for that char. So we use our OR property descrived above to generate a 1 in our datastructure, and hence we do:

    checker = checker | (1 << val);
    

Thus checker stores 1 for every character we encounter.

  1. Now we come to the part where characters can repeate. So before we do step 1, we want to make sure that the checker already does not have a 1 at the position corresponding to the current character. So we check the value of

    checker & (1 << val)
    

So with help of the AND property described above, if we get a 1 from this operation, then checker already had a 1 at that position, which means we must have encountered this character before. So we immediately return false.

That's it. If all our & checks return 0, we finally return true, meaning there were no character repititions.

like image 27
rgamber Avatar answered Oct 31 '22 22:10

rgamber


1 << val is the same as 2 to the degree of val. So it's a number which has
just one one in its binary representation (the one is at position val+1, if you count from
the right side of the number to the left one).

a |= b means basically this: set in a all binary flags/ones from the
binary representation of b (and keep those in a which were already set).

like image 37
peter.petrov Avatar answered Nov 01 '22 00:11

peter.petrov