Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Removing duplicate character from array

While reading one book named Cracking the coding interview by Gayle Laakmann, i came across this question

Design an algorithm and write code to remove the duplicate characters in a string without using any additional buffer. NOTE: One or two additional variables are fine. An extra copy of the array is not.

and this code :-

 public static void removeDuplicates(char[] str) {
        if (str == null) {
            return;
        }
        int len = str.length;
        if (len < 2) {
            return;
        }

        int tail = 1;

        for (int i = 1; i < len; ++i) {
            int j;
            for (j = 0; j < tail; ++j) {
                if (str[i] == str[j]) {
                    break;
                }
            }
            if (j == tail) {
                str[tail] = str[i];
                ++tail;
            }
        }
        str[tail] = 0;
    }

which is supposed to remove duplicate character from the array. I don't quiet seem to understand what the algorithm is doing by replacing the same character again and again. I thought it's only me who feels that the algorithm is not working but infact when i ran this code it's giving me wrong outputs. Is this serious error in book or have i not understood the question?

like image 894
TCM Avatar asked Aug 03 '10 15:08

TCM


4 Answers

Algo seems to be working but not clearing leftover characters. Changed code to following and it works: Note: Replaced:

str[tail] = 0;

with :

    for(; tail < len;tail++){
        str[tail] = 0;
    }

public static void removeDuplicates(char[] str) {
        if (str == null) {
            return;
        }
        int len = str.length;
        if (len < 2) {
            return;
        }

        int tail = 1;

        for (int i = 1; i < len; ++i) {
            int j;
            for (j = 0; j < tail; ++j) {
                if (str[i] == str[j]) {
                    break;
                }
            }

            if (j == tail) {
                str[tail] = str[i];
                ++tail;
            }

        }
        for(; tail < len;tail++){
            str[tail] = 0;
        }

    }
like image 199
YoK Avatar answered Sep 29 '22 09:09

YoK


A solution using a bit-vector.

Time: O(n), where n = length of the string

Space: O(1)

void removeduplicatas(char str[]){
    int i, checker = 0, bitvalue = 0, value = 0, tail = 0;
    i = 0;
    tail = 0;
    while(str[i]){
        value = str[i] - 'a';
        bitvalue = 1 << value;
        if((checker & bitvalue) == 0 ){
            str[tail++] = str[i];
            checker |= bitvalue;
        }
        i++;
    }
    str[tail] = '\0';
}
like image 41
Thiago Avatar answered Sep 29 '22 09:09

Thiago


In Java arrays are of fixed size. So the called function cannot change the size of the input array if it finds any duplicates. Your function is just making the start index of the sub-array which has duplicates to 0. So when you print the array contents in the calling function the element which has been made 0 does not get printed but elements following it (if any) do get printed.

The answer by YoK makes all the elements of the sub-array that are duplicates to 0. So that when you print it in the calling function the duplicates don't get printed. But you need to remember that the size of the array is still unchanged.

Alternatively you can return the size of the sub-array which has unique characters. Which in your case is tail.

One more alternative is to pass the input as a StringBuffer and make the changes in-place as:

public static void removeDuplicates(StringBuffer str) {                        

        int len = str.length();

        // if the string as less than 2 char then it can't have duplicates.
        if (len < 2) {                         
                return;
        }

        // fist character will never be duplicate.
        // tail is the index of the next unique character.
        int tail = 1;

        // iterate from 2nd character.
        for (int i = 1; i < len; ++i) {
                int j;

                // is char at index i already in my list of uniq char?
                for (j = 0; j < tail; ++j) {
                        if (str.charAt(i) == str.charAt(j)) {
                                break;
                        }      
                }

                // if no then add it to my uniq char list.
                if (j == tail) {                       
                        str.setCharAt(tail, str.charAt(i));

                        // increment tail as we just added a new ele.
                        ++tail;
                }
        }
        // at this point the characters from index [0,tail) are unique
        // if there were any duplicates they are between [tail,input.length)
        // so truncate the length of input to tail.
        str.setLength(tail);
}

Ideone Link

like image 25
codaddict Avatar answered Sep 29 '22 08:09

codaddict


This is one solution using C++ and recursion to loop through each character of the string and using the above method of bitstring in a fixed width char. You need to make sure that the fixed wide string is longer than the necessary k type characters to check.

#include <cstdint>
#include <iostream>

bool CheckUniqueChars(char *string, uint32_t index, uint32_t checker){

char character = string[index];

if(character=='\0'){
    return true;
}else{
    int value = character - 'a';

    if((checker&(1<<value))>0){
        return false;
    }else{
       checker |= (1<<value);
       return CheckUniqueChars(string,++index,checker);
    }
   }
}


int main(int argc, char *argv[]){

    char *string = argv[1];
    uint32_t idx=0,checker=0;

 if(CheckUniqueChars(string,idx,checker)){
        std::cout << "all characters are unique" << std::endl;
 }else{
    std::cout << "there are duplicate characters" << std::endl;
 }

 return 0;
}
like image 28
Ethan Lim Avatar answered Sep 29 '22 09:09

Ethan Lim