
Can the Duplicate Characters in a string be Identified and Quantified in O(n)?

A comment suggests that there is an O(n) alternative to my O(n log n) solution to this problem:

Given string str("helloWorld") the expected output is:

l = 3
o = 2

My solution was to do this:

sort(begin(str), end(str));

// Compute finish inside the loop body so *start is never evaluated at cend(str)
for(auto start = adjacent_find(cbegin(str), cend(str)); start != cend(str);) {
   const auto finish = upper_bound(start, cend(str), *start);
   cout << *start << " = " << distance(start, finish) << endl;
   start = adjacent_find(finish, cend(str));
}

This is obviously limited by the cost of sorting str. I think an O(n) version would require a bucket sort solution? Is there anything more clever that I'm missing?

Jonathan Mee asked Jan 02 '18 16:01




2 Answers

Here's one way, which is O(N) at the expense of maintaining storage for every possible char value.

#include <iostream>
#include <string>
#include <limits.h> // for CHAR_MIN and CHAR_MAX. Old habits die hard.

int main()
{
    std::string s("Hello World");        
    int storage[CHAR_MAX - CHAR_MIN + 1] = {};
    for (auto c : s){
        ++storage[c - CHAR_MIN];
    }

    for (int c = CHAR_MIN; c <= CHAR_MAX; ++c){
        if (storage[c - CHAR_MIN] > 1){
            std::cout << (char)c << " " << storage[c - CHAR_MIN] << "\n";
        }
    }    
}

This portable solution is complicated by the fact that char can be signed or unsigned.
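
One common way to sidestep the signedness issue is to convert each char to unsigned char before indexing, so the index always falls in the range 0 to UCHAR_MAX. The following is only a minimal sketch of that variant: the function name duplicate_counts and the choice to return a vector of pairs are assumptions made for illustration and are not part of the answer above.

```cpp
#include <array>
#include <climits>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: returns (character, count) for every character that
// occurs more than once, ordered by unsigned character value.
std::vector<std::pair<char, int>> duplicate_counts(const std::string& s)
{
    // One zero-initialised slot per possible unsigned char value.
    std::array<int, UCHAR_MAX + 1> counts{};

    // O(N): converting to unsigned char keeps the index non-negative
    // whether plain char is signed or unsigned.
    for (char c : s) {
        ++counts[static_cast<unsigned char>(c)];
    }

    // O(UCHAR_MAX + 1): constant with respect to the input length.
    std::vector<std::pair<char, int>> result;
    for (std::size_t i = 0; i < counts.size(); ++i) {
        if (counts[i] > 1) {
            result.emplace_back(static_cast<char>(i), counts[i]);
        }
    }
    return result;
}
```

For "helloWorld" this should yield the pairs ('l', 3) and ('o', 2), matching the expected output in the question.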

Bathsheba answered Oct 02 '22 04:10


Here is what @bathsheba described, with improvements suggested by @Holt:

#include <string>
#include <climits>
#include <iostream>

void show_dup(const std::string& str) {
    const int sz = CHAR_MAX - CHAR_MIN + 1;
    int all_chars[sz] = { 0 };
    // O(N), where N is the length of the input string
    for(char c : str) {
        // Offset by CHAR_MIN so the index is never negative when char is signed
        int idx = c - CHAR_MIN;
        all_chars[idx]++;
    }
    // O(sz), a constant: 256 for an 8-bit char
    for(int i = 0; i < sz; i++) {
        if (all_chars[i] > 1) {
            std::cout << (char)(i + CHAR_MIN) << " = " << all_chars[i] << std::endl;
        }
    }
}

int main()
{
  std::string str("helloWorld");

  show_dup(str);
}
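
If reserving a slot for every possible char value is undesirable (for example with a wider character type), a hash map allows the same single pass with average constant-time updates, giving O(N) on average rather than in the worst case. This is a sketch of that swapped-in technique, not something either answer proposes, and the name duplicate_counts_hashed is hypothetical.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical helper: maps each character that occurs more than once
// to its number of occurrences.
std::unordered_map<char, int> duplicate_counts_hashed(const std::string& s)
{
    std::unordered_map<char, int> counts;

    // Single pass over the input; operator[] value-initialises a missing entry to 0.
    for (char c : s) {
        ++counts[c];
    }

    // Drop every character seen fewer than two times.
    for (auto it = counts.begin(); it != counts.end();) {
        if (it->second < 2) {
            it = counts.erase(it); // erase returns the iterator to the next element
        } else {
            ++it;
        }
    }
    return counts;
}
```

The iteration order of an unordered_map is unspecified, so the results may need sorting if the output order matters.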
Artavazd Balayan answered Oct 01 '22 04:10