Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How can I find repeated words in a vector of strings in C++?

Tags:

c++

string

vector

I have a std::vector<string> where each element is a word. I want to print the vector without repeated words!

I searched a lot on the web and I found lots of material, but I can't and I don't want to use hash maps, iterators and "advanced" (to me) stuff. I can only use plain string comparison == as I am still a beginner.

So, let my_vec a std::vector<std::string> initialized from std input. My idea was to read all the vector and erase any repeated word once I found it:

  for(int i=0;i<my_vec.size();++i){
    for (int j=i+1;j<my_vec.size();++j){
      if(my_vec[i]==my_vec[j]){
        my_vec.erase(my_vec.begin()+j); //remove the component from the vector
      }
    }
  }

I tried to test for std::vector<std::string> my_vec{"hey","how","are","you","fine","and","you","fine"}

and indeed I found

hey how are you fine and

so it seems to be right, but for instance if I write the simple vector std::vector<std::string> my_vec{"hello","hello","hello","hello","hello"}

I obtain

hello hello

The problem is that at every call to erase the dimension gets smaller and so I lose information. How can I do that?

like image 827
Vefhug Avatar asked Dec 23 '22 16:12

Vefhug


2 Answers

Minimalist approach to your existing code. The auto-increment of j is what is ultimately breaking your algorithm. Don't do that. Instead, only increment it when you do NOT remove an element.

I.e.

for (int i = 0; i < my_vec.size(); ++i) {
    for (int j = i + 1; j < my_vec.size(); ) {  // NOTE: no ++j
        if (my_vec[i] == my_vec[j]) {
            my_vec.erase(my_vec.begin() + j);
        }
        else ++j; // NOTE: moved to else-clause
    }
}

That is literally it.

like image 184
WhozCraig Avatar answered Dec 25 '22 05:12

WhozCraig


You can store the element element index to erase and then eliminate it at the end. Or repeat the cycle until no erase are performed.

First code Example:

std::vector<int> index_to_erase();

for(int i=0;i<my_vec.size();++i){
    for (int j=i+1;j<my_vec.size();++j){
      if(my_vec[i]==my_vec[j]){
        index_to_erase.push_back(j);
        
      }
    }
  }
//starting the cycle from the last element to the vector of index, in this 
//way the vector of element remains equal for the first n elements
for (int i = index_to_erase.size()-1; i >= 0; i--){
   my_vec.erase(my_vec.begin()+index_to_erase[i]); //remove the component from the vector
} 

Second code Example:

bool Erase = true;
while(Erase){
  Erase = false;
  for(int i=0;i<my_vec.size();++i){
    for (int j=i+1;j<my_vec.size();++j){
      if(my_vec[i]==my_vec[j]){
        my_vec.erase(my_vec.begin()+j); //remove the component from the vector
        Erase = true;
      }
    }
  }
}
like image 33
Zig Razor Avatar answered Dec 25 '22 05:12

Zig Razor