Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

std::set find behavior with char * type

I have below code line:

const char *values[] = { "I", "We", "You", "We"};
std::set<const char*> setValues;

for( int i = 0; i < 3; i++ ) {

    const char *val = values[i];
    std::set<const char*>::iterator it = setValues.find( val );

    if( it == setValues.end() ) {
        setValues.insert( val );
    }
    else {
        cout << "Existing value" << endl;
    }
}

With this I am trying to insert non-repeated values in a set, but somehow code is not hitting to print for existing element and duplicate value is getting inserted.

What is wrong here?

like image 385
user987316 Avatar asked Jul 29 '16 10:07

user987316


3 Answers

The std::set<T>::find uses a default operator < of the type T. Your type is const char*. This is a pointer to an address in memory so the find method just compares address in memory of given string to addresses in memory of all strings from set. These addresses are different for each string (unless compiler optimizes it out).

You need to tell std::set how to compare strings correctly. I can see that AnatolyS already wrote how to do it in his answer.

like image 129
NO_NAME Avatar answered Sep 19 '22 23:09

NO_NAME


You should define less predicate for const char* and pass into the set template to make the set object works correctly with pointers:

  struct cstrless {
    bool operator()(const char* a, const char* b) const {
      return strcmp(a, b) < 0;
    }
  };
  std::set<const char*, cstrless> setValues;
like image 42
AnatolyS Avatar answered Sep 20 '22 23:09

AnatolyS


Unless you use a custom comparison function object, std::set uses operator<(const key_type&,key_type&) by default. Two pointers are equal if, and only if they point to the same object.

Here is an example of three objects:

char a[] = "apple";
char b[] = "apple";
const char (&c)[6] = "apple"

First two are arrays, the third is an lvalue reference that is bound to a string literal object that is also an array. Being separate objects, their address is of course also different. So, if you were to write:

setValues.insert(a)
bool is_in_map = setValues.find("apple") != setValues.end();

The value of is_in_map would be false, because the set contains only the address of the string in a, and not the address of the string in the literal - even though the content of the strings are same.

Solution: Don't use operator< to compare pointers to c strings. Use std::strcmp instead. With std::set, this means using a custom comparison object. However, you aren't done with caveats yet. You must still make sure that the strings stay in memory as long as they are pointed to by the keys in the set. For example, this would be a mistake:

char a[] = "apple";
setValues.insert(a);
return setValues; // oops, we returned setValues outside of the scope
                  // but it contains a pointer to the string that
                  // is no longer valid outside of this scope

Solution: Take care of scope, or just use std::string.

(This answer plagiarises my own answer about std::map here)

like image 42
eerorika Avatar answered Sep 19 '22 23:09

eerorika