Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

C++ works fine at my computer but gets address sanitizer heap-buffer-overflow errors on leetcode

Tags:

c++

I was trying to solve leetcode problem "929. Unique Email Addresses", the code works fine at my computer on Visual Studio Code but when I pasted it on the leetcode, I got address sanitizer heap-buffer-overflow error. The code is shown below:

class Solution {
 public:
  int numUniqueEmails(vector<string>& emails) {
    string::iterator it;
    for (int i = 0; i < emails.size(); i++) {
      for (it = emails[i].begin(); *it != '@'; it++) {
        if (*it == '.') {
          emails[i].erase(it);
          it--;
        }
        if (*it == '+') {
          while (*it != '@') {
            emails[i].erase(it);
          }
          break;
        }
      }
      sort(emails.begin(), emails.end());
      for (int j = 0; j < emails.size(); j++) {
        if (emails[j] == emails[j + 1]) {
          emails.erase(emails.begin() + j);
          j--;
        }
      }
    }
    return emails.size();
  }
};

Can someone let me know why the error occurs?

like image 207
Liu john Avatar asked Feb 24 '19 21:02

Liu john


People also ask

What causes heap buffer overflow in C?

A heap overflow is a form of buffer overflow; it happens when a chunk of memory is allocated to the heap and data is written to this memory without any bound checking being done on the data.

How serious is heap overflow?

Consequences. An accidental overflow may result in data corruption or unexpected behavior by any process that accesses the affected memory area. On operating systems without memory protection, this could be any process on the system.

What is heap buffer overflow error?

A heap overflow condition is a buffer overflow, where the buffer that can be overwritten is allocated in the heap portion of memory, generally meaning that the buffer was allocated using a routine such as malloc().


1 Answers

Because the following loops:

for (it = emails[i].begin(); *it != '@'; it++)
   ...

and

while (*it != '@')
  ...

iterate with it over the string without never ever verifying if the end of the string was reached. If you have a single ill-formatted email address in the input data, you therefore go out of bounds. So UB.

In addition, you also go out of bounds here:

 for (int j = 0; j < emails.size(); j++) {
    if (emails[j] == emails[j + 1]) {  // what when j == size-1 ????

Finally a potentially nasty issue as well (but could affect the result): your first for loop should end before the sort(). Why? because with some bad luck, the sorting will change the order of the elements, causing an unprocessed address to move before the current i index and thus remain unprocessed.

Demo with some additional display of intermediary results and some extreme cases.

like image 70
Christophe Avatar answered Nov 11 '22 02:11

Christophe