Clearing memory securely and reallocations

Question

Following the discussion here, if you want a secure class for storing sensitive information (e.g. passwords) in memory, you have to:

memset/clear the memory before freeing it
reallocations must also follow the same rule: instead of using realloc, use malloc to create a new memory region, copy the old contents to the new region, and then memset/clear the old memory before finally freeing it
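The two rules above can be sketched as a small helper. This is only a minimal sketch, not taken from any library: `secure_wipe` and `secure_realloc` are hypothetical names, the caller is assumed to track the old size and to pass a non-zero new size, and the volatile byte loop is one common way to discourage the compiler from dropping the wipe rather than a guarantee.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical helper: overwrite n bytes with zeros through a volatile
// pointer so the compiler is discouraged from eliding the stores.
void secure_wipe(void* buf, std::size_t n) {
    volatile unsigned char* p = static_cast<volatile unsigned char*>(buf);
    for (std::size_t i = 0; i < n; ++i) p[i] = 0;
}

// Hypothetical replacement for realloc: allocate a new region, copy the
// old contents, wipe the old region, then free it. Returns nullptr and
// leaves the old block untouched if the allocation fails.
void* secure_realloc(void* old_ptr, std::size_t old_size, std::size_t new_size) {
    void* new_ptr = std::malloc(new_size);
    if (new_ptr == nullptr) return nullptr;
    if (old_ptr != nullptr) {
        std::size_t keep = old_size < new_size ? old_size : new_size;
        std::memcpy(new_ptr, old_ptr, keep);   // copy old to new
        secure_wipe(old_ptr, old_size);        // clear the old memory
        std::free(old_ptr);                    // only then free it
    }
    return new_ptr;
}
```

Note that the copy step here still goes through `memcpy`, which the answers below identify as the source of the leftover data in the core dump.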

This sounded good, so I wrote a test class to see whether it works. The test case appends the words "LOL" and "WUT", followed by a number, to this secure buffer class around a thousand times, destroys the object, and finally does something that causes a core dump.

Since the class is supposed to securely clear its memory before destruction, I should not be able to find "LOLWUT" in the core dump. I still found it, though, and wondered whether my implementation was just buggy. So I tried the same thing using the CryptoPP library's SecByteBlock:

#include <cryptopp/osrng.h>
#include <cryptopp/dh.h>
#include <cryptopp/sha.h>
#include <cryptopp/aes.h>
#include <cryptopp/modes.h>
#include <cryptopp/filters.h>
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <string>     // std::string
#include <unistd.h>   // sleep()
using namespace std;

int main(){
   {
      CryptoPP::SecByteBlock moo;

      int i;
      for(i = 0; i < 234; i++){
         // Append "LOL", "WUT" and the loop counter to the secure block.
         moo += (CryptoPP::SecByteBlock((byte*)"LOL", 3));
         moo += (CryptoPP::SecByteBlock((byte*)"WUT", 3));

         char buffer[33];
         sprintf(buffer, "%d", i);
         string thenumber (buffer);

         moo += (CryptoPP::SecByteBlock((byte*)thenumber.c_str(), thenumber.size()));
      }

      moo.CleanNew(0);
   }   // moo is destroyed here and should have wiped its memory

   sleep(1);

   *((int*)NULL) = 1;   // deliberately crash to produce a core dump

   return 0;
}

And then compile using:

g++ clearer.cpp -lcryptopp -O0

And then enable core dump

ulimit -c 99999999

Running it with core dumps enabled

./a.out ; grep LOLWUT core ; echo hello

gives the following output

Segmentation fault (core dumped)
Binary file core matches
hello

What is causing this? Did the whole memory region for the application realloc itself, because of the reallocation caused by SecByteBlock's append?

Also, this is SecByteBlock's documentation.

edit: After checking the core dump using vim, I got this: http://imgur.com/owkaw

edit2: updated code so it's more readily compilable, and compilation instructions

final edit3: It looks like memcpy is the culprit. See Rasmus' mymemcpy implementation on his answer below.

andrewdotn · Accepted Answer

Despite showing up in the coredump, the password isn’t actually in memory anymore after clearing the buffers. The problem is that memcpying a sufficiently long string leaks the password into SSE registers, and those are what show up in the coredump.

When the size argument to memcpy is greater than a certain threshold—80 bytes on the mac—then SSE instructions are used to do the memory copying. These instructions are faster because they can copy 16 bytes at a time in parallel instead of going character-by-character, byte-by-byte, or word-by-word. Here’s the key part of the source code from Libc on the mac:

LAlignedLoop:               // loop over 64-byte chunks
    movdqa  (%rsi,%rcx),%xmm0
    movdqa  16(%rsi,%rcx),%xmm1
    movdqa  32(%rsi,%rcx),%xmm2
    movdqa  48(%rsi,%rcx),%xmm3

    movdqa  %xmm0,(%rdi,%rcx)
    movdqa  %xmm1,16(%rdi,%rcx)
    movdqa  %xmm2,32(%rdi,%rcx)
    movdqa  %xmm3,48(%rdi,%rcx)

    addq    $64,%rcx
    jnz     LAlignedLoop

    jmp     LShort                  // copy remaining 0..63 bytes and done

%rcx is the loop index register, %rsi is the source address register, and %rdi is the destination address register. Each run around the loop, 64 bytes are copied from the source buffer to the 4 16-byte SSE registers xmm{0,1,2,3}; then the values in those registers are copied to the destination buffer.

There’s a lot more stuff in that source file to make sure that copies occur only on aligned addresses, to fill in the part of the copy that’s leftover after doing 64-byte chunks, and to handle the case where source and destination overlap.
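For readers less used to assembly, here is a rough C++ rendering of the same 64-byte loop using SSE2 intrinsics. It is a sketch only, not the Libc implementation: `chunked_copy` is a hypothetical name, it uses unaligned loads and stores instead of the aligned `movdqa`, it ignores overlapping buffers, and it falls back to a plain byte loop when SSE2 is not available. The compiler picks the actual registers, so the four locals are not guaranteed to land in xmm0 through xmm3.

```cpp
#include <cstddef>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

// Hypothetical illustration of a copy that moves 64 bytes per iteration
// through four 128-bit values, then finishes the tail byte by byte.
void chunked_copy(char* dst, const char* src, std::size_t n) {
    std::size_t i = 0;
#if defined(__SSE2__)
    for (; i + 64 <= n; i += 64) {
        // Load 64 bytes of the source into four 16-byte values.
        __m128i r0 = _mm_loadu_si128(reinterpret_cast<const __m128i*>(src + i));
        __m128i r1 = _mm_loadu_si128(reinterpret_cast<const __m128i*>(src + i + 16));
        __m128i r2 = _mm_loadu_si128(reinterpret_cast<const __m128i*>(src + i + 32));
        __m128i r3 = _mm_loadu_si128(reinterpret_cast<const __m128i*>(src + i + 48));
        // Store them to the destination. Nothing clears r0..r3 afterwards,
        // so the last chunk can linger in whichever registers held them.
        _mm_storeu_si128(reinterpret_cast<__m128i*>(dst + i), r0);
        _mm_storeu_si128(reinterpret_cast<__m128i*>(dst + i + 16), r1);
        _mm_storeu_si128(reinterpret_cast<__m128i*>(dst + i + 32), r2);
        _mm_storeu_si128(reinterpret_cast<__m128i*>(dst + i + 48), r3);
    }
#endif
    // Copy the remaining 0..63 bytes (or everything, without SSE2).
    for (; i < n; ++i) dst[i] = src[i];
}
```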

However—the SSE registers are not cleared after use! That means 64 bytes of the buffer that was copied is still present in the xmm{0,1,2,3} registers.

Here’s a modification of Rasmus’s program that shows this:

#include <ctype.h>
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <emmintrin.h>

inline void SecureWipeBuffer(char* buf, size_t n){
  volatile char* p = buf;
  asm volatile("rep stosb" : "+c"(n), "+D"(p) : "a"(0) : "memory");
}

int main(){
  const size_t size1 = 200;
  const size_t size2 = 400;

  char* b = new char[size1];
  for(int j=0;j<size1-10;j+=10){
    memcpy(b+j, "LOL", 3);
    memcpy(b+j+3, "WUT", 3);
    sprintf((char*) (b+j+6), "%d", j);
  }
  char* nb = new char[size2];
  memcpy(nb, b, size1);
  SecureWipeBuffer(b,size1);
  SecureWipeBuffer(nb,size2);

  /* Password is now in SSE registers used by memcpy() */
  union {
    __m128i a[4];
    char c;
  };
  asm ("MOVDQA %%xmm0, %0": "=x"(a[0]));
  asm ("MOVDQA %%xmm1, %0": "=x"(a[1]));
  asm ("MOVDQA %%xmm2, %0": "=x"(a[2]));
  asm ("MOVDQA %%xmm3, %0": "=x"(a[3]));
  for (int i = 0; i < 64; i++) {
      /* unsigned char keeps isprint() and %x well defined for bytes >= 0x80 */
      unsigned char p = *(&c + i);
      if (isprint(p)) {
        putchar(p);
      } else {
          printf("\\%x", p);   /* non-printable bytes as backslash + hex */
      }
  }
  putchar('\n');

  return 0;
}

On my mac, this prints:

0\0LOLWUT130\0LOLWUT140\0LOLWUT150\0LOLWUT160\0LOLWUT170\0LOLWUT180\0\0\0

Now, examining the core dump, the password occurs only once, as that exact 0\0LOLWUT130\0...180\0\0\0 string. The core dump has to contain a copy of all registers, which is why that string is there: it is the contents of the xmm{0,1,2,3} registers.

So the password isn’t actually in RAM anymore after calling SecureWipeBuffer, it only appears to be because it is actually in some registers that only appear in the coredump. If you’re worried about memcpy having a vulnerability that could be exploited by RAM-freezing, worry no more. If having a copy of the password in registers bothers you, use a modified memcpy that doesn’t use the SSE2 registers, or clears them when it’s done. And if you’re really paranoid about this, keep testing your coredumps to make sure the compiler isn’t optimizing away your password-clearing code.
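Both suggestions can be sketched roughly as follows. The names `copy_without_simd` and `clear_xmm0_to_xmm3` are hypothetical, and the register-clearing part assumes x86-64 with GCC or Clang inline assembly; elsewhere it compiles to a no-op. Treat it as a starting point rather than a guarantee: a different memcpy implementation may use other or wider registers, so which ones need clearing depends on the platform.

```cpp
#include <cstddef>

// Hypothetical byte-wise copy. The volatile qualifiers are intended to
// stop the optimizer from vectorizing the loop or turning it back into
// a call to memcpy.
void copy_without_simd(char* dst, const char* src, std::size_t n) {
    volatile char* d = dst;
    const volatile char* s = src;
    for (std::size_t i = 0; i < n; ++i) d[i] = s[i];
}

// Hypothetical helper: zero the four registers used by the copy loop
// shown above. Assumes x86-64 and GCC/Clang extended asm; it only
// clears the low 128 bits of each register.
void clear_xmm0_to_xmm3() {
#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
    asm volatile(
        "pxor %%xmm0, %%xmm0\n\t"
        "pxor %%xmm1, %%xmm1\n\t"
        "pxor %%xmm2, %%xmm2\n\t"
        "pxor %%xmm3, %%xmm3"
        :
        :
        : "xmm0", "xmm1", "xmm2", "xmm3");
#endif
}
```

A caller would use either `copy_without_simd` in place of memcpy, or keep memcpy and call `clear_xmm0_to_xmm3()` right after it, before wiping the buffers.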

Rasmus Faber · Answer

Here is another program that reproduces the problem more directly:

#include <stdlib.h>
#include <stdio.h>
#include <string.h>

inline void SecureWipeBuffer(char* buf, size_t n){
  volatile char* p = buf;
  asm volatile("rep stosb" : "+c"(n), "+D"(p) : "a"(0) : "memory");
}

void mymemcpy(char* b, const char* a, size_t n){
  char* s1 = b;
  const char* s2= a;
  for(; 0<n; --n) *s1++ = *s2++;
}

int main(){
  const size_t size1 = 200;
  const size_t size2 = 400;

  char* b = new char[size1];
  for(int j=0;j<size1-10;j+=10){
    memcpy(b+j, "LOL", 3);
    memcpy(b+j+3, "WUT", 3);
    sprintf((char*) (b+j+6), "%d", j);
  }
  char* nb = new char[size2];
  memcpy(nb, b, size1);
  //mymemcpy(nb, b, size1);
  SecureWipeBuffer(b,size1);
  SecureWipeBuffer(nb,size2);

  *((int*)NULL) = 1;

  return 0;
}

If you replace memcpy with mymemcpy or use smaller sizes the problem goes away, so my best guess is that the builtin memcpy does something that leaves part of the copied data in memory.

I guess this just shows that clearing sensitive data from memory is practically impossible unless it is designed into the entire system from scratch.
