I'm very confused and I think my debugger is lying to me. I have the following loop in my code:
MyClass::UploadFile(CString strFile)
{
...
static DWORD dwLockWaitTime = EngKey::GetDWORD(DNENG_SERVER_UPLOAD_LOCK_WAIT_TIME, DNENG_SERVER_UPLOAD_LOCK_WAIT_TIME_DEFAULT);
static DWORD dwLockPollInterval = EngKey::GetDWORD(DNENG_SERVER_UPLOAD_LOCK_POLL_INTERVAL, DNENG_SERVER_UPLOAD_LOCK_POLL_INTERVAL_DEFAULT);
LONGLONG llReturnedOffset(0LL);
BOOL bLocked(FALSE);
for (DWORD sanity = 0; (sanity == 0 || status == RESUMABLE_FILE_LOCKED) && sanity < (dwLockWaitTime / dwLockPollInterval); sanity++)
{
...
This loop has been executed hundreds of times during the course of my program, and the two static variables are not changed anywhere in the code: they're written just once, when they're statically initialized, and read in the loop condition and in one other place. Since they're user settings read from the Windows registry, they almost always have the values dwLockWaitTime = 60 and dwLockPollInterval = 5, so the loop bound is always 60 / 5.
Very rarely, I get a crash dump which shows that this line of code has thrown a division by zero error. I've checked what WinDbg says and it shows:
FAULTING_IP:
procname!CServerAgent::ResumableUpload+54a [serveragent.cpp @ 725]
00000001`3f72d74a f73570151c00 div eax,dword ptr [proc!dwLockPollInterval (00000001`3f8eecc0)]
EXCEPTION_RECORD: ffffffffffffffff -- (.exr 0xffffffffffffffff)
ExceptionAddress: 000000013f72d74a (proc!CServerAgent::ResumableUpload+0x000000000000054a)
ExceptionCode: c0000094 (Integer divide-by-zero)
ExceptionFlags: 00000000
NumberParameters: 0
ERROR_CODE: (NTSTATUS) 0xc0000094 - {EXCEPTION} Integer division by zero.
I've checked the assembler code and it shows that the crash occurred on this div instruction.
00000001`3f72d744 8b0572151c00 mov eax,dword ptr [dwLockWaitTime (00000001`3f8eecbc)]
00000001`3f72d74a f73570151c00 div eax,dword ptr [dwLockPollInterval (00000001`3f8eecc0)]
So as you can see, the value at 000000013f8eecbc was moved into eax, and then eax was divided by the value at 000000013f8eecc0.
What is stored at those two addresses, you ask?
0:048> dd 00000001`3f8eecbc
00000001`3f8eecbc 0000003c 00000005 00000001 00000000
00000001`3f8eeccc 00000000 00000002 00000000 00000000
00000001`3f8eecdc 00000000 7fffffff a9ad25cf 7fffffff
00000001`3f8eecec a9ad25cf 00000000 00000000 00000000
00000001`3f8eecfc 00000000 00000000 00000000 00000000
00000001`3f8eed0c 00000000 00000000 00000000 00000000
00000001`3f8eed1c 00000000 00000000 00000000 00000000
00000001`3f8eed2c 00000000 00000000 00000000 00000000
0:048> dd 000000013f8eecc0
00000001`3f8eecc0 00000005 00000001 00000000 00000000
00000001`3f8eecd0 00000002 00000000 00000000 00000000
00000001`3f8eece0 7fffffff a9ad25cf 7fffffff a9ad25cf
00000001`3f8eecf0 00000000 00000000 00000000 00000000
00000001`3f8eed00 00000000 00000000 00000000 00000000
00000001`3f8eed10 00000000 00000000 00000000 00000000
00000001`3f8eed20 00000000 00000000 00000000 00000000
00000001`3f8eed30 00000000 00000000 00000000 00000000
The constants 60 and 5, exactly as I'd expect. So where's the divide by zero? Is my debugger lying? Surely the divide-by-zero exception was raised by the hardware, so it can't have made a mistake about that? And if the divide by zero happened somewhere else in my code, what are the odds that the debugger would show the instruction pointer in exactly this place? I confess, I'm stumped.
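Since the code is part of a member function, and you're calling this function from multiple threads, the initialization of the static local variables is not thread-safe if you are using a compiler that does not conform to the C++11 standard. Thus you may get data races when those two static variables are initialized: a second thread can reach the division while the first thread is still reading the registry, and see a value that is still zero.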
With a C++11-conforming compiler, a static local variable is guaranteed to be initialized by the first thread that reaches its declaration, while any other threads wait until that initialization has completed.
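To make the pre-C++11 race concrete, here is a minimal sketch that replays one bad interleaving on a single thread, so the outcome is deterministic. It assumes the compiler guards the static local with a simple "initialization started" flag that is set before the initializer's result is stored; that is an assumption about the generated code, not something confirmed by the dump. `UnsafeStaticLocal` and `divisor_seen_by_second_thread` are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of what a pre-C++11 compiler may emit for
//   static DWORD dwLockPollInterval = EngKey::GetDWORD(...);
// Both the guard flag and the value start out zeroed, like static storage.
struct UnsafeStaticLocal {
    bool guard_set = false;     // "initialization started" flag
    std::uint32_t value = 0;    // stays 0 until the initializer's result is stored

    // First caller, step 1: claim the initialization.
    void begin_init() { guard_set = true; }

    // First caller, step 2: store the initializer's result.
    void finish_init(std::uint32_t v) { value = v; }

    // Any other caller only checks the flag and then uses the value.
    bool looks_initialized() const { return guard_set; }
};

// Replays the interleaving "A starts init, B reads, A finishes init".
// Returns the divisor that thread B would have used.
std::uint32_t divisor_seen_by_second_thread() {
    UnsafeStaticLocal poll_interval;

    poll_interval.begin_init();              // thread A sets the guard...
                                             // ...and is still reading the registry
    std::uint32_t seen = 0;
    if (poll_interval.looks_initialized()) { // thread B: guard is set, skip init
        seen = poll_interval.value;          // thread B reads the zeroed value
    }

    poll_interval.finish_init(5);            // thread A finally stores 5
    assert(poll_interval.value == 5);        // the value a later crash dump shows
    return seen;
}
```

Under that assumption, thread B divides by 0, while a dump written a moment later already contains 5. That would explain why the debugger and the exception appear to disagree.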
For Visual Studio 2010 and below, static local variables are not guaranteed to be thread safe, since these compilers conform to the C++98 and C++03 standards.
For Visual Studio 2013, I am not sure of the level of C++11 support in terms of static local initialization. Therefore, for Visual Studio 2013, you may have to use proper synchronization to ensure that static local variables are initialized correctly.
For Visual Studio 2015, this item has been addressed and proper static local initialization is fully implemented, so the code you currently have should work correctly for VS 2015 and above.
Edit: For Visual Studio 2013, thread-safe initialization of static locals ("Magic Statics") is not implemented, as described here.
Therefore, we can tentatively conclude that the original problem is caused by non-thread-safe static local initialization combined with threading. So the solution (if you want to stick with VS 2013) is to use proper synchronization, or to redesign your application so that static local variables are no longer needed.
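As one possible shape for the "proper synchronization" option, here is a minimal sketch that initializes both settings under a mutex instead of relying on static local initialization. It assumes `<mutex>` is available in your toolchain and that the namespace-scope objects are constructed before any worker thread starts. `ReadSettingStub`, `UploadLockSettings`, `GetUploadLockSettings` and `MaxLockPolls` are hypothetical names; the stub stands in for `EngKey::GetDWORD` and simply returns the default it is given.

```cpp
#include <cstdint>
#include <mutex>

// Hypothetical stand-in for EngKey::GetDWORD, which reads the registry
// in the real code. Here it just returns the supplied default.
static std::uint32_t ReadSettingStub(std::uint32_t default_value) {
    return default_value;
}

struct UploadLockSettings {
    std::uint32_t wait_time;
    std::uint32_t poll_interval;
};

namespace {
// Namespace-scope objects are assumed to be constructed before the program
// starts its worker threads, so they avoid the static-local race.
std::mutex g_settings_mutex;
bool g_settings_ready = false;
UploadLockSettings g_settings = {0, 0};
}

// Returns a copy of the settings. The first caller initializes them while
// holding the lock; later callers wait for the lock and see complete values.
UploadLockSettings GetUploadLockSettings() {
    std::lock_guard<std::mutex> lock(g_settings_mutex);
    if (!g_settings_ready) {
        g_settings.wait_time = ReadSettingStub(60);
        g_settings.poll_interval = ReadSettingStub(5);
        g_settings_ready = true;  // set only after both values are stored
    }
    return g_settings;
}

// Loop bound from the question: dwLockWaitTime / dwLockPollInterval.
std::uint32_t MaxLockPolls() {
    const UploadLockSettings s = GetUploadLockSettings();
    return s.wait_time / s.poll_interval;
}
```

The lock is taken on every call, which is simple but not free. If the upload path is hot, reading the settings once at startup, before any threads exist, avoids the lock entirely.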