 

Why does THREAD_MODE_BACKGROUND_BEGIN cause my code to run 20x slower than THREAD_PRIORITY_LOWEST?

I won't bore you with the reasons, but my application can optionally perform integrity checking on very large files (up to 50 GB) using CRCs. Because I don't want to bring users' machines to a crawl when they turn this option on, I set the IoPriorityHintVeryLow hint on the file handle and also put the thread into background mode by passing THREAD_MODE_BACKGROUND_BEGIN to SetThreadPriority.

The time-consuming part of my code looks like this:

//
// Read one block of the changed data at a time, checking each CRC
//
DWORD blockNum = 0;
vector<BYTE> changeBuffer(DIRTY_BLOCK_SIZE);
outputDirtyBlockMap.reserve(crcList.size() / 8);
while (::ReadFile(hChangedFile, changeBuffer.data(), DIRTY_BLOCK_SIZE, &bytesRead, NULL) && bytesRead > 0)
{
    // Check for cancellation only every 500 blocks. Checking on every block reduces performance by 50%, since WaitForSingleObject is quite expensive
    if ((blockNum % 500 == 0) && IsCancelEventSignalled(hCancel))
    {
        RETURN_TRACED(ERROR_CANCELLED);
    }

    // Increase the size of the dirty block map?
    ULONG mapByte = blockNum / 8;
    if (mapByte == outputDirtyBlockMap.size())
    {
        outputDirtyBlockMap.resize(mapByte + 1);
    }

    DWORD mapBitNum = blockNum & 0x7L;
    UCHAR mapBit = 1 << (7 - mapBitNum);
    if (driverDirtyBlockMap.size() > mapByte && (driverDirtyBlockMap[mapByte] & mapBit))
    {
        //
        // The bit is already set in the driver's block map, so we don't have to bother generating and comparing CRCs for this block
        //
        outputDirtyBlockMap[mapByte] |= mapBit;
    }
    else
    {
        // Validate that the CRC hasn't changed, if it has, mark it as such in the dirty block map
        DWORD newCrc = CRC::Crc32(changeBuffer.data(), changeBuffer.size());
        if ((blockNum >= crcList.size() || newCrc != crcList[blockNum]))
        {
            OPTIONAL_DEBUG(DEBUG_DIRTY_BLOCK_MAP & DEBUG_VERBOSE, "Detected change at block [%u], CRC [new 0x%x != old 0x%x]", blockNum, newCrc, blockNum < crcList.size() ? crcList[blockNum] : 0x0);

            // The CRC is changed or the file has grown, mark it as such in the dirty block map
            outputDirtyBlockMap[mapByte] |= mapBit;
        }
    }

    ++blockNum;
}
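
The byte and bit arithmetic in the loop above packs one flag per block, most significant bit first. A minimal standalone sketch of that mapping follows; `DirtyBit` and `LocateDirtyBit` are hypothetical names introduced here for illustration and are not part of the original code:

```cpp
#include <cstdint>

// Hypothetical helper type: where one block's flag lives in the dirty block map.
struct DirtyBit {
    std::uint32_t mapByte;  // index into the byte vector
    std::uint8_t  mapBit;   // single-bit mask within that byte
};

// Mirrors the loop's arithmetic: 8 blocks per byte, block 0 in the top bit.
constexpr DirtyBit LocateDirtyBit(std::uint32_t blockNum)
{
    return DirtyBit{ blockNum / 8,
                     static_cast<std::uint8_t>(1u << (7u - (blockNum & 0x7u))) };
}
```

Under this scheme block 0 maps to mask 0x80 of byte 0, block 7 to mask 0x01 of byte 0, and block 8 to mask 0x80 of byte 1.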

While profiling this code, I was very surprised to find that this loop takes 74 seconds to process a 500 MB file when the thread is in THREAD_MODE_BACKGROUND_BEGIN, but only 2.7 seconds with THREAD_PRIORITY_LOWEST. (Both figures are averages over around 8 runs.)

In both cases the test machine was idle apart from running this loop. So my question is:

Why does THREAD_MODE_BACKGROUND_BEGIN make this take so long? I'd have thought that on an otherwise idle machine it would run as quickly as at any other priority, since there is nothing else competing with it.
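
To put the two timings on a common scale, they can be converted to effective throughput. This is only a sketch of the arithmetic, assuming "500Mb" means 500 megabytes; `ThroughputMBps` and `PrintComparison` are hypothetical helper names:

```cpp
#include <cstdio>

// Effective throughput in MB/s for a run that processed `megabytes` in `seconds`.
constexpr double ThroughputMBps(double megabytes, double seconds)
{
    return megabytes / seconds;
}

// Prints the throughput of both measured runs and the slowdown between them.
void PrintComparison()
{
    const double background = ThroughputMBps(500.0, 74.0);  // THREAD_MODE_BACKGROUND_BEGIN
    const double lowest     = ThroughputMBps(500.0, 2.7);   // THREAD_PRIORITY_LOWEST
    std::printf("background: %.1f MB/s, lowest: %.1f MB/s, slowdown: %.1fx\n",
                background, lowest, lowest / background);
}
```

With the figures quoted here that works out to roughly 6.8 MB/s against roughly 185 MB/s, a slowdown of about 27x.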

Is there something I should know about this priority that I haven't been able to figure out from the docs?

Benj asked Mar 27 '17 16:03


2 Answers

Setting background mode has the following effects:

  • Sets I/O priority to Very Low
  • Sets Memory Priority to 1
  • Sets Absolute Thread Priority to 4

While setting the relative thread priority to LOWEST has the following effect:

  • Sets Relative Thread Priority to -2 (i.e. absolute priority 6, assuming the Normal process priority class)

So in general, whether you are I/O bound or CPU bound (but especially if you are I/O bound), you should expect a thread at absolute priority 4 with Very Low I/O priority and background memory priority (1) to perform considerably worse than a thread at priority 6 with Normal I/O priority and foreground memory priority (5).
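
The priority comparison above can be written out as a small sketch. It assumes a base priority of 8 for the Normal process priority class (which is what makes a relative -2 land on 6) and deliberately ignores the saturating levels such as idle and time critical. All names below are hypothetical and are not Windows API identifiers:

```cpp
// Assumed base priority of the Normal process priority class.
constexpr int kNormalClassBase = 8;

// Relative offset that THREAD_PRIORITY_LOWEST stands for.
constexpr int kRelativeLowest = -2;

// Absolute priority this answer attributes to background mode.
constexpr int kBackgroundModePriority = 4;

// Simplified model: absolute priority = class base + relative offset.
// Saturating levels (idle, time critical) are not modelled here.
constexpr int AbsolutePriority(int classBase, int relativeOffset)
{
    return classBase + relativeOffset;
}

static_assert(AbsolutePriority(kNormalClassBase, kRelativeLowest) == 6,
              "LOWEST in a Normal-class process lands on absolute priority 6");
static_assert(kBackgroundModePriority < AbsolutePriority(kNormalClassBase, kRelativeLowest),
              "background mode sits below THREAD_PRIORITY_LOWEST");
```

The CPU priority gap is only two levels; the I/O and memory priority changes listed above are the part that THREAD_PRIORITY_LOWEST does not touch at all.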

Alex answered Oct 01 '22 00:10


It is maybe not that surprising that THREAD_MODE_* behaves differently from THREAD_PRIORITY_*?

I don't know whether the exact differences are documented anywhere, but it would not surprise me if background mode tried to run everything on a single core (if the CPU supports core parking) and at a lower clock frequency.

The SetThreadPriority documentation also hints at changes to any I/O the thread performs:

The THREAD_PRIORITY_* values affect the CPU scheduling priority of the thread. For threads that perform background work such as file I/O, network I/O, or data processing, it is not sufficient to adjust the CPU scheduling priority; even an idle CPU priority thread can easily interfere with system responsiveness when it uses the disk and memory. Threads that perform background work should use the THREAD_MODE_BACKGROUND_BEGIN and THREAD_MODE_BACKGROUND_END values to adjust their resource scheduling priorities; threads that interact with the user should not use THREAD_MODE_BACKGROUND_BEGIN.

Have you measured whether the time is being lost in ReadFile or in the CRC calculation?
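
One way to take that measurement is to accumulate wall-clock time separately for each phase of the loop. The following is a minimal sketch using only the standard library; `PhaseClock` is a hypothetical helper and not part of the original code:

```cpp
#include <chrono>

// Accumulates wall-clock time spent in one phase of a loop.
struct PhaseClock {
    std::chrono::steady_clock::duration total{};

    // Runs `work`, adds its elapsed time to `total`, and returns its result.
    template <typename Work>
    auto Measure(Work&& work) -> decltype(work())
    {
        // The guard's destructor records the elapsed time, so this works
        // for callables that return a value as well as for void callables.
        struct Stop {
            PhaseClock& clock;
            std::chrono::steady_clock::time_point start;
            ~Stop() { clock.total += std::chrono::steady_clock::now() - start; }
        } stop{ *this, std::chrono::steady_clock::now() };
        return work();
    }

    // Accumulated time in seconds.
    double Seconds() const
    {
        return std::chrono::duration<double>(total).count();
    }
};
```

In the loop from the question, one `PhaseClock` would wrap the ReadFile call and a second one the CRC::Crc32 call, each passed as a lambda to `Measure`. Comparing the two `Seconds()` totals under both priority settings should show which phase accounts for the difference.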

Anders answered Sep 30 '22 23:09