 

Why threads starve even on a preemptive multitasking OS (Windows 7)

I wrote a Win32 application (in Delphi 7, which is 32-bit, using the TThread class) to create 100 threads. Each thread, when resumed, continuously increments (in a loop) a 64-bit counter associated with the thread object, so there is no locking or sharing of data.

If you let the system run for 10 to 15 seconds and then stop, you would expect roughly the same count in each thread. What I observed instead was that 81 threads ran under 400 million loops while the remaining ones looped more than 950 million times. The slowest thread got only 230 million, compared to 2,111 million for the fastest.

According to MSDN, preemptive multitasking is at the thread level (not the process level), so each of my threads should have gotten its time slice in round-robin fashion. What am I missing here, and why this discrepancy?

Edit 1: Machine configuration: Intel i7 quad-core at 3.4 GHz with hyper-threading turned on (8 hardware threads at a time), running Windows 7 Professional 64-bit (the test application is 32-bit).

Edit 2 (thread code): The test application is built with optimization turned on and without debug info, and is run outside the IDE.

type

  TMyThread = class(TThread)
  protected
    FCount: Int64;
  public
    constructor Create;
    procedure Execute; override;
    property Count: Int64 read FCount;
  end;


{ TMyThread }

constructor TMyThread.Create;
begin
  inherited Create(True);  // create suspended; the test code resumes it later
  FCount := 0;
end;  

procedure TMyThread.Execute;
begin
  inherited;  // TThread.Execute is abstract, so this inherited call does nothing
  // Spin until Terminate is called, counting loop iterations.
  while not Terminated do
  begin
    Inc(FCount);
  end;
end;
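
For reference, a minimal driver along these lines (not the original test harness; ThreadCount, RunTest, and the console output are illustrative assumptions) would look roughly like this:

const
  ThreadCount = 100;

// Requires Windows (Sleep) and SysUtils (Format) in the uses clause.
procedure RunTest;
var
  Threads: array[0..ThreadCount - 1] of TMyThread;
  I: Integer;
begin
  // Create all threads suspended so they can be released together.
  for I := 0 to ThreadCount - 1 do
    Threads[I] := TMyThread.Create;

  // Release them, let them spin for about 15 seconds, then ask them to stop.
  for I := 0 to ThreadCount - 1 do
    Threads[I].Resume;  // Delphi 7 API; later versions use Start
  Sleep(15000);
  for I := 0 to ThreadCount - 1 do
    Threads[I].Terminate;

  // Wait for each thread to finish and report its loop count.
  for I := 0 to ThreadCount - 1 do
  begin
    Threads[I].WaitFor;
    Writeln(Format('Thread %d: %d', [I, Threads[I].Count]));
    Threads[I].Free;
  end;
end;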
asked Aug 13 '12 by ssh



2 Answers

Round-robin scheduling is an obvious strategy for a kernel. That is, however, not the way the Windows scheduler works. It used to be, back in the Windows 9x days: that scheduler was very capable of giving the various VMs equal time. But not in the NT branch, started by Dave Cutler's group, where scheduling is purely based on priority.

Whatever thread has the highest priority gets the CPU. There's another chunk of code in Windows that tinkers with a thread's priority, modifying it from the default priority it got when the thread was created. That code is aware of things like a thread owning the window that's in the foreground, or a thread waiting on a synchronization object that just got signaled. Or the more bizarre scheduling tweaks that try to work around priority inversion, randomly giving a thread a chance to run even though it isn't its turn.
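
Those dynamic boosts can be switched off while experimenting. A minimal sketch, assuming SetThreadPriorityBoost is available from the Windows unit (it is a standard kernel32 export; the wrapper itself is illustrative):

// Disable dynamic priority boosting for a single thread.
// Requires Windows and SysUtils in the uses clause.
procedure DisableBoost(ThreadHandle: THandle);
begin
  if not SetThreadPriorityBoost(ThreadHandle, True) then
    RaiseLastOSError;
end;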

Focus on writing sane code first. Starting a hundred threads isn't a very sane thing to do. You are trying to consume resources that the machine doesn't actually have available; nobody has a machine with a hundred cores. Yet. Powers of two: get a machine with 128 cores first.

answered Sep 29 '22 by Hans Passant


I have reproduced and can confirm your results. Additionally, disabling thread priority boost doesn't change the distribution. GetThreadTimes reports that threads with higher counter values got more UserTime, and vice versa, while KernelTime seems to have no correlation with the values.

Thread 97: 1081,5928 Ke:0 Us:25116161
Thread 98: 1153,8029 Ke:0 Us:26988173
Thread 99: 704,6996  Ke:0 Us:16848108

Clearly, some threads really get to run more often than others.
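
For reference, the user and kernel times above can be queried with the GetThreadTimes API; a minimal sketch (the helper name is mine, not from the test code):

// Returns a thread's user-mode CPU time in 100-nanosecond units.
// Requires Windows and SysUtils in the uses clause.
function UserTime100ns(ThreadHandle: THandle): Int64;
var
  CreationTime, ExitTime, KernelTime, UserTime: TFileTime;
begin
  if not GetThreadTimes(ThreadHandle, CreationTime, ExitTime,
      KernelTime, UserTime) then
    RaiseLastOSError;
  Result := (Int64(UserTime.dwHighDateTime) shl 32) or UserTime.dwLowDateTime;
end;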

I haven't graphed the results, but I suppose what we're seeing is a normal distribution, which means the results depend on a number of factors, some of which are random.

I tried disabling hyper-threading (this kinda smoothed the results), then assigning each thread to a single physical processor (using SetThreadAffinityMask). In the second case, the values were much closer to each other.

SetThreadAffinityMask(Self.Handle, 1 shl (FIndex mod 4));
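
In context, that call sits at the top of Execute, before the counting loop; a sketch, assuming FIndex is a per-thread index field set when the threads are created:

procedure TMyThread.Execute;
begin
  // Pin this thread to one of the 4 physical cores before it starts counting.
  SetThreadAffinityMask(Self.Handle, 1 shl (FIndex mod 4));
  while not Terminated do
    Inc(FCount);
end;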

I can sort of understand how running on a hyper-threaded system can make some threads "unlucky": they are scheduled to compete with other threads on the same physical processor, and because of "soft affinity" to this virtual core they get to run on it again and again, thus scoring lower than others.

But as to why binding each thread to a fixed core helps on a non-hyperthreaded system, I don't know.

There are probably other random factors involved, such as activity on the cores from other processes. A thread can get "unlucky" if another process's thread associated with the same core suddenly wakes up and starts doing some (relatively) heavy work.

All of this is guessing though.

answered Sep 29 '22 by himself