
Why do eight processes with 2 threads each create more load than one process with 16 threads?

I have a simple program which starts n threads and creates some load on each thread. If I start only one thread, one core gets about 100% load. If I start one process with 16 threads (which means one thread per core), I only get about 80% load. If I start 8 processes with 2 threads each (which still means one thread per core), I get about 99% load. I don't use any locking in this sample.

What is the reason for this behavior? I understand that the load goes down if there are 100 threads working, because the OS has to do a lot of scheduling. But in this case there are only as many threads as cores.

It gets even stranger (for me at least). If I add a simple Thread.Sleep(0) in my loop, the load with one process and 16 threads increases to 95%.

Can anyone answer this, or provide a link with more information about this specific topic?

[Image: one process, 16 threads]

[Image: eight processes, 2 threads each]

[Image: one process, 16 threads, with Thread.Sleep(0)]

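    // Sample application which reads the number of threads to start from Console.ReadLine
    using System;
    using System.Collections.Generic;
    using System.Diagnostics;
    using System.Text;
    using System.Threading;
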
    class Program
    {
        static void Main(string[] args)
        {
            Console.WriteLine("Enter the number of threads to be started");
            int numberOfThreadsToStart;

            string input = Console.ReadLine();

            int.TryParse(input, out numberOfThreadsToStart);
            if(numberOfThreadsToStart < 1)
            {
                Console.WriteLine("No valid number of threads entered. Exit now");
                Thread.Sleep(1500);
                return;
            }

            List<Thread> threadList = new List<Thread>();
            Stopwatch sw = Stopwatch.StartNew();
            for (int i = 0; i < numberOfThreadsToStart; i++)
            {
                Thread workerThread = new Thread(MakeSomeLoad);
                workerThread.Start();
                threadList.Add(workerThread);
            }

            while (true)
            {
                Console.WriteLine("I'm spinning... ");
                Thread.Sleep(2000);
            }
        }

        static void MakeSomeLoad()
        {
            for (int i = 0; i < 100000000; i++)
            {

                for (int j = 0; j < i; j++)
                {
                    //uncomment the following line to increase the load
                    //Thread.Sleep(0);
                    StringBuilder sb = new StringBuilder();
                    sb.Append("hello world" + j);
                }
            }
        }
    }
Manuel asked Dec 17 '22 03:12


2 Answers

Your test looks very GC-heavy. With 16 threads in one process, the GC runs more often in that process, and since the client (workstation) GC isn't parallel, this leads to a lower load. In other words, you have 16 garbage-producing threads per GC thread.

On the other hand if you run 8 processes with two threads each, you get only two threads producing garbage for each GC thread, and the GC can work in parallel between those processes.
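One way to check whether the test really is GC-bound is to count collections while the allocation pattern from the question runs. The following is only a rough sketch: the class name GcPressureProbe and the iteration count are made up for illustration, and the absolute numbers will vary by runtime and machine.

```csharp
using System;
using System.Text;

static class GcPressureProbe
{
    // Repeats the allocation pattern from MakeSomeLoad for a fixed number of
    // iterations and returns how many gen-0 collections ran in the meantime.
    public static int CountGen0Collections(int iterations)
    {
        int before = GC.CollectionCount(0);
        for (int j = 0; j < iterations; j++)
        {
            StringBuilder sb = new StringBuilder();
            sb.Append("hello world" + j);
        }
        return GC.CollectionCount(0) - before;
    }

    static void Main()
    {
        // 5,000,000 is an arbitrary illustrative value.
        int collections = CountGen0Collections(5000000);
        Console.WriteLine("Gen-0 collections during the loop: " + collections);
    }
}
```

A count that grows quickly with the iteration count would support the idea that the threads spend a noticeable share of their time waiting for the collector.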

If you write a test that produces less garbage, and uses more CPU directly, you will likely get different results.
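A test along those lines could look like the sketch below. It is not the asker's program: the class CpuOnlyLoad, the method Spin and the iteration count are hypothetical names and values chosen for illustration. The point is that the loop body does only arithmetic and allocates nothing on the heap.

```csharp
using System;
using System.Threading;

static class CpuOnlyLoad
{
    // Pure arithmetic: nothing is allocated inside the loop,
    // so the garbage collector has no work to do.
    public static long Spin(long iterations)
    {
        long acc = 0;
        for (long i = 0; i < iterations; i++)
        {
            acc += i % 7;
        }
        return acc;
    }

    static void Main(string[] args)
    {
        // Thread count from the command line, defaulting to one per logical core.
        int threadCount = args.Length > 0 ? int.Parse(args[0]) : Environment.ProcessorCount;
        Thread[] workers = new Thread[threadCount];
        for (int t = 0; t < threadCount; t++)
        {
            // Printing the result keeps the computation observable.
            workers[t] = new Thread(() => Console.WriteLine(Spin(2000000000L)));
            workers[t].Start();
        }
        foreach (Thread w in workers)
        {
            w.Join();
        }
    }
}
```

If one process with 16 such threads reaches roughly the same load as eight processes with 2 threads each, that would point at the GC rather than the scheduler as the cause of the difference.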

(Note that this is only speculation: I didn't run your test, and since I only have a dual-core CPU, my results would differ from yours anyway.)

CodesInChaos answered Dec 18 '22 16:12


Something else to consider is that there are different modes to the garbage collector:

  • Server GC
  • Workstation GC - Concurrent (default except for ASP.NET)
  • Workstation GC - Non Concurrent

You can find some of the graphic details of each here.

Since your process uses lots of threads and allocates a whole lot of memory, you should try server GC.

The server GC is optimized for high throughput and high scalability in server applications where there is a consistent load and requests are allocating and deallocating memory at a high rate. The server GC uses one heap and one GC thread per processor and tries to balance the heaps as much as possible. At the time of a garbage collection, the GC threads work on their respective heaps and rendezvous at certain points. Since they all work on their own heaps, minimal locking is needed, which makes it very efficient in this type of situation.

You enable the server GC in your App.config:

<configuration>
 <runtime>
   <gcServer enabled="true" />
 </runtime>
</configuration> 

Note that this will only work on a multi-processor (or multi-core) system. If Windows reports only one processor, you will get Workstation GC - Non Concurrent instead.
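To confirm which mode a process actually ended up with, the runtime exposes it through System.Runtime.GCSettings. This is a small sketch; the class name GcModeReport is made up for illustration, and the output depends on your configuration and machine.

```csharp
using System;
using System.Runtime;

static class GcModeReport
{
    // Builds a one-line summary of the GC configuration the process is running with.
    public static string Describe()
    {
        string mode = GCSettings.IsServerGC ? "Server GC" : "Workstation GC";
        return mode
            + ", latency mode: " + GCSettings.LatencyMode
            + ", logical processors: " + Environment.ProcessorCount;
    }

    static void Main()
    {
        Console.WriteLine(Describe());
    }
}
```

Calling this at the start of the test program shows whether the gcServer setting took effect before you compare load figures.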

user957902 answered Dec 18 '22 16:12