Why are 1000 threads faster than a few?

Tags:

I have a simple program that searches linearly in an array of 2D points. I do 1000 searches into an array of 1 000 000 points.

The curious thing is that if I spawn 1000 threads, the program works as fast as when I span only as much as CPU cores I have, or when I use Parallel.For. This is contrary to everything I know about creating threads. Creating and destroying threads is expensive, but obviously not in this case.

Can someone explain why?

Note: this is a methodological example; the search algorithm is deliberately not meant do to optimal. The focus is on threading.

Note 2: I tested on an 4-core i7 and 3-core AMD, the results follow the same pattern!

Click to copy

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading;

/// <summary>
/// We search for closest points.
/// For every point in array searchData, we search into inputData for the closest point, 
/// and store it at the same position into array resultData;
/// </summary>
class Program
{
    class Point
    {
        public double X { get; set; }
        public double Y { get; set; }

        public double GetDistanceFrom (Point p)
        {
            double dx, dy;
            dx = p.X - X;
            dy = p.Y - Y;
            return Math.Sqrt(dx * dx + dy * dy);
        }
    }

    const int inputDataSize = 1_000_000;
    static Point[] inputData = new Point[inputDataSize];

    const int searchDataSize = 1000;
    static Point[] searchData = new Point[searchDataSize];
    static Point[] resultData = new Point[searchDataSize];

    static void GenerateRandomData (Point[] array)
    {
        Random rand = new Random();
        for (int i = 0; i < array.Length; i++)
        {
            array[i] = new Point()
            {
                X = rand.NextDouble() * 100_000,
                Y = rand.NextDouble() * 100_000
            };
        }
    }

    private static void SearchOne(int i)
    {
        var searchPoint = searchData[i];
        foreach (var p in inputData)
        {
            if (resultData[i] == null)
            {
                resultData[i] = p;
            }
            else
            {
                double oldDistance = searchPoint.GetDistanceFrom(resultData[i]);
                double newDistance = searchPoint.GetDistanceFrom(p);
                if (newDistance < oldDistance)
                {
                    resultData[i] = p;
                }
            }
        }
    }

    static void AllThreadSearch()
    {
        List<Thread> threads = new List<Thread>();
        for (int i = 0; i < searchDataSize; i++)
        {
            var thread = new Thread(
                obj =>
                {
                    int index = (int)obj;
                    SearchOne(index);
                });
            thread.Start(i);
            threads.Add(thread);
        }
        foreach (var t in threads) t.Join();
    }

    static void FewThreadSearch()
    {
        int threadCount = Environment.ProcessorCount;
        int workSize = searchDataSize / threadCount;
        List<Thread> threads = new List<Thread>();
        for (int i = 0; i < threadCount; i++)
        {
            var thread = new Thread(
                obj =>
                {
                    int[] range = (int[])obj;
                    int from = range[0];
                    int to = range[1];
                    for (int index = from; index < to; index++)
                    {
                        SearchOne(index);
                    }
                }
                );
            int rangeFrom = workSize * i;
            int rangeTo = workSize * (i + 1);
            thread.Start(new int[]{ rangeFrom, rangeTo });
            threads.Add(thread);
        }
        foreach (var t in threads) t.Join();
    }

    static void ParallelThreadSearch()
    {
        System.Threading.Tasks.Parallel.For (0, searchDataSize, 
                index =>
                {
                    SearchOne(index);
                });
    }

    static void Main(string[] args)
    {
        Console.Write("Generatic data...  ");
        GenerateRandomData(inputData);
        GenerateRandomData(searchData);
        Console.WriteLine("Done.");
        Console.WriteLine();

        Stopwatch watch = new Stopwatch();

        Console.Write("All thread searching... ");
        watch.Restart();
        AllThreadSearch();
        watch.Stop();
        Console.WriteLine($"Done in {watch.ElapsedMilliseconds} ms.");

        Console.Write("Few thread searching... ");
        watch.Restart();
        FewThreadSearch();
        watch.Stop();
        Console.WriteLine($"Done in {watch.ElapsedMilliseconds} ms.");

        Console.Write("Parallel thread searching... ");
        watch.Restart();
        ParallelThreadSearch();
        watch.Stop();
        Console.WriteLine($"Done in {watch.ElapsedMilliseconds} ms.");

        Console.WriteLine();
        Console.WriteLine("Press ENTER to quit.");
        Console.ReadLine();
    }
}

EDIT: Please make sure to run the app outside the debugger. VS Debugger slows down the case of multiple threads.

EDIT 2: Some more tests.

To make it clear, here is updated code that guarantees we do have 1000 running at once:

Click to copy

public static void AllThreadSearch()
{
    ManualResetEvent startEvent = new ManualResetEvent(false);
    List<Thread> threads = new List<Thread>();
    for (int i = 0; i < searchDataSize; i++)
    {
        var thread = new Thread(
        obj =>
        {
            startEvent.WaitOne();
            int index = (int)obj;
            SearchOne(index);
        });
        thread.Start(i);
        threads.Add(thread);
    }
    startEvent.Set();
    foreach (var t in threads) t.Join();
}

Testing with a smaller array - 100K elements, the results are:

1000 vs 8 threads

Click to copy

               Method |     Mean |    Error |    StdDev | Scaled |
--------------------- |---------:|---------:|----------:|-------:|
      AllThreadSearch | 323.0 ms | 7.307 ms | 21.546 ms |   1.00 |
      FewThreadSearch | 164.9 ms | 3.311 ms |  5.251 ms |   1.00 |
 ParallelThreadSearch | 141.3 ms | 1.503 ms |  1.406 ms |   1.00 |

Now, 1000 threads is much slower, as expected. Parallel.For still bests them all, which is also logical.

However, growing the array to 500K (i.e. the amount of work every thread does), things start to look weird:

1000 vs 8, 500K

Click to copy

               Method |     Mean |    Error |   StdDev | Scaled |
--------------------- |---------:|---------:|---------:|-------:|
      AllThreadSearch | 890.9 ms | 17.74 ms | 30.61 ms |   1.00 |
      FewThreadSearch | 712.0 ms | 13.97 ms | 20.91 ms |   1.00 |
 ParallelThreadSearch | 714.5 ms | 13.75 ms | 12.19 ms |   1.00 |

Looks like context-switching has negligible costs. Thread-creation costs are also relatively small. The only significant cost of having too many threads is loss of memory (memory addresses). Which, alone, is bad enough.

Now, are thread-creation costs that little indeed? We've been universally told that creating threads is very bad and context-switches are evil.

863

asked Oct 23 '18 09:10

Nick

1 Answers

You may want to consider how the application is accessing memory. In the maximum threads scenario you are effectively accessing memory sequentially, which is efficient from a caching point of view. The approach using a small number of threads is more random, causing cache misses. Depending on the CPU there are performance counters that allow you to measure L1 and L2 cache hits/misses.

132

answered Nov 15 '22 20:11

open-collar

Related questions
                            
                                When are tunneling and bubbling events useful in WPF?
                            
                                Should DTO and Entity both have input validations
                            
                                Return Json from Generic List in Web API
                            
                                Entity Framework Creates New / Duplicate Entries for Associated Objects
                            
                                Cannot change Target Framework on Several Projects From v4.0 to V4.5 or 4.5.1
                            
                                How to fix The model item passed into the dictionary is of type error?
                            
                                How do I encrypt data in Entity Framework Code First?
                            
                                Entity Framework: SqlGeography vs DbGeography
                            
                                How to use IEnumerable<String> in documenting code
                            
                                Unable to deserialize classes with multiple constructors with Json.NET [duplicate]
                            
                                Specified cast is not valid?
                            
                                Cast Entity to Implemented Interface in a Generic Method Using LINQ for Entity Framework
                            
                                Change foreign key constraint naming convention
                            
                                Windows Phone navigation buttons overlap with screen resolution
                            
                                In CQRS pattern, should work go in domain services or command handlers
                            
                                How do I convert encoding of a large file (>1 GB) in size - to Windows 1252 without an out-of-memory exception?
                            
                                Azure B2C. Edit custom attribute using Azure Portal
                            
                                Include filter child collection [duplicate]
                            
                                Invalid Switch syntax builds successfully?
                            
                                .NET Core External Authentication without ASP.NET Identity

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Why are 1000 threads faster than a few?

Tags:

c#

.net

multithreading

Nick

People also ask

1 Answers

open-collar

Recent Activity

Donate For Us