Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Iterating a dictionary in C#

var dict = new Dictionary<int, string>();
for (int i = 0; i < 200000; i++)
    dict[i] = "test " + i;

I iterated this dictionary using the code below:

foreach (var pair in dict)
    Console.WriteLine(pair.Value);

Then, I iterated it using this:

foreach (var key in dict.Keys)
    Console.WriteLine(dict[key]);

And the second iteration took ~3 seconds less. I can get both keys and values via both methods. What I wonder is whether the second approach has a drawback. Since the most rated question that I can find about this doesn't include this way of iterating a dictionary, I wanted to know why no one uses it and how does it work faster.

like image 669
Şafak Gür Avatar asked Jul 17 '12 19:07

Şafak Gür


People also ask

Can you iterate through a dictionary?

Dictionaries are iterable objects, which means you can iterate through them like any other object.

How do I iterate through a dictionary key?

In Python, to iterate the dictionary ( dict ) with a for loop, use keys() , values() , items() methods. You can also get a list of all keys and values in the dictionary with those methods and list() . Use the following dictionary as an example. You can iterate keys by using the dictionary object directly in a for loop.


1 Answers

Your time tests have some fundamental flaws:

  • Console.Writeline is an I/O operation, which takes orders of magnitude more time than memory accesses and CPU calculations. Any difference in iteration times is probably being dwarfed by the cost of this operation. It's like measuring the weights of pennies in a cast-iron stove.
  • You don't mention how long the overall operation took, so saying that one took 3 seconds less than another is meaningless. If it took 300 seconds to run the first, and 303 seconds to run the second, then you are micro-optimizing.
  • You don't mention how you measured running time. Did running time include the time getting the program assembly loaded and bootstrapped?
  • You don't mention repeatability: Did you run these operations several times? Several hundred times? In different orders?

Here are my tests. Note how I try my best to ensure that the method of iteration is the only thing that changes, and I include a control to see how much of the time is taken up purely because of a for loop and assignment:

void Main()
{
    // Insert code here to set up your test: anything that you don't want to include as
    // part of the timed tests.
    var dict = new Dictionary<int, string>();
    for (int i = 0; i < 2000; i++)
        dict[i] = "test " + i;
    string s = null;
    var actions = new[]
    {
        new TimedAction("control", () => 
        {
    for (int i = 0; i < 2000; i++)
            s = "hi";
        }),
        new TimedAction("first", () => 
        {
            foreach (var pair in dict)
            s = pair.Value;
        }),
        new TimedAction("second", () => 
        {
            foreach (var key in dict.Keys)
            s = dict[key];
        })
    };
    TimeActions(100, // change this number as desired.
        actions);
}


#region timer helper methods
// Define other methods and classes here
public void TimeActions(int iterations, params TimedAction[] actions)
{
    Stopwatch s = new Stopwatch();
    foreach(var action in actions)
    {
        var milliseconds = s.Time(action.Action, iterations);
        Console.WriteLine("{0}: {1}ms ", action.Message, milliseconds);
    }

}

public class TimedAction
{
    public TimedAction(string message, Action action)
    {
        Message = message;
        Action = action;
    }
    public string Message {get;private set;}
    public Action Action {get;private set;}
}

public static class StopwatchExtensions
{
    public static double Time(this Stopwatch sw, Action action, int iterations)
    {
        sw.Restart(); 
        for (int i = 0; i < iterations; i++)
        {
            action();
        }
        sw.Stop();

        return sw.Elapsed.TotalMilliseconds;
    }
}
#endregion

Result

control: 1.2173ms
first: 9.0233ms
second: 18.1301ms

So in these tests, using the indexer takes roughly twice as long as iterating key-value pairs, which is what I would expect*. This stays roughly proportionate if I increase the number of entries and the number of repetitions by an order of magnitude, and I get the same results if I run the two tests in reverse order.

* Why would I expect this result? The Dictionary class probably represents its entries as KeyValuePairs internally, so all it really has to do when you iterate it directly is walk through its data structure once, handing the caller each entry as it comes to it. If you iterate just the Keys, it still has to find each KeyValuePair, and give you the value of the Key property from it, so that step alone is going to cost roughly the same amount as iterating across it in the first place. Then you have to call the indexer, which has to calculate a hash for provided key, jump to the correct hashtable bucket, and do an equality check on the keys of any KeyValuePairs it finds there. These operations aren't terribly expensive, but once you do them N times, it's roughly as expensive as if you'd iterated over the internal hashtable structure again.

like image 69
StriplingWarrior Avatar answered Oct 06 '22 11:10

StriplingWarrior