Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

std::pair<int, int> vs struct with two int's

In an ACM example, I had to build a big table for dynamic programming. I had to store two integers in each cell, so I decided to go for a std::pair<int, int>. However, allocating a huge array of them took 1.5 seconds:

std::pair<int, int> table[1001][1001]; 

Afterwards, I have changed this code to

struct Cell {     int first;     int second; }  Cell table[1001][1001]; 

and the allocation took 0 seconds.

What explains this huge difference in time?

like image 773
Etan Avatar asked Oct 22 '09 12:10

Etan


People also ask

Is std :: pair a struct?

Essentially std::pair is still a struct.

Is std :: pair contiguous?

std::pair is a struct, the standard says the compiler determines the layout though the order must be maintained, so in the instance of std::pair<char,char> your compiler may decide to place 3-byte padding after each char for optimal alignment, so no you can't assume contiguous memory layout - end of story.

How do you compare pairs in C++?

With C++ pair, use the comparison (==) operator: The comparison operator measures the first and second values of two sets, say pair1 and pair2, to see whether pair1. first is equal to pair2. first or not, and whether pair1. second is equal to pair2.

What is std :: pair?

std::pair is a class template that provides a way to store two heterogeneous objects as a single unit. A pair is a specific case of a std::tuple with two elements.


2 Answers

std::pair<int, int>::pair() constructor initializes the fields with default values (zero in case of int) and your struct Cell doesn't (since you only have an auto-generated default constructor that does nothing).

Initializing requires writing to each field which requires a whole lot of memory accesses that are relatively time consuming. With struct Cell nothing is done instead and doing nothing is a bit faster.

like image 95
sharptooth Avatar answered Sep 21 '22 04:09

sharptooth


The answers so far don't explain the full magnitude of the problem.

As sharptooth has pointed out, the pair solution initializes the values to zero. As Lemurik pointed out, the pair solution isn't just initializing a contiguous block of memory, instead it is calling the pair constructor for every element in the table. However, even that doesn't account for it taking 1.5 seconds. Something else is happening.

Here's my logic:

Assuming you were on an ancient machine, say running at 1.33ghz, then 1.5 seconds is 2e9 clock cycles. You've got 2e6 pairs to construct, so somehow each pair constructor is taking 1000 cycles. It doesn't take 1000 cycles to call a constructor that just sets two integers to zero. I can't see how cache misses would make it take that long. I would believe it if the number was less than 100 cycles.

I thought it would be interesting to see where else all these CPU cycles are going. I used the crappiest oldest C++ compiler I could find to see if I could attain the level of wastage required. That compiler was VC++ v6. In debug mode, it does something I don't understand. It has a big loop that calls the pair constructor for each item in the table - fair enough. That constructor sets the two values to zero - fair enough. But just before doing that, it sets all the bytes in a 68 byte region to 0xcc. That region is just before the start of the the big table. It then overwrites the last element of that region with 0x28F61200. Every call of the pair constructor repeats this. Presumably this is some kind of book keeping by the compiler so it knows which regions are initialized when checking for pointer errors at run time. I'd love to know exactly what this is for.

Anyway, that would explain where the extra time is going. Obviously another compiler may not be this bad. And certainly an optimized release build wouldn't be.

like image 30
Andrew Bainbridge Avatar answered Sep 20 '22 04:09

Andrew Bainbridge