Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Random memory accesses are expensive?

During optimizing my connect four game engine I reached a point where further improvements only can be minimal because much of the CPU-time is used by the instruction TableEntry te = mTable[idx + i] in the following code sample.

TableEntry getTableEntry(unsigned __int64 lock)
{
    int idx = (lock & 0xFFFFF) * BUCKETSIZE;
    for (int i = 0; i < BUCKETSIZE; i++)
    {
        TableEntry te = mTable[idx + i]; // bottleneck, about 35% of CPU usage
        if (te.height == NOTSET || lock == te.lock)
            return te;
    }
    return TableEntry();
}

The hash table mTable is defined as std::vector<TableEntry> and has about 4.2 mil. entrys (about 64 MB). I have tried to replace the vectorby allocating the table with new without speed improvement.

I suspect that accessing the memory randomly (because of the Zobrist Hashing function) could be expensive, but really that much? Do you have suggestions to improve the function?

Thank you!

Edit: BUCKETSIZE has a value of 4. It's used as collision strategy. The size of one TableEntry is 16 Bytes, the struct looks like following:

struct TableEntry
{                                       // Old New
    unsigned __int64 lock;              //   8   8
    enum { VALID, UBOUND, LBOUND }flag; //   4   4
    short score;                        //   4   2
    char move;                          //   4   1
    char height;                        //   4   1
                                        // -------
                                        //  24  16 Bytes
    TableEntry() : lock(0LL), flag(VALID), score(0), move(0), height(-127) {}
};

Summary: The function originally needed 39 seconds. After making the changes jdehaan suggested, the function now needs 33 seconds (the program stops after 100 seconds). It's better but I think Konrad Rudolph is right and the main reason why it's that slow are the cache misses.

like image 918
Christian Ammer Avatar asked Feb 16 '11 21:02

Christian Ammer


1 Answers

You are making copies of your table entry, what about using TableEntry& as a type. For the default value at the bottom a static default TableEntry() will also do. I suppose that is where you lose much time.

const TableEntry& getTableEntry(unsigned __int64 lock)
{
    int idx = (lock & 0xFFFFF) * BUCKETSIZE;
    for (int i = 0; i < BUCKETSIZE; i++)
    {
        // hopefuly now less than 35% of CPU usage :-)
        const TableEntry& te = mTable[idx + i];
        if (te.height == NOTSET || lock == te.lock)
            return te;
    }
    return DEFAULT_TABLE_ENTRY;
}
like image 148
jdehaan Avatar answered Oct 08 '22 18:10

jdehaan