Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Fastest x86 assembly code to synchronize access to an array? [closed]

What is the fastest x86 assembly code to synchronize access to an array in memory?

To be more precise: We have a malloc'ed continuous single-paged region in memory and the OS will not page-out this region for the duration of our experiment. One thread will write to the array, one thread will read from the array. the array is small, but larger than the atomic-write capability of your cpu (so that a separate lock is acutally required)

"fastest": the effective speed: Do not just assume the length of bytecode is significant but take into account the caching behavior of the lock and branching behavior regarding surrounding code.

It has to work on x86-32 and/or x86-64

It has to work on-top of (or descendents of) Windows since XP, Linux since kernel 2.2, or MaxOs X (in user-mode).

Please no "it depends"-responses: If it depends on anything I have not specified here just make up your own example(s) and state what is fastest in that/those case(s).

Post code! (This is to prevent vague descriptions)

Post not only your 2-line LOCK + CMPXCHG compare&swap but show us how you integrate it with the read instructions in the one thread and the write-instructions in the other.

If you like, explain your tweaks for cache-optimality and how to avoid branch-mispredictions if the branch-target is dependant on (1) whether you get the lock or not (2) what the first byte of a larger-read is.

If you like distinguish between multiprocessing and task-switching: how will your code perform if the threads are not performed on 2 cpus but just get hold of one?

like image 700
Bernd Elkemann Avatar asked Mar 12 '11 16:03

Bernd Elkemann


People also ask

What does MOVQ %RSP %RBP do?

The second instruction is movq %rsp, %rbp . This saves the current stack pointer in %rbp (so %rbp = entry %rsp - 8). This adjusted value of %rbp is the callee's “frame pointer.” The callee will not change this value until it returns.

What does MOV mean in assembly?

mov — Move (Opcodes: 88, 89, 8A, 8B, 8C, 8E, ...) The mov instruction copies the data item referred to by its second operand (i.e. register contents, memory contents, or a constant value) into the location referred to by its first operand (i.e. a register or memory).

What is %rip in assembly?

The %rip register on x86-64 is a special-purpose register that always holds the memory address of the next instruction to execute in the program's code segment.

What is push in assembly?

"push" stores a constant or 64-bit register out onto the stack. The 64-bit registers are the ones like "rax" or "r8", not the 32-bit registers like "eax" or "r8d". ("push eax" gives an error "instruction not supported in 64-bit mode"; use "push rax" instead.) "pop" retrieves the last value pushed from the stack.


1 Answers

Really, the answer is "it depends". What's the usage pattern of your array? Is it read-mostly? Is it update-mostly and you can get away with imprecise results on reading (using per-cpu arrays)? Updates are so infrequent that RCU would give serious performance improvements?

There are lots of tradeoffs here, see Paul McKenney's book: Is Parallel Programming Hard, And, If So, What Can You Do About It?

like image 158
ninjalj Avatar answered Sep 21 '22 22:09

ninjalj