What is it about CMOV which improves CPU pipeline performance?

I understand that when a branch is easily predicted, it's better to use an IF statement because the branch is almost free. I have learnt that if the branch isn't easily predicted, a CMOV is better. However, I don't quite understand how this is achieved.

Surely the problem is still the same: we do not know the address of the next instruction to execute? So I don't understand how executing the CMOV, all the way down the pipeline, could have helped the instruction fetcher (10 CPU cycles back in the past) choose the correct path and prevent a pipeline stall.

Could somebody please help me understand how CMOV improves branching?

asked Nov 25 '14 21:11

user997112

1 Answer

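CMOV instructions don't direct the path of control flow. They are predicated instructions: they always execute, and they compute their result based on the condition codes. Some architectures (like 32-bit ARM) can predicate many kinds of instructions on condition codes, but x86 offers essentially only the conditional move (CMOV). A CMOV is fetched and decoded like any other instruction, and it executes with some latency to determine its result. Because the next instruction address is the same whichever way the condition turns out, the instruction fetcher never has to guess: the condition becomes a data dependency rather than a control dependency.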
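To make the "no control flow" point concrete, here is a rough C model of what a conditional move computes. The helper name `cmov_model` is hypothetical, and real hardware does not literally mask bits like this; it is only a sketch showing that the result is a plain function of both inputs and the condition, with no jump involved.

```c
#include <stdint.h>

/* Hypothetical model of CMOV semantics (not a real API):
   returns src if cond is nonzero, otherwise the old dst value.
   The function body contains no branch. */
uint32_t cmov_model(uint32_t dst, uint32_t src, int cond)
{
    /* mask is all ones when cond is true, all zeros otherwise */
    uint32_t mask = (uint32_t)0 - (uint32_t)(cond != 0);

    /* pick src where mask is set, keep dst elsewhere */
    return (src & mask) | (dst & ~mask);
}
```

Both `dst` and `src` have to be available before the result is known, which is the extra dependency a CMOV introduces compared with a correctly predicted branch.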

Branches, on the other hand, are predicted and actually steer the execution of instructions. The branch predictor "looks ahead" of the instruction fetcher, specifically looking for branch instructions, and predicts which path will be taken so that fetching can continue along it. Think of a railroad track where a person up ahead switches the track left or right to tell the train where to go. If that person chooses the wrong direction, the train has to stop, back up, and then move again in the right direction. Lots of time is wasted.

CMOVs, on the other hand, don't steer the flow. They are simply instructions that take extra time (and create additional dependencies) to work out the proper result of the move based on the condition codes. Think of the train, instead of deciding to go left or right, taking a straight path that requires no turn but is a bit slower (the reality is obviously far more complicated, but it's the best analogy I can come up with right now).
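As an illustration of the two styles, here is a small C sketch with hypothetical function names. The first loop is written as a branch; the second expresses the same logic as a value selection, which many x86 compilers can lower to a CMOV. That lowering is an assumption about the compiler, not a guarantee: depending on optimization settings, both functions may end up as branches, as CMOVs, or as vectorized code.

```c
#include <stddef.h>

/* Branch style: the compiler may emit a conditional jump here, in which
   case speed depends on how well the predictor guesses each iteration. */
long sum_above_branchy(const int *data, size_t n, int threshold)
{
    long sum = 0;
    for (size_t i = 0; i < n; i++) {
        if (data[i] >= threshold)
            sum += data[i];
    }
    return sum;
}

/* Select style: the condition only chooses between two values, so the
   compiler is free to use a conditional move instead of a jump. */
long sum_above_select(const int *data, size_t n, int threshold)
{
    long sum = 0;
    for (size_t i = 0; i < n; i++) {
        int add = (data[i] >= threshold) ? data[i] : 0;
        sum += add;
    }
    return sum;
}
```

Both functions return the same result for the same input. Any difference would show up only in timing, typically on data where the comparison outcome is hard to predict; inspecting the generated assembly is the way to confirm what the compiler actually chose.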

CMOVs used to be really bad (very high latency on older x86 microarchitectures) but have since improved to be pretty fast, making them a lot more usable and worthwhile for performance.

Hope this helps.

answered Sep 27 '22 17:09

drivingon9
