Each time I read about the inline keyword in C++ there's a long explanation that the compiler performs a "speed versus code size" analysis and then decides whether to inline a function call in each specific case.
Now Visual C++ 9 has a __forceinline keyword that seems to make the compiler inline the call to the function unless such inlining is absolutely impossible (for example, when the call is virtual).
Suppose I look through some project without understanding what goes on inside it, decide on my own that one third of the functions are small enough to be good candidates for inlining, and mark them with __forceinline. The compiler does inline them, and now the executable has become, say, one hundred times bigger.
Will it really matter? What effect should I expect from having functions inlined overly aggressively and ending up with an executable one hundred times bigger?
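For readers who have not seen the two keywords side by side, here is a minimal sketch of the difference the question describes. The macro name FORCE_INLINE and the function names are hypothetical, chosen only for illustration; __forceinline is the Visual C++ spelling, and the GCC/Clang branch assumes the always_inline attribute as the closest counterpart.

```cpp
// FORCE_INLINE is a hypothetical helper macro, not part of any library.
#if defined(_MSC_VER)
#  define FORCE_INLINE __forceinline
#elif defined(__GNUC__) || defined(__clang__)
#  define FORCE_INLINE inline __attribute__((always_inline))
#else
#  define FORCE_INLINE inline  // fall back to an ordinary hint
#endif

// Plain inline: only a hint, the compiler runs its own
// cost/benefit analysis for each call site.
inline int square_hint(int x) { return x * x; }

// Forced inline: the compiler is asked to skip that analysis and
// inline wherever it is able to (it can still refuse in some cases,
// such as a call that is dispatched virtually).
FORCE_INLINE int square_forced(int x) { return x * x; }

// Each call site below may receive its own copy of the function body,
// which is where the growth in code size comes from.
int sum_of_squares(int a, int b) {
    return square_hint(a) + square_forced(b);
}
```

Both functions compute the same result; the only difference is how much say the compiler keeps in the inlining decision.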
The main impact will be on the cache. Aggressive inlining works against the principle of locality: the same code is duplicated at every call site, so the CPU will have to fetch instructions from main memory far more often. So what was intended to make the code faster may actually make it slower.
Others have already mentioned the impact on cache. There's another penalty to pay. Modern CPUs are quite fast, but at a price. They have deep pipelines of instructions being processed. To keep these pipelines filled even in the presence of conditional branches, fast CPUs use branch prediction. They record how often a branch was taken and use that to predict whether a branch will be taken in the future.
Obviously, this history takes memory, and it is kept in a fixed-size table that holds entries for only a limited number of branch instructions. By increasing the number of instructions a hundredfold, you also increase the number of branches by about that much. This means the fraction of branches that have a prediction decreases sharply. In addition, for the branches that are present in the prediction table, less history is available.
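To make the table-pressure argument concrete, here is a toy simulation. It is a sketch under strong assumptions, not a model of any real CPU: the table is direct-mapped, each entry holds one 2-bit saturating counter, every branch is always taken, and the branches are visited round-robin, which is close to a worst case. The names Entry and mispredict_rate are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One slot of the toy prediction table (assumed layout).
struct Entry {
    bool valid = false;
    std::size_t tag = 0;  // which branch currently owns this slot
    int counter = 1;      // 0..3; values >= 2 predict "taken"
};

// Runs `rounds` passes over `branch_count` always-taken branches and
// returns the misprediction rate measured over the final pass only.
double mispredict_rate(std::size_t table_size,
                       std::size_t branch_count,
                       int rounds) {
    assert(table_size > 0 && branch_count > 0 && rounds > 0);
    std::vector<Entry> table(table_size);
    std::size_t misses = 0;

    for (int r = 0; r < rounds; ++r) {
        const bool last_pass = (r == rounds - 1);
        for (std::size_t b = 0; b < branch_count; ++b) {
            Entry& e = table[b % table_size];
            if (!e.valid || e.tag != b) {
                // No history for this branch: evict the previous owner
                // and restart from the default "weakly not taken" state.
                e.valid = true;
                e.tag = b;
                e.counter = 1;
            }
            const bool predicted_taken = (e.counter >= 2);
            if (last_pass && !predicted_taken) {
                ++misses;  // the branch is always taken in this toy
            }
            // Train the counter toward "taken", saturating at 3.
            if (e.counter < 3) {
                ++e.counter;
            }
        }
    }
    return static_cast<double>(misses) / static_cast<double>(branch_count);
}
```

With a 64-entry table, mispredict_rate(64, 64, 3) returns 0.0 because every branch keeps its own entry, while mispredict_rate(64, 6400, 3) returns 1.0 because a hundred branches compete for each slot and evict one another before any history builds up. Real predictors are far more elaborate, so treat this only as an illustration of the mechanism, not as a performance estimate.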