The Visual Studio C++ compiler option /O2 (maximize speed) is equivalent to
/Og /Oi /Ot /Oy /Ob2 /Gs /GF /Gy
Why /Gs? How does it help maximize speed? (Note that it is /Gs, not /GS.)
In Visual Studio, you can set compiler options for each project in its Property Pages dialog box. In the left pane, select Configuration Properties, then C/C++, and then choose the compiler option category.
Compiler options can adjust the size and alignment of supplied stacks, and optimizations can seriously change how the stack is created and accessed.
/Gs will insert stack probes in functions that use more than one page (typically 4 KB) of local variables. A stack probe signals to the OS that you'll use a lot of stack space. If this probe hits the guard page, the OS knows that it will need to allocate an extra page of RAM for the stack to grow.
This is an optimization, because without the probe the actual memory access would trigger the RAM allocation, and the function would stall until the RAM is allocated. The probe, as part of the function prolog, doesn't cause a stall (at least not as severe).
[edit] Another benefit is that a stack probe up front will allocate memory once. If you need 16 KB of stack space and rely on allocation-on-demand, you'll have 4 page faults that each grow the stack by 4 KB. A single probe can replace these 4 faults with one syscall.
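To make the 16 KB case concrete, here is a minimal sketch of the kind of function this answer is talking about. The name `fill_large_buffer` is hypothetical, the 4 KB page size is an assumption, and whether a probe actually appears in the prolog depends on the compiler and its options; the `volatile` is only there so the optimizer keeps the buffer on the stack.

```cpp
#include <cstddef>

// Hypothetical helper: its frame holds a 16 KB local buffer, i.e. four
// pages if a page is 4 KB. A frame this large is the kind of function
// for which the compiler is expected to emit a stack probe in the prolog.
std::size_t fill_large_buffer(unsigned char value) {
    // volatile so the buffer is really kept on the stack and not optimized away
    volatile unsigned char buffer[16 * 1024];

    // Write every byte, which touches all pages of the buffer.
    for (std::size_t i = 0; i < sizeof(buffer); ++i) {
        buffer[i] = value;
    }

    // Read everything back so the writes are observable.
    std::size_t sum = 0;
    for (std::size_t i = 0; i < sizeof(buffer); ++i) {
        sum += buffer[i];
    }
    return sum;
}
```

Calling `fill_large_buffer(1)` returns the buffer size in bytes, which makes it easy to confirm that the whole buffer was really written and read.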
/O2 doesn't set /Gs; that is an error in the documentation.
Some experimentation (it's easy to see the __chkstk calls in the generated assembly) shows that:
/Gs (with no number) is equivalent to /Gs0 and means always insert __chkstk calls. Indeed, MSDN says the same:
If the /Gs option is specified without a size argument, it is the same as specifying /Gs0.
/O2 does not set /Gs (aka /Gs0); there is a clear difference between "/O2" and "/O2 /Gs". Although it's possible that /O2 changes the default threshold to something other than one page, it seems more likely that this is just a documentation bug.
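The experiment can be repeated with a small test function along these lines. This is only a sketch: the file name `probe.cpp` and the function `small_frame` are made up for illustration, and the exact assembly will vary by compiler version and target. The frame is deliberately much smaller than a page, so any difference between the two listings should come from /Gs itself.

```cpp
// probe.cpp (hypothetical file name)
//
// Compile twice with an assembly listing (/FA) and compare the listings
// for calls to __chkstk:
//   cl /c /O2 /FA probe.cpp
//   cl /c /O2 /Gs /FA probe.cpp

// Hypothetical test function with a small frame (16 ints of locals).
int small_frame(int seed) {
    // volatile so the optimizer keeps the array on the stack
    volatile int scratch[16];

    for (int i = 0; i < 16; ++i) {
        scratch[i] = seed + i;
    }

    int total = 0;
    for (int i = 0; i < 16; ++i) {
        total += scratch[i];
    }
    return total;
}
```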
Stack probes are never good for performance. A probe only has a job to do when the stack is advancing to a new high-water mark; the rest of the time it is wasted cycles. This means that if you have a loop that calls a function 100 times, that function's stack probe might grow the stack the first time, but the other 99 times it changes nothing, because the stack was already grown the first time (if it needed to be grown at all).
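The high-water-mark argument can be illustrated with a toy model. This is not how the OS or __chkstk is implemented; `StackModel`, `run_loop` and the 4 KB page size are assumptions made purely for illustration. The model only counts how many probes run and how many of them actually grow the stack.

```cpp
#include <cstddef>

// Assumed page size for the model; real systems may differ.
constexpr std::size_t kPageSize = 4096;

// Toy model of a stack that grows on demand (not real OS behavior).
struct StackModel {
    std::size_t committed_pages = 0;  // high-water mark, in pages
    std::size_t probes = 0;           // probes executed
    std::size_t growths = 0;          // probes that actually grew the stack

    // Simulate the probe in the prolog of a function with the given frame size.
    void probe(std::size_t frame_bytes) {
        ++probes;
        // Round the frame size up to whole pages.
        std::size_t needed = (frame_bytes + kPageSize - 1) / kPageSize;
        if (needed > committed_pages) {
            committed_pages = needed;  // new high-water mark
            ++growths;
        }
    }
};

// Call the same function `calls` times from a loop. Each call returns
// before the next one starts, so the stack depth is the same every time.
StackModel run_loop(std::size_t calls, std::size_t frame_bytes) {
    StackModel model;
    for (std::size_t i = 0; i < calls; ++i) {
        model.probe(frame_bytes);
    }
    return model;
}
```

In this model, `run_loop(100, 16 * 1024)` executes 100 probes, but only the first one grows the stack (to 4 pages); the other 99 find it already large enough.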