In the Linux source tree, the file arch/x86/boot/header.S contains x86 code similar to the following, which clears the BSS section before main is called:
...
# Zero the bss
movw $__bss_start, %di
movw $_end+3, %cx
xorl %eax, %eax
subw %di, %cx
shrw $2, %cx
rep; stosl
...
Why does the _end address have 3 added to it? Why not movw $_end, %cx instead of movw $_end+3, %cx?
Had the code been clearing the BSS section byte by byte, movw $_end, %cx would have sufficed. However, this code doesn't zero the BSS with STOSB; it uses STOSL, because it is generally more efficient to store 32 bits at a time than 8 bits.
STOSL stores EAX (which is set to zero with xorl %eax, %eax), and the REP prefix repeats that store CX times, enough to clear the entire BSS range to 0. The +3 ensures that if the length of the BSS section (_end - __bss_start) is not evenly divisible by 4, the number of DWORDs to clear is rounded up. Without this rounding, whenever the size isn't evenly divisible by 4, the last 1 to 3 bytes would not be cleared.
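To make the rounding concrete, here is a minimal C sketch of the count calculation with and without the +3. The helper names (dwords_truncated, dwords_rounded_up, bytes_missed) are hypothetical and exist only for illustration; they are not part of the kernel. The arithmetic is kept in 16 bits to mirror the use of %cx and %di.

```c
#include <stdint.h>

/* Hypothetical helpers, not kernel code. Each works on the byte range
 * [start, end), standing in for [__bss_start, _end). */

/* Count computed if the code used plain $_end: the division truncates. */
static uint16_t dwords_truncated(uint16_t start, uint16_t end)
{
    return (uint16_t)(end - start) >> 2;
}

/* Count computed by the actual sequence: ($_end + 3 - %di) >> 2. */
static uint16_t dwords_rounded_up(uint16_t start, uint16_t end)
{
    return (uint16_t)(end + 3 - start) >> 2;
}

/* Number of bytes at the tail of the range that `dwords` 4-byte stores
 * would leave untouched. */
static int bytes_missed(uint16_t start, uint16_t end, uint16_t dwords)
{
    int length = (uint16_t)(end - start);
    int covered = 4 * dwords;
    return covered >= length ? 0 : length - covered;
}
```

For a 10-byte range such as 0x1000 to 0x100a, the truncating version yields 2 DWORDs and leaves the last 2 bytes untouched, while the rounded-up version yields 3 DWORDs and covers the whole range.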
The calculation assumes that __bss_start is a pointer to the beginning of the BSS section and that _end is a pointer to the end of the BSS. The number of 32-bit DWORDs to clear is effectively:
NUMDWORDS = (_end + 3 - __bss_start) >> 2
The shrw $2, %cx (>> 2 in the calculation) is an unsigned integer division by 4 whose result is always rounded down. Adding 3 to the number of bytes before dividing means the result is effectively rounded up to the next whole number of DWORDs. This value is then used as the number of DWORDs that STOSL sets to zero.
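The effect on memory can also be sketched in C by simulating the store loop over a small buffer. This is only an illustration under stated assumptions: rep_stosl_zero and leftover_after_clear are hypothetical names, the 64-byte image and the 0xAA fill pattern are arbitrary, and start and end are offsets into that buffer rather than real addresses.

```c
#include <stdint.h>
#include <string.h>

/* Simulates `rep; stosl` with EAX = 0: performs `count` 4-byte stores of
 * zero starting at `dst`. Hypothetical helper for illustration only. */
static void rep_stosl_zero(uint8_t *dst, uint16_t count)
{
    while (count--) {
        memset(dst, 0, 4);  /* one STOSL: store 4 zero bytes */
        dst += 4;           /* DI advances by 4 */
    }
}

/* Fills a fake image with 0xAA, clears [start, end) using
 * NUMDWORDS = (end + add - start) >> 2, and returns how many bytes in
 * that range are still nonzero. Pass add = 3 for the real calculation
 * and add = 0 for the truncating one.
 * Assumes end + 3 <= 64 so the last store stays inside the buffer. */
static int leftover_after_clear(uint16_t start, uint16_t end, uint16_t add)
{
    uint8_t image[64];
    memset(image, 0xAA, sizeof image);

    uint16_t count = (uint16_t)(end + add - start) >> 2;
    rep_stosl_zero(image + start, count);

    int leftover = 0;
    for (uint16_t i = start; i < end; i++)
        leftover += (image[i] != 0);
    return leftover;
}
```

With a 10-byte range, leftover_after_clear(8, 18, 3) reports no uncleared bytes, whereas leftover_after_clear(8, 18, 0) reports 2. Because the count is rounded up, the final store can write up to 3 bytes past the end of the range when the length is not a multiple of 4, which is why the sketch leaves room in the buffer.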