
Why can't we declare uninitialized variables in .bss section using `?` in arbitrary order?

Tags: x86, assembly, nasm

This code works as intended:

section .bss
    var2:      DB     ?     
    X:         DW     ?     ; works

With the reservations in opposite order, the code doesn't assemble:

section .bss
    X:         DW     ?     
    var2:      DB     ?    ; error with lines in other order

I get this error even though the label var2 is not used anywhere else in the program (in fact, it is reproducible by assembling just that code block as a 3-line file).

error: label `var2' changed during code generation [-w+error=label-redef-late]

I suspect that var2 is being overwritten by X, because X is a word (2 bytes).

I am using NASM version 2.15.04 to assemble this code (the error is also reproducible with 2.15.05).

Mocanu Gabriel asked Nov 17 '21 22:11



1 Answer

Update: my patch was merged and the issue should no longer be present from NASM 2.15.06.


After some debugging and poking around in the source code, I can confirm my initial suspicion that this is a bug in NASM.

The size calculation for instructions of the form Dx ? (i.e. any Dx that includes an uninitialized storage token ?) where Dx is larger than DB internally returns the wrong size: it assumes 1-byte elements instead of the appropriate element size. This has the side effect of inconsistently altering the segment offset of any label following the instruction. The resulting mismatch in the final code generation stage is caught by a couple of checks, which make NASM error out.
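For anyone stuck on an affected release (2.15.04 or 2.15.05), one possible workaround is to use the traditional RESx pseudo-instructions, which reserve uninitialized storage without going through the ? token. This is only a sketch: it assumes RESB/RESW are not affected by the same size miscalculation, which I have not verified against the source. Under that assumption, the failing example from the question could be written as:

```nasm
section .bss
    X:      resw 1      ; reserve one uninitialized word (2 bytes), in place of DW ?
    var2:   resb 1      ; reserve one uninitialized byte, in place of DB ?
```

If the assumption holds, var2 should end up at offset 2 in both the sizing and code generation passes, so the label no longer changes between them. Upgrading to a fixed NASM version remains the cleaner option.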

Normally I would've simply reported the bug, but since NASM's GitHub repo does not have an "Issues" page active and their Bugzilla currently disallows registration I went ahead and submitted a pull request. The fix seems quite simple, unless there's something that I'm missing, in which case we'll find out (hopefully) soon.

Marco Bonelli answered Oct 18 '22 04:10