
What are assembler section directives used for?

I'm trying to learn ARM assembly.

After writing this small Hello World program:

                .global _start

                .text
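_start:         ldr     R1,=msgtxt      @ R1 = address of the string
                mov     R2,#13          @ R2 = length in bytes
                mov     R0,#1           @ R0 = file descriptor 1 (stdout)
                mov     R7,#4           @ R7 = syscall number 4 (write)
                svc     0               @ call the kernel

                mov     R7,#1           @ R7 = syscall number 1 (exit)
                svc     0               @ call the kernel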


                .data
msgtxt:         .ascii  "Hello World!\n"

                .end

I noticed I could remove the .text and .data directives and the program would work just as well.

I'm therefore curious: everything I read emphasized that the .text section is to be used for code and .data for data. But here, before my eyes, they seem to do nothing at all!

Therefore, if these are not used to hold code and data respectively, what is their true purpose?

Ykon O'Clast asked Jan 27 '23 14:01



1 Answer

Directives like these tell the assembler which section the code or data that follows should go into; which sections exist, and where they end up in memory, depends on the target you're building your program for. In the end, everything is just a string of bytes. Once your program is assembled and linked, each symbol/label is assigned a memory address according to the section it is in.
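That is also why the program in the question keeps working without the directives. Assuming it is assembled with GNU as (the syntax suggests so), the assembler starts out in .text by default, so with no directives both the code and the string simply end up in .text. A sketch of what the directive-free version amounts to:

                .global _start

@ No section directive: GNU as starts in .text, so everything below lands there.
_start:         ldr     R1,=msgtxt      @ address of the string
                mov     R2,#13          @ length in bytes
                mov     R0,#1           @ file descriptor 1 (stdout)
                mov     R7,#4           @ syscall 4 = write
                svc     0

                mov     R7,#1           @ syscall 1 = exit
                svc     0

@ Also in .text: read-only, which is fine because the string is only ever read.
msgtxt:         .ascii  "Hello World!\n"

Listing the section headers of the object file (for example with readelf -S) should show the difference: with the directives there is a 13-byte .data section, without them .data is empty and .text is 13 bytes larger.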

.text is generally placed in read-only memory, which suits code that isn't expected to change.

.data is typically a writable section of memory. I believe it's quite common to put a string in .text, right next to the code, if it isn't expected to change (or in a similar read-only section such as .rodata, if the target has one). I would say that the .data section is even avoided much of the time. Why? Because the .data section needs to be initialized: its contents are stored in the program binary and copied into memory when the program starts. Most data that a program references can be read-only, and any working memory it needs is usually just reserved in the .bss section, which sets aside uninitialized memory without taking up space in the binary.
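A minimal sketch of that pattern, assuming GNU as and 32-bit ARM Linux (EABI) as in the question: the constant string stays read-only in .text, and the program does its work on a copy in a .bss buffer. The label names copy and buffer are made up for the example.

                .global _start

                .text
_start:         ldr     R0,=msgtxt      @ source: read-only string in .text
                ldr     R1,=buffer      @ destination: writable scratch space in .bss
                mov     R2,#13          @ number of bytes to copy
copy:           ldrb    R3,[R0],#1      @ load one byte, then advance the source
                strb    R3,[R1],#1      @ store it, then advance the destination
                subs    R2,R2,#1        @ count down and set the flags
                bne     copy            @ loop until all 13 bytes are copied

                ldr     R1,=buffer
                mov     R3,#74          @ ASCII code of 'J'
                strb    R3,[R1]         @ patch the copy; the original is untouched
                mov     R2,#13
                mov     R0,#1
                mov     R7,#4           @ write(1, buffer, 13)
                svc     0

                mov     R0,#0
                mov     R7,#1           @ exit(0)
                svc     0

msgtxt:         .ascii  "Hello World!\n"

                .bss
buffer:         .space  16              @ reserved at load time, not stored in the file

This should print "Jello World!". Doing the same strb directly on msgtxt would only work if the string lived in .data.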

There are some advantages to mixing code and data in the same section, such as easy access to the address of the data via a relative offset from the PC register (the address of the code being executed). Then of course there are the disadvantages: if you try to modify read-only memory, at best the write is silently ignored, and at worst it triggers an exception and the program crashes. This is all very architecture-specific, and the safest bet is to keep code in sections meant for code, and data/allocations in sections meant for data.
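Both sides of that trade-off can be sketched with the program from the question, again assuming GNU as and ARM Linux. With the string in .text, adr can compute its address as a PC-relative offset in a single instruction, whereas ldr R1,=msgtxt loads the address from a literal pool:

                .global _start

                .text
_start:         adr     R1,msgtxt       @ R1 = PC + offset to msgtxt (same section)
                mov     R2,#13
                mov     R0,#1
                mov     R7,#4           @ write(1, msgtxt, 13)
                svc     0

                @ mov   R3,#74
                @ strb  R3,[R1]         @ if enabled, this store would crash the program

                mov     R0,#0
                mov     R7,#1           @ exit(0)
                svc     0

msgtxt:         .ascii  "Hello World!\n"

The commented-out store is the disadvantage: on Linux, .text is normally mapped read-only, so writing to msgtxt there ends in a segmentation fault. adr also only works while the label is in the same section and close enough to the instruction.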

It's all very specific to what your program is targeting. For example, the Game Boy Advance had a 256KB "slow" memory region, a 32KB "fast" memory region, and a read-only "ROM" region (the game cartridge data) that can be several megabytes, and assemblers for it used these memory sections:

.data or .iwram  -> Internal RAM (32KB)
.bss             -> Internal RAM uninitialized
.ewram           -> External RAM (256KB)
.sbss            -> External RAM uninitialized
.text or .rodata -> Read only ROM (cartridge size)
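As a rough sketch of how those section names would be selected from assembly source: this assumes GNU as and a linker script that maps the names to the regions in the table (it will not link into anything useful without one). The labels fast_add, score and scratch are made up for the example.

                .section .iwram,"ax",%progbits  @ code that should run from fast internal RAM
                .arm
                .align  2
                .global fast_add
fast_add:       add     R0,R0,R1        @ return R0 + R1
                bx      LR

                .section .ewram,"aw",%progbits  @ initialized data in external RAM
                .align  2
score:          .word   100

                .section .sbss,"aw",%nobits     @ uninitialized external RAM
                .align  2
scratch:        .space  1024

The assembler only records which section each symbol belongs to; the linker script decides the actual addresses, which is where the target-specific memory map comes in.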

To give another example, the SPC-700 (SNES sound chip) had 64KB of readable and writable memory that was used for everything, but the first 256 bytes of it had faster access (the "zero page"). In this hypothetical case, .data and .text would be assigned to the same memory region; that is, neither would be allocated in the zero page, and they would share the same memory. There would be a custom section for the zero page, and the difference between .text and .data would be very small: just a way to distinguish which symbols in the assembled program point to "data" and which point to program code.

mukunda answered Feb 01 '23 09:02
