How do assembly languages work?

Tags:

assembly

grammar

hardware

I'm very curious how assembly languages work. I'm keeping this general because I'm not talking only about Intel x86 assembly (although it's the only one I'm remotely familiar with). To be a bit more clear:

mov %eax,%ebx

How does the computer know what an instruction like "mov" does? How does it know that eax and ebx are registers? Do people write grammars for assembly languages? How do they write them? I imagine nothing is stopping someone from writing an assembly language that replaces the mov instruction with something like dog or horse (obviously those names wouldn't be meaningful).

Sorry if this isn't too clear, but it's something I find a bit puzzling. I know it can't be magic, but I can't see how it works. I've looked up some stuff on Wikipedia, but all it seems to say is that assembly gets translated down to machine code. What I'm asking, I suppose, is how that translation occurs.

Thoughts?

EDIT: I realize that this stuff is defined in reference manuals and things. I guess what I wish to know is how you tell your processor "Okay, when you see mov you're gonna do this". I also know that it's probably a sequence of a ton of logic gates, but there has to be some way for the processor to recognize that mov is the symbol that means "use these logic gates".

799

asked Jun 24 '11 05:06

LainIwakura

1 Answer

Your CPU doesn’t execute assembly. The assembler converts it into machine code. This process depends on both the particular assembly language and the target computer architecture. Generally those go hand in hand, but you might find different flavors of assembly language for the same architecture (Intel syntax as used by NASM vs. AT&T syntax as used by the GNU assembler, for example), which all translate into similar machine code.
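As a rough illustration of the "different flavors, same machine code" point, here is a minimal Python sketch (not a real assembler). It accepts only a 32-bit register-to-register mov, in either AT&T or Intel operand order, and emits the same two bytes for both. The function names `encode_mov_rr`, `assemble_att` and `assemble_intel` are made up for this example; the encoding assumed is the `89 /r` form of mov, whose second byte (ModRM) is laid out as `11 src dst`.

```python
# Register numbers as they appear in the ModRM byte of 32-bit x86 code.
REGS = {"eax": 0, "ecx": 1, "edx": 2, "ebx": 3,
        "esp": 4, "ebp": 5, "esi": 6, "edi": 7}


def encode_mov_rr(src: str, dst: str) -> bytes:
    """Encode 'copy src into dst' with opcode 0x89 (mov r/m32, r32)."""
    # mod=11 means both operands are registers; reg=src, rm=dst.
    modrm = (0b11 << 6) | (REGS[src] << 3) | REGS[dst]
    return bytes([0x89, modrm])


def assemble_att(line: str) -> bytes:
    """AT&T flavor: 'mov %src,%dst' (source first, % before registers)."""
    mnemonic, operands = line.split(None, 1)
    if mnemonic not in ("mov", "movl"):
        raise ValueError(f"unsupported mnemonic: {mnemonic}")
    src, dst = (op.strip().lstrip("%") for op in operands.split(","))
    return encode_mov_rr(src, dst)


def assemble_intel(line: str) -> bytes:
    """Intel flavor: 'mov dst, src' (destination first, no prefix)."""
    mnemonic, operands = line.split(None, 1)
    if mnemonic != "mov":
        raise ValueError(f"unsupported mnemonic: {mnemonic}")
    dst, src = (op.strip() for op in operands.split(","))
    return encode_mov_rr(src, dst)


print(assemble_att("mov %eax,%ebx").hex())   # 89c3
print(assemble_intel("mov ebx, eax").hex())  # 89c3
```

Both spellings end up as the bytes 89 C3. A real assembler handles far more forms and may choose an equivalent encoding (mov also has an `8B /r` form), so treat this only as a picture of the idea: the text syntax is a convention of the tool, and the bytes are what the hardware sees.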

A typical (MIPS) assembly instruction such as “And immediate”

andi $t, $s, imm

would become the 32-bit machine code word

0011 00ss ssst tttt iiii iiii iiii iiii

where s and t are numbers from 0–31 which name registers, and i is a 16-bit value. It’s this bit pattern that the CPU actually executes. The 001100 at the beginning is the opcode corresponding to the andi instruction. The layout of the bits that follow (here a 5-bit source register, a 5-bit target register, and a 16-bit literal) varies depending on the instruction. When the CPU fetches this word, it responds by decoding the opcode, selecting the registers to be read and written, and configuring the ALU to perform the necessary arithmetic. The mnemonic itself never reaches the processor; only the assembler needs to know that the text andi stands for the opcode 001100.
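The packing described above can be sketched in a few lines of Python. This is only an illustration under the bit layout given in this answer; the names `encode_andi` and `decode` are hypothetical, and the register numbers 8 and 9 are arbitrary picks for the demo.

```python
ANDI_OPCODE = 0b001100  # the 6-bit opcode from the bit pattern above


def encode_andi(t: int, s: int, imm: int) -> int:
    """Pack 'andi $t, $s, imm' into a 32-bit word: opcode | s | t | imm."""
    if not (0 <= s < 32 and 0 <= t < 32):
        raise ValueError("register numbers must be in 0..31")
    if not (0 <= imm < (1 << 16)):
        raise ValueError("immediate must fit in 16 bits")
    return (ANDI_OPCODE << 26) | (s << 21) | (t << 16) | imm


def decode(word: int) -> dict:
    """Split a 32-bit word back into the fields the decoder looks at."""
    return {
        "opcode": (word >> 26) & 0x3F,  # top 6 bits
        "s": (word >> 21) & 0x1F,       # next 5 bits: source register
        "t": (word >> 16) & 0x1F,       # next 5 bits: target register
        "imm": word & 0xFFFF,           # low 16 bits: literal
    }


word = encode_andi(t=8, s=9, imm=0x00FF)  # andi $8, $9, 0xFF
print(f"{word:#010x}")                    # 0x312800ff
print(decode(word))
```

Renaming `encode_andi` to `dog` would not change a single bit of the output, which is the sense in which the mnemonic is just a label chosen by whoever wrote the assembler. In hardware, the equivalent of `decode` is wiring: the opcode bits feed logic that selects which parts of the datapath are active.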

answered Oct 25 '22 11:10

Josh Lee
