Difference between: Opcode, byte code, mnemonics, machine code and assembly

Tags:

assembly

I am quite new to this. I tried to get a clear understanding of the difference between the terms in the title, but I am still confused. Here is what I have found:

In computer assembler (or assembly) language, a mnemonic is an abbreviation for an operation. It's entered in the operation code field of each assembler program instruction. For example, AND AC,37 means AND the AC register with 37. So AND, SUB and MUL are mnemonics. They get translated by the assembler.
Instructions (statements) in assembly language are generally very simple, unlike those in high-level programming languages. Generally, a mnemonic is a symbolic name for a single executable machine language instruction (an opcode), and there is at least one opcode mnemonic defined for each machine language instruction. Each instruction typically consists of an operation or opcode, plus zero or more operands.


asked Jul 14 '13 02:07

2 Answers

OPCODE: It is a number interpreted by your machine (virtual or silicon) that represents the operation to perform.

BYTECODE: Similar to machine code, except it's mostly executed by a software-based interpreter or virtual machine (like the JVM or the CLR).

MNEMONIC: The English word "mnemonic" means "a device such as a pattern of letters, ideas, or associations that assists in remembering something". So it's usually used by assembly language programmers to remember the "OPERATIONS" a machine can do, like "ADD", "MUL" and "MOV". This is assembler specific.

MACHINE CODE: It is the sequence of numbers that flip the switches in the computer on and off to perform a certain job of work, such as addition of numbers, branching, multiplication, etc. This is purely machine specific and well documented by the implementers of the processor.

Assembly: There are two "assemblies" - one assembly program is a sequence of mnemonics and operands that are fed to an "assembler" which "assembles" the mnemonics and operands into executable machine code. Optionally a "linker" links the assemblies and produces an executable file.

The second "assembly", in CLR-based (.NET) languages, is a sequence of CLR code infused with metadata information: sort of a library of executable code, but not directly executable.


answered Sep 24 '22 19:09

Aniket Inge

Aniket did a good job, but I'll have a go too.

First, understand that at the lowest level, computer programs and all data are just numbers (sometimes called words) in memory of some kind. Most commonly these words are multiples of 8 bits (1's and 0's), such as 32 or 64 bits, but not necessarily, and in some processors each word is considerably larger. Regardless, it's just numbers represented as a series of 1's and 0's, or on's and off's if you like. What the numbers mean is up to whatever or whoever is reading them. In the processor's case, it reads memory one word at a time and, based on the number (instruction) it sees, takes some action. Such actions might include reading a value from memory, writing a value to memory, modifying a value it had read, or jumping to somewhere else in memory to read instructions from.

In the very early days a programmer would literally flick switches on and off to make changes to memory, with lights on or off to read out the 1's and 0's, as there were no keyboards, screens and so on. As time progressed, memory got larger, processors became more complex, display devices and keyboards for input were conceived, and with that, easier ways to program.

Paraphrasing Aniket:

The OPCODE is part of an instruction word that is interpreted by the processor as representing the operation to perform, such as read, write, jump, add. Many instructions will also have OPERANDS that affect how the instruction performs, such as saying from where in memory to read or write, or where to jump to. So if instructions are 32 bits in size for example, a processor may use 8 bits for the opcode, and 12 bits for each of two operands.
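
To make the 8 + 12 + 12 split concrete, here is a minimal Python sketch of packing and unpacking such a 32-bit instruction word. The opcode numbers and the names `encode` and `decode` are made up for illustration; they don't belong to any real processor.

```python
# Hypothetical 32-bit instruction layout (an assumption for this sketch):
#   bits 31..24  opcode    (8 bits)
#   bits 23..12  operand A (12 bits)
#   bits 11..0   operand B (12 bits)
OPERAND_BITS = 12
OPERAND_MASK = (1 << OPERAND_BITS) - 1  # 0xFFF

# Invented opcode numbers, not taken from any real machine.
OPCODES = {"LOAD": 0x01, "STORE": 0x02, "ADD": 0x03, "JUMP": 0x04}


def encode(mnemonic, operand_a=0, operand_b=0):
    """Pack an opcode and two 12-bit operands into one 32-bit word."""
    for operand in (operand_a, operand_b):
        if not 0 <= operand <= OPERAND_MASK:
            raise ValueError("operand does not fit in 12 bits")
    return (OPCODES[mnemonic] << 24) | (operand_a << OPERAND_BITS) | operand_b


def decode(word):
    """Split a 32-bit word back into (opcode, operand_a, operand_b)."""
    opcode = (word >> 24) & 0xFF
    operand_a = (word >> OPERAND_BITS) & OPERAND_MASK
    operand_b = word & OPERAND_MASK
    return opcode, operand_a, operand_b


word = encode("ADD", 0x010, 0x020)
print(f"{word:08X}")   # 03010020
print(decode(word))    # (3, 16, 32)
```

The processor only ever sees the number 0x03010020; the split into opcode and operands is purely a matter of which bits it looks at.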

A step up from toggling switches, code might be entered into a machine using a program called a "monitor". The programmer would use simple commands to say what memory they want to modify, and enter MACHINE CODE numerically, e.g. in base 16 (hex) using 0 to 9 and A to F for digits.
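
As a rough illustration of what "machine code in hex" looks like, the Python sketch below holds six bytes that, on 32-bit or 64-bit x86, encode `mov eax, 42` followed by `ret`. The variable names and the dump format are my own choices; Python only displays the bytes here, it doesn't execute them.

```python
# Six bytes of x86 machine code, written the way you might type them
# into a monitor: B8 is "mov eax, imm32", C3 is "ret".
code = bytes.fromhex("B8 2A 00 00 00 C3")

# Show the bytes roughly the way a monitor would list memory.
dump = " ".join(f"{byte:02X}" for byte in code)
print(f"0000: {dump}")          # 0000: B8 2A 00 00 00 C3

# The four bytes after the B8 opcode are the operand: a 32-bit
# little-endian immediate value.
immediate = int.from_bytes(code[1:5], "little")
print(immediate)                # 42
```

An assembler lets you write `mov eax, 42` instead of remembering that B8 is the opcode and that the operand has to be entered low byte first.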

Though better than toggling switches, entering machine code is still slow and error prone. A step up from that is ASSEMBLY CODE, which uses more easily remembered MNEMONICS in place of the actual number that represents an instruction. The job of the ASSEMBLER is primarily to transform the mnemonic form of the program to the corresponding machine code. This makes programming easier, particularly for jump instructions, where part of the instruction is a memory address to jump to or a number of words to skip. Programming in machine code requires painstaking calculations to formulate the correct instruction, and if some code is added or removed, jump instructions may need to be recalculated. The assembler handles this for the programmer.
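
Here is a toy two-pass assembler in Python showing how jump targets get worked out for you. The instruction set, the 16-bit word layout (8-bit opcode, 8-bit operand) and the name `assemble` are all invented for this sketch; real assemblers do much more.

```python
# Invented opcode numbers for a made-up machine.
OPCODES = {"LOAD": 0x01, "ADD": 0x02, "JUMP": 0x03, "HALT": 0xFF}


def assemble(source):
    """Translate mnemonics into 16-bit words: opcode << 8 | operand."""
    # Pass 1: strip comments, and note the address each label refers to.
    labels = {}
    instructions = []
    for line in source.splitlines():
        line = line.split(";")[0].strip()
        if not line:
            continue
        if line.endswith(":"):
            labels[line[:-1]] = len(instructions)  # address of next instruction
            continue
        instructions.append(line.split())

    # Pass 2: emit one word per instruction, replacing labels by addresses.
    words = []
    for parts in instructions:
        mnemonic = parts[0]
        operand = parts[1] if len(parts) > 1 else "0"
        value = labels[operand] if operand in labels else int(operand, 0)
        words.append((OPCODES[mnemonic] << 8) | (value & 0xFF))
    return words


program = """
start:
    LOAD 5
    ADD 1
    JUMP end
    ADD 2      ; skipped by the jump
end:
    HALT
"""

print([f"{word:04X}" for word in assemble(program)])
# ['0105', '0201', '0304', '0202', 'FF00']
```

If you add another instruction before `end:`, the assembler recomputes the address in the JUMP word on the next run. In raw machine code you would have to find and patch that 04 by hand.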

This leaves BYTECODE, which is fundamentally the same as machine code, in that it describes low level operations such as reading and writing memory, and basic calculations. Bytecode is typically produced by COMPILING a higher level language, for example PHP or Java, and unlike machine code for many hardware based processors, may have operations to support specific features of the higher level language. A key difference is that the processor of bytecode is usually a program, though processors have been created for interpreting some bytecode specifications, e.g. a processor called SOAR (Smalltalk On A RISC) for Smalltalk bytecode. While you wouldn't typically call native machine code bytecode, for some types of processors such as CISC and EISC (e.g. Linn Rekursiv, from the people who made record players), the processor itself contains a program (microcode) that is interpreting the machine instructions, so there are parallels.
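
To show what "the processor of bytecode is usually a program" means, here is a minimal stack-based bytecode interpreter in Python. The five opcodes and the `run` function are hypothetical and far simpler than JVM or CLR bytecode.

```python
# Invented one-byte opcodes for a tiny stack machine.
PUSH, ADD, MUL, PRINT, HALT = range(5)


def run(bytecode):
    """Interpret the bytecode and return the list of printed values."""
    stack = []
    output = []
    pc = 0  # program counter: index of the next byte to read
    while True:
        op = bytecode[pc]
        pc += 1
        if op == PUSH:
            stack.append(bytecode[pc])  # the operand is the following byte
            pc += 1
        elif op == ADD:
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == MUL:
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        elif op == PRINT:
            output.append(stack.pop())
        elif op == HALT:
            return output
        else:
            raise ValueError(f"unknown opcode {op}")


# Bytecode for (2 + 3) * 4, as a compiler for a high level language
# might emit it.
program = bytes([PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL, PRINT, HALT])
print(run(program))  # [20]
```

The loop in `run` does in software what a hardware processor does in silicon: fetch a number, decide what it means, act on it, and move on to the next one.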

answered Sep 24 '22 19:09

Nick
