
Can a “PUSH” instruction's operation be performed using other instructions?

It currently seems to me that the only reason we have instructions like “PUSH” is to replace multiple MOV and arithmetic instructions with a single instruction.

Is there anything “PUSH” does that cannot be accomplished by more primitive instructions?

Is “PUSH” just a single mnemonic that assembles into multiple machine code instructions?

Tyler asked Mar 03 '23 17:03


1 Answer

PUSH is a real machine instruction (https://www.felixcloutier.com/x86/push), not just an assembler macro / pseudo-instruction. For example, push rax has a single-byte encoding: 0x50.

But yes, you can emulate it using other instructions, like sub rsp, 8 followed by a mov store. (Having one instruction that does the work of several is normal for a CISC machine like x86!) For example, see What is the function of the push / pop instructions used on registers in x86 assembly?

PUSH doesn't modify FLAGS, but ADD/SUB do. To emulate it exactly (without modifying flags), use LEA instead of ADD/SUB.

  lea   rsp, [rsp-8]          ; rsp -= 8 without writing FLAGS
  mov   qword [rsp], 123      ; together: push 123 in 64-bit mode
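To make the flag difference concrete, here is a rough sketch in Python. It is a toy model, not an emulator: the `ToyCPU` class, its method names, the starting RSP value, and the choice to track only CF and ZF are all assumptions made for illustration.

```python
# Toy model of the x86-64 stack (hypothetical, for illustration only):
# memory is a dict of 8-byte slots keyed by address, and only CF/ZF are tracked.
MASK64 = (1 << 64) - 1

class ToyCPU:
    def __init__(self, rsp=0x7FFF_0000):
        self.rsp = rsp
        self.mem = {}                    # address -> 64-bit value
        self.flags = {"CF": 1, "ZF": 1}  # pretend an earlier instruction set these

    def push(self, value):
        # Real PUSH: decrement RSP, store, leave FLAGS untouched.
        self.rsp = (self.rsp - 8) & MASK64
        self.mem[self.rsp] = value & MASK64

    def sub_rsp(self, imm):
        # SUB writes FLAGS (only CF and ZF are modeled here).
        self.flags["CF"] = int(self.rsp < imm)  # unsigned borrow
        self.rsp = (self.rsp - imm) & MASK64
        self.flags["ZF"] = int(self.rsp == 0)

    def lea_rsp(self, disp):
        # LEA only computes an address; FLAGS are not written.
        self.rsp = (self.rsp + disp) & MASK64

    def mov_store(self, value):
        # Models: mov qword [rsp], value
        self.mem[self.rsp] = value & MASK64

def state(cpu):
    return (cpu.rsp, cpu.mem, cpu.flags)

real = ToyCPU()
real.push(123)

with_sub = ToyCPU()
with_sub.sub_rsp(8)
with_sub.mov_store(123)

with_lea = ToyCPU()
with_lea.lea_rsp(-8)
with_lea.mov_store(123)

print(state(with_lea) == state(real))  # LEA + MOV: same RSP, memory, and flags
print(with_sub.flags == real.flags)    # SUB + MOV: flags differ from real PUSH
```

In this model the LEA version ends in exactly the same state as `push`, while the SUB version matches in RSP and memory but has overwritten the flags that were live before it.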

See also What is an assembly-level representation of pushl/popl %esp? for equivalent instructions that match the behaviour even for push rsp / pop rsp, push/pop [rsp+16], as well as for any other operand (immediate, reg, or mem).


Is there anything “PUSH” does that cannot be accomplished by more primitive instructions?

Nothing significant beyond efficiency and code-size.

Single instructions are atomic with respect to interrupts: they either happen or they don't. This is normally totally irrelevant; asynchronous interrupts don't usually look at the stack / register contents of the code that got interrupted.

PUSH can get the job done in a single byte of machine code for pushing a single register, or 2 bytes for a small immediate. A multi-instruction sequence is much larger. The architect of 8086's ISA was very focused on making small code-size possible, so yes, it's totally normal to have an instruction that replaces a couple of longer instructions with one short one. For example, x86 has NOT instead of having to use xor reg, -1, and INC instead of add reg, 1. (Although again, those both have different FLAGS semantics from the instructions they replace, with NOT leaving flags untouched and INC/DEC leaving CF untouched.) Not to mention all of x86's other special-case encodings, like 1-byte encodings for xchg-with-[e/r]ax. See https://codegolf.stackexchange.com/questions/132981/tips-for-golfing-in-x86-x64-machine-code
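As a rough illustration of the size difference, the sketch below tallies byte counts for a few sequences. The hex encodings are hand-assembled for 64-bit mode and should be treated as assumptions worth checking against an assembler's output; the `ENCODINGS` table and `code_size` helper are names invented for this example.

```python
# Hand-assembled x86-64 encodings (hex), one string per instruction.
# These are assumptions for illustration; verify with an assembler listing.
ENCODINGS = {
    "push rax": ["50"],
    "push 123": ["6A 7B"],
    "sub rsp, 8 / mov [rsp], rax": ["48 83 EC 08", "48 89 04 24"],
    "lea rsp, [rsp-8] / mov [rsp], rax": ["48 8D 64 24 F8", "48 89 04 24"],
    "lea rsp, [rsp-8] / mov qword [rsp], 123": [
        "48 8D 64 24 F8",
        "48 C7 04 24 7B 00 00 00",
    ],
}

def code_size(instructions):
    # Total machine-code bytes for a sequence of hex-encoded instructions.
    return sum(len(bytes.fromhex(insn)) for insn in instructions)

sizes = {name: code_size(insns) for name, insns in ENCODINGS.items()}
for name, size in sizes.items():
    print(f"{size:2d} bytes  {name}")
```

With these encodings, the flag-preserving emulation of push 123 takes 13 bytes against 2 for the real instruction, and emulating push rax takes 8 or 9 bytes against 1.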

Also efficiency: PUSH decodes to a single uop (in the fused domain) on Pentium-M and later CPUs, thanks to the stack engine that handles implicit uses of the stack pointer by instructions like push/pop and call/ret. Two separate instructions of course decode to at least 2 uops (except in the special case of macro-fusion of test/cmp + JCC).

On ancient P5 Pentium, emulating push with separate ALU and mov instructions was actually a win: CPUs before PPro didn't know how to break down complex CISC instructions into separate uops, and complex instructions couldn't pair in P5's dual-issue in-order pipeline. (See Agner Fog's microarch guide.) The main benefit here was being able to mix in other instructions that could pair, and to do only one big sub followed by just the mov stores, instead of multiple changes to the stack pointer.

This also applies to early P6-family before the stack engine. GCC with -march=pentium3 for example will tend to avoid push and just do one bigger adjustment to ESP.
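The "one bigger adjustment" strategy can be sketched with another toy Python model. Everything here is hypothetical: the function names, the 8-byte slot size (32-bit code adjusting ESP would use 4-byte slots), and the starting address are assumptions chosen only to show that both strategies produce the same stack layout.

```python
# Toy model (illustrative assumptions): the stack is a dict of 8-byte slots.
def with_pushes(start, values):
    # One stack-pointer update per value, as a run of push instructions would do.
    rsp, mem, rsp_updates = start, {}, 0
    for v in values:
        rsp -= 8
        rsp_updates += 1
        mem[rsp] = v
    return rsp, mem, rsp_updates

def with_one_sub(start, values):
    # A single "sub rsp, N", then plain mov stores at fixed offsets.
    rsp = start - 8 * len(values)
    mem = {}
    for i, v in enumerate(values):
        # The first value "pushed" ends up at the highest address.
        mem[rsp + 8 * (len(values) - 1 - i)] = v
    return rsp, mem, 1

pushed = with_pushes(0x7FFF_0000, [10, 20, 30])
adjusted = with_one_sub(0x7FFF_0000, [10, 20, 30])

print(pushed[:2] == adjusted[:2])  # same final RSP and memory contents
print(pushed[2], adjusted[2])      # number of stack-pointer updates
```

The stores in the second version do not depend on each other through the stack pointer, which is what left room to pair them with other instructions on those older CPUs.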

Peter Cordes answered Apr 09 '23 01:04