Last modified: Thu Jun 18 19:42:01 UTC+0200 2026 © A. Tarpai
All about PUSH and POP
8086 PUSH
Remember PUSH:
- decrement SP by 2
- store the WORD operand at the location SS:SP points to
SP always points to the last item pushed onto the stack.
POP reads the value SP points to, then increments SP by 2, thus restoring the stack.
15             0                  15             0                  15             0
+--------------+ FFFF             +--------------+ FFFF             +--------------+ FFFF
|   X X X X    |                  |   X X X X    |                  |   X X X X    |
+--------------+                  +--------------+                  +--------------+
|   X X X X    |                  |   X X X X    |                  |   X X X X    |
+--------------+                  +--------------+                  +--------------+
|   X X X X    | <-- SP = 1234h   |   X X X X    |                  |   X X X X    | <-- SP = 1234h
+--------------+                  +--------------+                  +--------------+
|              |                  |    VALUE     | <-- SP = 1232h   |    VALUE     |
+--------------+                  +--------------+                  +--------------+
|              |                  |              |                  |              |
+--------------+                  +--------------+                  +--------------+
|              |                  |              |                  |              |
+--------------+ 0                +--------------+ 0                +--------------+ 0
  stack before        "PUSH"        stack after          "POP"       stack restored
                        ^                                  |
                        |                                  | MOV from [SS:SP]
                        | MOV to [SS:SP]                   |
                        |                                  v
                 +--------------+                   +--------------+
                 |    VALUE     |                   |    VALUE     |
                 +--------------+                   +--------------+
               PUSH writes to stack               POP reads from stack
Stack operations move data through SS:SP. A segment override is NOT possible (tested on real hw).
- A WORD operand is always implied: pushing 8 bits is invalid, e.g. PUSH AL or PUSH BYTE [DI]
- SP decrements/increments by 2 only
- SP wraps around to FFFEh after 0000h (push) or to 0000h from FFFEh (pop)
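The rules above can be sketched as a toy model. This is only an illustration in Python, not a description of the hardware: the class name Stack8086 and its methods are made up for this note, and the stack segment is assumed to be a plain 64K byte array.

```python
class Stack8086:
    """Toy model of the 8086 stack: one 64K segment addressed through SS:SP."""

    def __init__(self, sp=0x1234):
        self.mem = bytearray(0x10000)   # the 64K stack segment, offsets 0000h-FFFFh
        self.sp = sp & 0xFFFF

    def push(self, word):
        # 1. decrement SP by 2 (16-bit arithmetic, so 0000h wraps to FFFEh)
        self.sp = (self.sp - 2) & 0xFFFF
        # 2. store the WORD at SS:SP, low byte first (little-endian)
        self.mem[self.sp] = word & 0xFF
        self.mem[(self.sp + 1) & 0xFFFF] = (word >> 8) & 0xFF

    def pop(self):
        # 1. read the WORD at SS:SP
        word = self.mem[self.sp] | (self.mem[(self.sp + 1) & 0xFFFF] << 8)
        # 2. increment SP by 2 (FFFEh wraps to 0000h)
        self.sp = (self.sp + 2) & 0xFFFF
        return word
```

Pushing with SP = 0000h leaves SP = FFFEh in this model, and the following pop brings it back to 0000h.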
8086 opcodes:
PUSH/POP reg (1-byte op, short form)
PUSH/POP sr (1-byte op, no POP CS)
PUSHF/POPF (1-byte op)
PUSH/POP r/m
The 16-bit 186/286 added the new PUSHA/POPA instructions and the ability to push an immediate value: either a WORD immediate, or a BYTE immediate sign-extended to a WORD:
186/286 opcodes:
PUSHA/POPA (1-byte op)
60 PUSHA
61 POPA
PUSH immediate (imm16 or imm8 sign-extended value)
68/6A 0 1 1 0 1 0 s 0 [DATA] [DATA if s=0]
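To make the role of the s bit concrete, here is a small decoder sketch in Python for the 16-bit form. The function name decode_push_imm16 is hypothetical; it only looks at the opcode byte and the data bytes described above.

```python
def decode_push_imm16(code):
    """Return (pushed_word, instruction_length) for opcode 68h or 6Ah."""
    opcode = code[0]
    if opcode & 0xFD != 0x68:           # 0110 10s0: only the s bit may vary
        raise ValueError("not a PUSH immediate opcode")
    s = (opcode >> 1) & 1
    if s:                               # 6Ah: one data byte, sign-extended
        byte = code[1]
        word = byte | (0xFF00 if byte & 0x80 else 0)
        return word, 2
    # 68h: two data bytes, little-endian
    return code[1] | (code[2] << 8), 3
```

With this sketch, the bytes 6A FF decode to the word FFFFh (length 2) and 68 34 12 to 1234h (length 3).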
386 PUSH operation
ESP, the 386 CPU stack pointer register, is 32 bits wide.
Implicit stack operations (PUSH, POP, CALL, RET, IRET) depend on:
- B-bit of SS descriptor
- OperandSize
Operand Size and the B-bit
The B-bit (StackAddrSize) determines which part of the stack pointer is updated AND which part is used to compute the stack address: the lower half SP or the full ESP. The B-bit exists for 16-bit 8086 emulation and its 64K wrap-around.
The operand size determines the amount by which the stack pointer is decremented: 2 or 4 (and this has nothing to do with the B-bit).
386 PUSH
                         B=1                     B=0
operand-size = 32        dec ESP by 4            dec SP by 4
D=1 or D=0 and 66h       move DWORD [SS:ESP]     move DWORD [SS:SP]
operand-size = 16        dec ESP by 2            dec SP by 2
D=0 or D=1 and 66h       move WORD [SS:ESP]      move WORD [SS:SP]
Operation:
     B=1                          B=0
1.   dec ESP by 2 or 4            dec SP by 2 or 4
     +-------------------+        +---------+---------+
     |        ESP        |        | . . . . |   SP    |   ESP HI
     +-------------------+        +---------+---------+   unchanged
2.   LIMIT check                  LIMIT check
     +-------------------+        +---------+---------+
     |        ESP        |        | 0 0 0 0 |   SP    |   temp
     +-------------------+        +---------+---------+
3.   address calculation          address calculation
     +-------------------+        +---------+---------+
     |        ESP        |        | 0 0 0 0 |   SP    |   temp
     +-------------------+        +---------+---------+
     +-------------------+        +-------------------+
     |      SS:BASE      |        |      SS:BASE      |
   + +-------------------+        +-------------------+
   _____________________________________________________
     +-------------------+        +-------------------+
     | EFFECTIVE ADDRESS |        | EFFECTIVE ADDRESS |
     +-------------------+        +-------------------+
(.) = unchanged and not used for address calculation when B=0
When B=0, regardless of operand-size, ESP HI is unchanged. This is different from EIP, where the upper bits are zeroed if operand-size = 16. ESP HI retains its value.
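The interplay of the B-bit and the operand size can be sketched as a small Python function. The name push386 and its parameters are made up for illustration; the sketch only tracks the stack pointer and the offset used for the memory access, not the memory write or the limit check.

```python
def push386(esp, b_bit, operand_size):
    """Return (new_esp, offset) for one PUSH.

    esp          -- 32-bit stack pointer before the push
    b_bit        -- B-bit of the SS descriptor (1 = use ESP, 0 = use SP)
    operand_size -- 16 or 32
    """
    step = operand_size // 8                     # 2 or 4 bytes
    if b_bit:
        new_esp = (esp - step) & 0xFFFFFFFF      # full 32-bit decrement
        offset = new_esp                         # whole ESP is the offset
    else:
        sp = (esp - step) & 0xFFFF               # only SP changes, 64K wrap
        new_esp = (esp & 0xFFFF0000) | sp        # ESP HI keeps its value
        offset = sp                              # ESP HI is not used
    return new_esp, offset
```

For example, push386(0x100000, 0, 16) gives ESP = 0010_FFFE with offset FFFE, while push386(0x100000, 1, 16) gives ESP = 000F_FFFE.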
386 PUSH/POP with ESP operand
"The 80386 PUSH eSP instruction pushes the value of eSP as it existed
before the instruction. This differs from the 8086, where PUSH SP
pushes the new value (decremented by 2)." (INTEL 80386 PROGRAMMER'S REFERENCE MANUAL 1986)
It is as if the 386 first places the PUSH/POP operand in temporary storage. The more detailed operation of push/pop is then as follows:
     386 PUSH                      386 POP
  writes to stack              reads from stack
     from temp                     into temp
      operand                   ESP limit check
         |                             |
         | reg/mem                 [SS:ESP]
         | read                        |
         v                             | memory
     +--------+                        | read
     |  temp  |                        v
     +--------+                    +--------+
         |                         |  temp  |
    _____|______                   +--------+
   |            |                      |
   | decrement  |                 _____|______
   |    ESP     |                |            |
   |____________|                | increment  |
         |                       |    ESP     |
  ESP limit check                |____________|
         |                             |
     +--------+                    +--------+
     |  temp  |                    |  temp  |
     +--------+                    +--------+
         |                             |
         | memory                      | reg/mem
         | write                       | write
         v                             v
     [SS:ESP]                       operand
This will explain the behaviour of:
- push esp
- pop esp
- push [ESP]
- pop [ESP], or any other memory addressing modes where ESP is part of the operand address.
PUSH ESP
The operand itself is the stack pointer.
386 PUSH operand = ESP
  temp = ESP
  dec ESP
  [SS:ESP] = temp    <-- old ESP written
The original ESP is moved to the stack. This value points to the stack item that was on top before the push:
PUSH ESP
31 0 31 0
| | | |
+--------------+ +--------------+
ESP -> | ... | | ... | <-------+
+--------------+ +--------------+ | ORIGINAL ESP
| ... | ESP -> | "ESP" | -->----+ points here
+--------------+ +--------------+
| | | |
stack before stack after
POP ESP
The operand itself is the stack pointer.
386 POP operand = ESP
  temp = [SS:ESP]
  inc ESP
  ESP = temp         <-- old top of stack value
Therefore, on 386+ these two are equivalent:
  push eax
  pop esp        =        mov esp, eax
POP ESP moves original top of stack value into ESP:
POP ESP
31 0
| |
+--------------+ ESP = value
| ... |
+--------------+
ESP -> | value |
+--------------+
| |
stack before
Notes.
- POP ESP will replace ESP. Any value can be popped (eg. for software stack switch).
- PUSH ESP followed by POP ESP restores the stack as it was.
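The temp-based sequence for PUSH ESP and POP ESP can be replayed with a short Python sketch. Stack386 and push_sp_8086 are hypothetical names; a flat 32-bit stack (B=1) is assumed and memory is just a dict of dwords. The 8086 variant is included only for contrast, following the manual quote above.

```python
class Stack386:
    """Toy flat 32-bit stack (B=1), dwords stored in a dict keyed by address."""

    def __init__(self, esp):
        self.esp = esp
        self.mem = {}

    def push_esp(self):
        temp = self.esp                          # operand is read first
        self.esp = (self.esp - 4) & 0xFFFFFFFF
        self.mem[self.esp] = temp                # old ESP is written

    def pop_esp(self):
        temp = self.mem[self.esp]                # old top of stack is read first
        self.esp = (self.esp + 4) & 0xFFFFFFFF   # the increment happens...
        self.esp = temp                          # ...but temp then replaces ESP


def push_sp_8086(sp, mem):
    """8086 PUSH SP for contrast: the already decremented SP is stored."""
    sp = (sp - 2) & 0xFFFF
    mem[sp] = sp
    return sp
```

Starting from ESP = 1234h, push_esp() stores 1234h at 1230h, and the following pop_esp() sets ESP back to 1234h.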
386 PUSH/POP with ESP as BASE
On the 386, ESP can serve as the base register, encoded in the SIB byte. The stack segment register is implied, but can be overridden by a prefix. Possible addressing modes:
- push/pop [esp]
- push/pop [esp + displacement]
- push/pop [esp + index-register]
- push/pop [esp + index-register + displacement]
The full and detailed operation:
     386 PUSH                     386 POP
   COMPUTE OFFS                     ESP
    OF OPERAND                  limit check
         |                           |
       OFFS                      [SS:ESP]
    limit check                      |
         |                           | memory
     [SR:OFFS]                       | read
         |                           v
         | memory                +--------+
         | read                  |  temp  |
         v                       +--------+
     +--------+                 _____|______
     |  temp  |                |            |
     +--------+                | increment  |
    _____|______               |    ESP     |
   |            |              |____________|
   | decrement  |                    |
   |    ESP     |              COMPUTE OFFS
   |____________|               OF OPERAND
         |                           |
        ESP                        OFFS
    limit check                 limit check
         |                           |
     +--------+                  +--------+
     |  temp  |                  |  temp  |
     +--------+                  +--------+
         |                           |
         | memory                    | memory
         | write                     | write
         v                           v
     [SS:ESP]                    [SR:OFFS]
PUSH [ESP-BASE]
For PUSH [ESP+idx+d], the operand address itself is calculated with the original stack pointer before the operation.
386 PUSH operand
  EA = lea [ESP+idx+d]
  temp = [EA]
  dec ESP
  [SS:ESP] = temp    <-- operand address and operand by old ESP
As an example push [esp+4]:
push [esp+4] which one is pushed?
31 0 31 0
| | | |
+--------------+ +--------------+
| ... | | ... |
+--------------+ +--------------+
| ... | | ... |
+--------------+ +--------------+
+4 | value1 | | value1 |
+--------------+ +--------------+
ESP -> | value2 | | value2 |
+--------------+ +--------------+
| ... | ESP -> | value1 | by original ESP
+--------------+ +--------------+
| | | |
stack before stack after
Confirming test (VC++):
push -1
push -2
push dword ptr [esp+4]
pop eax <-- EAX = -1
pop eax <-- EAX = -2
pop eax <-- EAX = -1
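The same sequence can be replayed on a toy model. This Python sketch assumes a flat 32-bit stack held in a dict; the helper names push, pop and push_mem_esp are invented for this note. It only mirrors the order of steps given above (address and operand by old ESP).

```python
def push(esp, mem, value):
    """Plain PUSH of a dword; returns the new ESP."""
    esp = (esp - 4) & 0xFFFFFFFF
    mem[esp] = value & 0xFFFFFFFF
    return esp


def pop(esp, mem):
    """Plain POP of a dword; returns (new ESP, value)."""
    return (esp + 4) & 0xFFFFFFFF, mem[esp]


def push_mem_esp(esp, mem, disp):
    """PUSH dword [ESP+disp]: address and operand both use the old ESP."""
    ea = (esp + disp) & 0xFFFFFFFF      # EA computed before ESP changes
    temp = mem[ea]
    esp = (esp - 4) & 0xFFFFFFFF
    mem[esp] = temp
    return esp
```

Replaying push -1, push -2, push [esp+4] and then three pops in this model yields FFFFFFFFh, FFFFFFFEh, FFFFFFFFh, i.e. -1, -2, -1 as in the test above.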
POP [ESP-BASE]
For POP [ESP+idx+d], the operand address itself is calculated with the stack pointer after the operation.
386 POP operand
  temp = [SS:ESP]        <-- operand by old ESP
  inc ESP
  EA = lea [ESP+idx+d]
  [EA] = temp            <-- operand address by new ESP
As an example pop [esp+4]:
- which location is pop-ed from stack?
- which location is written?
POP certainly reads the top of stack before incrementing ESP, but is the destination address computed before or after incrementing ESP? The answer is after:
pop [esp+4] where it is written?
31 0 31 0
| | | |
+--------------+ +--------------+
| ... | | ... |
+--------------+ +--------------+
| ... | +4 | value |
+--------------+ +--------------+
| ... | ESP -> | ... |
+--------------+ +--------------+
ESP -> | value | -4 | value |
+--------------+ +--------------+
| ... | | ... |
+--------------+ +--------------+
| | | |
stack before stack after
Confirming test (VC++):
push -1
pop [esp+4]
mov eax, [esp+4] <-- EAX = -1
mov eax, [esp-4] <-- EAX = -1
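A minimal Python sketch of the POP case, under the same assumptions as before (flat 32-bit stack, memory as a dict of dwords, hypothetical function name pop_mem_esp). It follows the order stated above: read by old ESP, address by new ESP.

```python
def pop_mem_esp(esp, mem, disp):
    """POP dword [ESP+disp]: read by old ESP, destination address by new ESP."""
    temp = mem[esp]                     # read the old top of stack
    esp = (esp + 4) & 0xFFFFFFFF        # increment first...
    ea = (esp + disp) & 0xFFFFFFFF      # ...then compute the operand address
    mem[ea] = temp
    return esp
```

With -1 on top of the stack, pop_mem_esp(esp, mem, 4) leaves -1 both at [esp+4] (the destination) and at [esp-4] (the stale slot the value was popped from).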
386 PUSH immediate
Same old opcodes, but the 386 pushes an operand-size value: either the operand-size immediate that follows the opcode in the instruction stream (s=0), or a byte immediate sign-extended to operand size (s=1). Operand size is determined by the D-bit and the 66h prefix.
NASM examples:
Operation           | [BITS 32]                         | [BITS 16]
                    |                                   |
PUSH DWORD          | 68 34120000    push 0x1234        | 66 68 34120000    PUSH DWORD 0x1234
PUSH BYTE-to-DWORD  | 6A FF          push -1            | 66 6A FF          PUSH DWORD -1
PUSH WORD           | 66 68 3412     push word 0x1234   | 68 3412           PUSH 0x1234
PUSH BYTE-to-WORD   | 66 6A FF       push word -1       | 6A FF             PUSH -1
Note that it is not possible to push sign-extended WORD-TO-DWORD.
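The encodings in the table can be reproduced with a small encoder sketch. This is Python for illustration only; encode_push_imm is a made-up helper, not an assembler API, and it assumes the short 6A form is chosen whenever the value fits in a signed byte.

```python
def encode_push_imm(value, operand_size, default_size):
    """Encode PUSH imm for the 386.

    operand_size -- 16 or 32, the size of the value to push
    default_size -- 16 or 32, the code segment default (D-bit / BITS setting)
    """
    code = bytearray()
    if operand_size != default_size:
        code.append(0x66)                          # operand-size override prefix
    if -128 <= value <= 127:
        code += bytes([0x6A, value & 0xFF])        # s=1: sign-extended byte
    else:
        nbytes = operand_size // 8
        mask = (1 << operand_size) - 1
        code.append(0x68)                          # s=0: full-size immediate
        code += (value & mask).to_bytes(nbytes, "little")
    return bytes(code)
```

For instance, encode_push_imm(0x1234, 32, 16).hex() gives 666834120000, the [BITS 16] PUSH DWORD row. There is no branch for a sign-extended WORD because no such encoding exists.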
386 PUSHF
One opcode 9C.
FLAGS (16-bit) or EFLAGS (32-bit) pushed based on Operand size. Assembler uses two mnemonics: PUSHF and PUSHFD.
Possible to push 32-bit EFLAGS in real mode: PUSHFD → 66 9C
386 PUSH SR
Segment registers are 16-bit, but the 386 pushes operand-size values. How does PUSH SR work when operand-size = 32?
Test. We push DWORD -1 first, pop it, then push DS = 0010h and observe the memory content:
push -1
pop eax
PUSH ds
pop eax
Here EAX = FFFF0010: my CPU (Core2 Duo) decremented ESP by 4, then performed a WORD memory move, leaving the upper half of the stack slot unchanged. On another CPU EAX = 00000010, i.e. the 16-bit SR is zero-extended to 32 bits, then pushed as a DWORD.
There are four possibilities:
PUSH SR
operand-size = 32
D=1 or D=0 and 66h
31 0 31 0 31 0 31 0
| | | | | | | |
+-----------------+ +-----------------+ +-----------------+ +-----------------+
| . . . . . . . . | <-- ESP | . . . . . . . . | <-- ESP | . . . . . . . . | <-- ESP | . . . . . . . . | <-- ESP
+-----------------+ +-----------------+ +-----------------+ +-----------------+
| . . . . . . . . | | . . . . . . . . | | . . . . . . . . | | . . . . . . . . |
+-----------------+ +-----------------+ +-----------------+ +-----------------+
| | | | | | | |
| | | |
| | | |
v v v v
push SR push SR push SR push SR
31 0 31 0 31 0 31 0
| | | | | | | |
+-----------------+ +-----------------+ +-----------------+ +-----------------+
| . . . . . . . . | | . . . . . . . . | | . . . . . . . . | | . . . . . . . . |
+--------+--------+ +--------+--------+ +--------+--------+ +--------+--------+
| . . . .| SR | <-- ESP | 0 0 0 0| SR | <-- ESP | s s s s| SR | <-- ESP | x x x x| SR | <-- ESP
+--------+--------+ +--------+--------+ +--------+--------+ +--------+--------+
| | | | | | | |
(.) unchanged (0) zero-extended (s) sign-extended (x) undefined
Eg. Core2Duo Eg. AMD Ryzen Eg. ??? Eg. Pentium?
WORD move DWORD move zero padded DWORD move DWORD move
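The three well-defined variants from the diagram can be written down as a tiny Python sketch. The function name push_sr_dword and the behaviour labels are invented here; the "undefined" case is left out because there is nothing to model.

```python
def push_sr_dword(old_dword, sr, behaviour):
    """Return the dword found at [ESP] after a 32-bit PUSH of a segment register.

    old_dword -- previous memory content at the new top of stack
    sr        -- 16-bit segment register value
    behaviour -- "word_move", "zero_extend" or "sign_extend"
    """
    sr &= 0xFFFF
    if behaviour == "word_move":        # upper half of the slot is untouched
        return (old_dword & 0xFFFF0000) | sr
    if behaviour == "zero_extend":
        return sr
    if behaviour == "sign_extend":
        return sr | (0xFFFF0000 if sr & 0x8000 else 0)
    raise ValueError(behaviour)
```

With the slot pre-filled with FFFFFFFF and DS = 0010h, "word_move" gives FFFF0010 and "zero_extend" gives 00000010, the two results observed in the test above.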
386 PUSHA/POPA
16- or 32-bit registers are pushed/popped (based on operand-size attribute).
For ESP:
- PUSHA stores ESP value before PUSHA. Same mechanism as PUSH ESP.
- POPA discards ESP, i.e. the (ESP) slot on the stack is freely available, e.g. for an extra parameter
After pusha, ESP points to EDI on stack (last push). Other registers can be found at these offsets [ESP + d]:
32-bit stack view 16-bit stack view 32-bit stack view
DWORD registers WORD registers WORD registers
8 x 4 = 32 bytes 8 x 2 = 16 bytes 4 x 4 = 16 bytes
31 0 15 0 31 0
| ... | +32 <----+ | ... | +16 | ... | +16
+-------------+ | +--------+ +------+------+
| EAX | +28 | | AX | +14 | AX | CX | +12
+-------------+ | +--------+ +------+------+
| ECX | +24 | ORIGINAL | CX | +12 | DX | BX | +8
+-------------+ | ESP +--------+ +------+------+
| EDX | +20 | | DX | +10 | (SP) | BP | +4
+-------------+ | +--------+ +------+------+
| EBX | +16 | | BX | +8 | SI | DI | <-- SP
+-------------+ | +--------+ +------+------+
| (ESP) | +12 -->--+ | (SP) | +6 | |
+-------------+ +--------+
| EBP | +8 | BP | +4
+-------------+ +--------+
| ESI | +4 | SI | +2
+-------------+ +--------+
| EDI | <-- ESP | DI | <-- SP
+-------------+ +--------+
| | | | | |
| |
v v
PUSHA PUSHA
Note. PUSHA/POPA is invalid in 64-bit mode. It's a shame.
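The offsets in the first two stack views follow directly from the push order. A short Python sketch (pusha_offsets is a hypothetical helper) derives them:

```python
def pusha_offsets(operand_size):
    """Map register name -> offset from (E)SP right after PUSHA."""
    size = operand_size // 8
    names = ["AX", "CX", "DX", "BX", "SP", "BP", "SI", "DI"]   # push order
    if operand_size == 32:
        names = ["E" + n for n in names]
    # The first register pushed ends up highest; the last one is at offset 0.
    return {name: (len(names) - 1 - i) * size for i, name in enumerate(names)}
```

pusha_offsets(32)["EAX"] is 28 and pusha_offsets(16)["SP"] is 6, matching the diagram.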
Some 16/32-bit live tests
Override B-bit by address-size prefix 67h?
The setting of the B-bit cannot be overridden by address-size prefix 67h – at least on my hardware. Tested:
[BITS 16]
66 BC 00001000    MOV ESP, 1*1024*1024    ; 1MB: 0010_0000
; TEST PUSH
67 50             a32 PUSH AX             ; ESP = 0010_FFFE <-- still only SP changes!
BASE + SP vs. BASE + ESP address calculation
Test for BASE + SP vs. BASE + ESP address calculation:
- after switching to 32-bit code but keeping B=0 in the SS descriptor
- set ESP high up
- push something
- check memory location for pushed data
Here D=1 but B=0 (only SP changes):
MOV ESP, 0x800000 <-- set to 8MB
push -2 <-- ESP = 80FFFC: DWORD move but only SP has changed
push moved to 0080FFFC or 0000FFFC?
MOV eax, [0x80fffc] <-- read 0080FFFC: EAX is NOT -2!
MOV eax, [0xfffc] <-- read 0000FFFC: EAX is -2. CORRECT: moved to 0000FFFC (true on Hy and Core2 Duo)
Limit check when B=0
Limit check:
Q: "before or after zeroing HI?"
A: It should be after, otherwise any non-zero ESP HI would cause exception 12.
Test for limit check by setting ESP HI to non-zero (Limit=FFFF). No exception 12:
[BITS 16]
66 BC 00FF1000    MOV ESP, 0x10FF00    ; <-- ESP= 0010_FF00 (ESP HI set)
66 58             POP EAX              ; <-- ESP= 0010_FF04
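Why the order matters can be shown with a one-line check. This Python sketch assumes an ordinary expand-up stack segment and a made-up helper name stack_access_ok; it is a simplification, not the full descriptor logic.

```python
def stack_access_ok(esp, b_bit, limit, nbytes):
    """Limit check for an expand-up stack segment (a simplifying assumption)."""
    offset = esp if b_bit else esp & 0xFFFF    # B=0: ESP HI is masked first
    return offset + nbytes - 1 <= limit
```

With ESP = 0010_FF00 and Limit = FFFF, the check passes for B=0 because only FF00 is compared. Checking the full ESP instead would fail, which is the exception 12 the test above did not see.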
16-bit tests
Indeed only SP changes. Tested in Real Mode after Reset. B=0.
[BITS 16]
66 BC 00001000 MOV ESP, 0x100000 <-- 1M
9C PUSHF <-- WORD push. ESP = 0010_FFFE
66 BC 00001000 MOV ESP, 0x100000 <-- 1M
66 9C PUSHFD <-- DWORD push. ESP = 0010_FFFC
When B=1 in SS descriptor the same code operates on full ESP:
[BITS 16]
66 BC 00001000 MOV ESP, 0x100000 <-- 1M
9C PUSHF <-- WORD push. ESP = 000F_FFFE
66 BC 00001000 MOV ESP, 0x100000 <-- 1M
66 9C PUSHFD <-- DWORD push. ESP = 000F_FFFC
We can also PUSH DWORD in Real Mode (SP = SP - 4, move DWORD [SS:SP])
[BITS 16]
66 68 78563412 push dword 0x12345678 <--- imm32
66 50 push eax <--- reg32
66 6A FF push dword -1 <--- sign-extended byte to imm32
Handy for saving a few code bytes, e.g. to zero out ES and DS:
[BITS 16]
66 6A 00    push dword 0
07          pop es
1F          pop ds
32-bit Tests
We can also PUSH WORD in 32-bit mode (D=1). This decrements ESP by 2 only, so watch out when coding PUSH AX in 32-bit mode. The value can be safely popped again with a 66h-prefixed POP, though.
Tested in 32-bit mode (VC++):
[BITS 32]
66 50 push ax ; ESP = 0x0019F978
; ESP = 0x0019F976 <-- 32-bit misaligned
Note that the CPU itself has no problem operating on a misaligned stack, even one aligned only to a byte boundary.
386 PUSH operand            8086 PUSH operand
  temp <- operand             dec SP
  dec ESP                     [SS:SP] <- operand
  [SS:ESP] <- temp
386 POP operand             8086 POP operand
  temp = [ESP]                operand <- [SS:SP]
  inc ESP                     inc SP
  operand = temp
This gives different behaviour on 8086 and newer processors when it comes to SP itself.
              before ESP        after ESP
              increment         increment
location      temp = [ESP]
pop-ed
location                        operand = temp
written
8086                              8086
____________________________________________________
PUSH operand                      POP operand
____________________________________________________
dec SP                            operand = [SP]
[SP] = operand                    inc SP
386                               386
____________________________________________________
PUSH operand                      POP operand
____________________________________________________
temp = operand                    temp = [ESP]
dec ESP                           inc ESP
[ESP] = temp                      operand = temp
POP ESP
"increments the stack pointer (ESP) before data at the old top of stack is written into the destination"
PUSH ESP
memory memory memory
address address address
31 0 31 0 31 0
| | 00001238 | | 00001238 | | 00001238
+------------+ +------------+ +------------+
ESP -> | . . . | 00001234 | . . . | 00001234 | . . . | 00001234
+------------+ +------------+ +------------+
| | 00001230 ESP -> | | 00001230 ESP -> | 00001234 | 00001230
+------------+ +------------+ +------------+
| | 0000122C | | 0000122C | | 0000122C
temp = ESP decrement ESP [ESP] = temp
1. save ESP into temp
2. decrement ESP
3. move temp value where ESP points to
POP ESP
memory memory memory
address address address
31 0 31 0 31 0
| | 00001238 | | 00001238 | | 00001238
+------------+ +------------+ +------------+
| . . . | 00001234 ESP -> | . . . | 00001234 ESP*-> | . . . | 00001234
+------------+ +------------+ +------------+
ESP -> | 00001234 | 00001230 | 00001234 | 00001230 | 00001234 | 00001230
+------------+ +------------+ +------------+
| | 0000122C | | 0000122C | | 0000122C
temp = [ESP] increment ESP ESP = temp*
* Demonstrates that the PUSH ESP above followed by POP ESP restores the stack as it was.
But any value can be popped into ESP to make it point anywhere (software stack switch).
i386.pdf
Real Address Mode Exceptions
Before executing PUSHA or PUSHAD, the 80386 shuts down if SP or
ESP equals 1, 3, or 5; if SP or ESP equals 7, 9, 11, 13, or 15, exception
13 occurs
Hy AMD test:
shuts down: 1,3,5,7
exc12(!): 9,11,13,15
Rest is fine.
"Exceptions do not return error codes in real-address mode." IA-32