HALICERY

free-time coding, hardware dev, articles


Last modified: Thu Jun 18 14:38:53 UTC+0200 2026 © A. Tarpai


SHIFT and ROTATE

8086 SHIFT and ROTATE

The 8086 can shift or rotate a BYTE or WORD register/memory operand by 1 or by the count in CL. A single opcode covers all cases:

1 1 0 1 0 0 C W    MOD TTT R/M       <-- Shift/Rotate register or memory

TTT  Instruction     W=0: rotate byte
000  ROL             W=1: rotate word
001  ROR
010  RCL             C=0: Shift/rotate count is one
011  RCR             C=1: Shift/rotate count is specified in CL register
100  SHL/SAL
101  SHR
110  -
111  SAR
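
The table above can be read as a small decoder. A minimal sketch in Python, assuming the opcode byte and the MOD TTT R/M byte are already fetched; `decode_shift` and `MNEMONICS` are names made up for this illustration:

```python
# Hypothetical helper: decode the 8086 shift/rotate opcode 1101 00CW + MOD TTT R/M.
MNEMONICS = ["ROL", "ROR", "RCL", "RCR", "SHL/SAL", "SHR", None, "SAR"]

def decode_shift(opcode, modrm):
    """Return (mnemonic, operand width in bits, count source) or None."""
    if opcode & 0xFC != 0xD0:           # top six bits must be 110100
        return None
    w = opcode & 1                      # W: 0 = byte, 1 = word
    c = (opcode >> 1) & 1               # C: 0 = count is one, 1 = count in CL
    ttt = (modrm >> 3) & 7              # TTT selects the operation
    name = MNEMONICS[ttt]
    if name is None:                    # TTT=110 is listed as "-" in the table
        return None
    return name, 16 if w else 8, "CL" if c else "1"
```

For example, `decode_shift(0xD1, 0xC8)` reads as a word ROR by one.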

Operation:

Logical- and Arithmetic shift:


     +---+-----------------+      +----+      +-----------------+---+
     | 0 -->   SHR         | ---> | CF | <--- |        SHL    <-- 0 |
     +---+-----------------+      +----+      +-----------------+---+
          logical shift

     +---+-----------------+      +----+      +-----------------+---+
     | s -->   SAR         | ---> | CF | <--- |        SAL    <-- 0 |
     +---+-----------------+      +----+      +-----------------+---+
         arithmetic shift



Either n+1 bit Rotate Through Carry (RCL/RCR):

     +---------------------+      +----+      +---------------------+
+--> |        RCR          | ---> | CF | <--- |        RCL          | <--+
|    +---------------------+      +----+      +---------------------+    |
|                                  |  |                                  |
+----------------<-----------------+  +-------------------->-------------+



Or Carry gets a copy of the rotated bit (ROL/ROR):

     +---------------------+      +----+      +---------------------+
+--> |        ROR          | ---> | CF | <--- |        ROL          | <--+
|    +---------------------+  |   +----+   |  +---------------------+    |
|                             |            |                             |
+----------------<------------+            +--------------->-------------+

All of them involve the carry flag (CF): CF receives the last bit shifted out.

Rotates also affect the overflow flag:

In single-bit rotates, OF is set if the operation changes the high-order (sign) bit of the destination
operand (XOR). If the sign bit retains its original value, OF is cleared. On multibit rotates,
the value of OF is always undefined.
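
The four single-bit rotates and their CF/OF results can be modelled in a few lines. This is only a sketch of the rules described above, not CPU microcode; `rotate1` is a hypothetical name and the operand is assumed to be 8 bits unless `bits` says otherwise:

```python
def rotate1(op, value, cf, bits=8):
    """One-bit ROL/ROR/RCL/RCR on a `bits`-wide value.
    Returns (result, CF, OF); OF is set when the sign bit changed."""
    mask = (1 << bits) - 1
    msb = bits - 1
    value &= mask
    if op == "ROL":
        out = value >> msb                      # bit leaving on the left
        result = ((value << 1) | out) & mask    # and re-entering on the right
    elif op == "ROR":
        out = value & 1
        result = (value >> 1) | (out << msb)
    elif op == "RCL":
        out = value >> msb
        result = ((value << 1) | cf) & mask     # old CF enters on the right
    elif op == "RCR":
        out = value & 1
        result = (value >> 1) | (cf << msb)     # old CF enters on the left
    else:
        raise ValueError(op)
    of = (value >> msb) ^ (result >> msb)       # XOR of old and new sign bit
    return result, out, of
```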

Notes:

"The iAPX 286 masks all shift/rotate counts to the low 5 bits. This MOD 32 operation limits the count to a maximum of 31 bits. With this change, the longest shift/rotate instruction is 39 clocks. Without this change, the longest shift/rotate instruction would be 264 clocks, which delays interrupt response until the instruction completes execution."

186/286 SHIFT and ROTATE

The 186/286 added a new opcode (C0/C1) that shifts or rotates a BYTE or WORD R/M by an 8-bit immediate count. The immediate follows the MOD R/M byte (and any displacement).

Shift/rotate by immediate count:

1 1 0 0 0 0 0 W    MOD TTT R/M    IMM8

TTT: same encoding as on the 8086

The count is masked to its low 5 bits (MOD 32).

386 SHIFT and ROTATE

Same opcodes as before:

1 1 0 1 0 0 C W   MOD TTT R/M          <-- Shift/Rotate register or memory by 1 or CL MOD 32
1 1 0 0 0 0 0 W   MOD TTT R/M   IMM8   <-- Shift/Rotate register or memory by IMM8 MOD 32

With W=1, the operand size selects a 16- or 32-bit shift/rotate of R/M. With a register operand and operand-size = 16, the high word of the 32-bit register is unchanged:

clc                      ; CF=0, so RCR rotates a 0 into bit 15
mov eax, 0x0001_aaaa
rcr ax, 1

eax = 0x0001_5555
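
The same example as a software model, assuming CF is clear on entry (with CF set, bit 15 would come out as 1). `rcr16_in_eax` is a name invented for this sketch:

```python
def rcr16_in_eax(eax, cf):
    """Model `rcr ax, 1`: rotate the low word through CF, keep the high word."""
    ax = eax & 0xFFFF
    new_cf = ax & 1                      # bit 0 leaves into CF
    ax = (ax >> 1) | (cf << 15)          # old CF enters at bit 15
    return (eax & 0xFFFF0000) | ax, new_cf

print(hex(rcr16_in_eax(0x0001AAAA, 0)[0]))   # 0x15555
```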

Every count (CL or IMM8) is masked to 5 bits.

The CF flag contains the value of the last bit shifted out. But for SHL and SHR with an operand size below 32 bits (e.g. operand-size = 16), the count can be >= OperandSize, because the count is always 5 bits. In this case CF is undefined.
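
A small sketch of the count rules for SHL/SHR as described here: mask to 5 bits first, then compare against the operand size. The function name and the returned strings are made up for illustration:

```python
def classify_shift_count(count, operand_bits):
    """Classify a SHL/SHR count (from CL or IMM8) after 5-bit masking."""
    n = count & 0x1F                     # only the low 5 bits are used
    if n == 0:
        return n, "flags unchanged"      # zero count leaves the flags alone
    if n >= operand_bits:
        return n, "CF undefined"         # possible for 8- and 16-bit operands
    return n, "CF = last bit shifted out"
```

So a count of 33 on a 32-bit operand behaves like a count of 1, while a count of 17 on a 16-bit operand lands in the undefined case.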

386 SHxD Double Shift

The 386 added Double Shift, a two-operand shift operation. What is new is that a source register provides the bits shifted into R/M. This source operand remains unchanged. Operation:

SHRD r/m, reg, count

+-----------------+      +-----+      +-----------------+      +----+
|                 | ---> | tmp | ---> |   - - - - ->    | ---> | CF |
+-----------------+      +-----+      +-----------------+      +----+
       REG                                    R/M


SHLD r/m, reg, count

+----+      +-----------------+      +-----+      +-----------------+
| CF | <--- |   <- - - - - -  | <--- | tmp | <--- |                 |
+----+      +-----------------+      +-----+      +-----------------+
                   R/M                                    REG


tmp: allows overlapped operands
REG register (source) remains unaltered
CF is set to the last bit shifted out
OF is set to last sign-change (on my AuthenticAMD 00810F10)

If the count operand is 0, the flags are not affected.
If count >= OperandSize, R/M and all flags UNDEFINED

Useful e.g. to assist BitBLT emulation.
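
For counts inside the defined range the double shifts can be modelled as a shift of the concatenated operands. A minimal sketch, assuming 0 < count < OperandSize; the function names are mine, and the undefined range is rejected rather than guessed at:

```python
def shrd(dest, src, count, bits=32):
    """Model SHRD: shift src:dest right, return (new dest, CF)."""
    mask = (1 << bits) - 1
    count &= 31
    if count == 0:
        return dest & mask, None                  # nothing happens, flags unchanged
    if count >= bits:
        raise ValueError("count >= OperandSize is undefined")
    wide = ((src & mask) << bits) | (dest & mask) # src sits above dest
    cf = (wide >> (count - 1)) & 1                # last bit shifted out of dest
    return (wide >> count) & mask, cf

def shld(dest, src, count, bits=32):
    """Model SHLD: shift dest:src left, return (new dest, CF)."""
    mask = (1 << bits) - 1
    count &= 31
    if count == 0:
        return dest & mask, None
    if count >= bits:
        raise ValueError("count >= OperandSize is undefined")
    wide = ((dest & mask) << bits) | (src & mask) # src sits below dest
    cf = ((dest & mask) >> (bits - count)) & 1    # last bit shifted out of dest
    return ((wide << count) >> bits) & mask, cf
```

The source value is only read, which matches the note that REG remains unaltered.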

0F 2-byte opcodes:

0F  1 0 1 0 D 1 0 C   MOD REG R/M   [IMM8]

    C=0: Shift count is IMM8
    C=1: Shift count is specified in CL register

    D=0: SHLD - Shift Left Double
    D=1: SHRD - Shift Right Double
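
The bit pattern above gives the four second-byte values A4, A5, AC and AD. A tiny decoder sketch; `decode_double_shift` is a hypothetical name and it only looks at the byte that follows 0F:

```python
def decode_double_shift(second_byte):
    """Decode the byte after 0F against the pattern 1010 D10C."""
    if second_byte & 0xF6 != 0xA4:       # fixed bits: 1010 x10x
        return None
    d = (second_byte >> 3) & 1           # D: 0 = SHLD, 1 = SHRD
    c = second_byte & 1                  # C: 0 = IMM8 count, 1 = count in CL
    return ("SHRD" if d else "SHLD", "CL" if c else "IMM8")
```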


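Shifts operand-size values. In the table below, D is the default operand size of the code segment (the D bit of its descriptor, not the opcode's D bit above) and 66h is the operand-size prefix: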

    operand-size = 32           operand-size = 16
    D=1 or D=0 and 66h          D=0 or D=1 and 66h

    SHxD r/m32, r32, imm8       SHxD r/m16, r16, imm8
    SHxD r/m32, r32, CL         SHxD r/m16, r16, CL


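With operand-size = 16 the count can be >= OperandSize (the count is always 5 bits). In this case R/M and all flags are documented as undefined; what my CPU actually does is shown in the CPU tests for count >= OperandSize below.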


Multi-bit rotates and the OF-flag

Well, the docs say undefined, but I was wondering... let's see what really happens. Note: this was run on my CPU only and is not tested on other CPUs.

Testbench: execute ROL/ROR/RCL/RCR/SHL/SHR/SAR and SHxD by CL, where CL=0..31, and catch OF.

Result: OF is set according to the last shift/rotate step, at least on my CPU. This makes it possible to XOR any two adjacent bits in R/M, see below.

CPU test for multi-bit rotates and the OF-flag

Byte rotate. AL=30h. Shift/rotate 0..31 times:

ROR CL:                ROL CL:                SHL/SAL CL:            RCR CL (CLC):          RCL CL (CLC):

30 -> 30 OF=u (0)      30 -> 30 OF=u (0)      30 -> 30 OF=u (0)      30 -> 30 OF=u (0)      30 -> 30 OF=u (0)
30 -> 18 OF=0 (1)      30 -> 60 OF=0 (1)      30 -> 60 OF=0 (1)      30 -> 18 OF=0 (1)      30 -> 60 OF=0 (1)
30 -> 0c OF=0 (2)      30 -> c0 OF=1 (2)      30 -> c0 OF=1 (2)      30 -> 0c OF=0 (2)      30 -> c0 OF=1 (2)
30 -> 06 OF=0 (3)      30 -> 81 OF=0 (3)      30 -> 80 OF=0 (3)      30 -> 06 OF=0 (3)      30 -> 80 OF=0 (3)
30 -> 03 OF=0 (4)      30 -> 03 OF=1 (4)      30 -> 00 OF=1 (4)      30 -> 03 OF=0 (4)      30 -> 01 OF=1 (4)
30 -> 81 OF=1 (5)      30 -> 06 OF=0 (5)      30 -> 00 OF=0 (5)      30 -> 01 OF=0 (5)      30 -> 03 OF=0 (5)
30 -> c0 OF=0 (6)      30 -> 0c OF=0 (6)      30 -> 00 OF=0 (6)      30 -> 80 OF=1 (6)      30 -> 06 OF=0 (6)
30 -> 60 OF=1 (7)      30 -> 18 OF=0 (7)      30 -> 00 OF=0 (7)      30 -> c0 OF=0 (7)      30 -> 0c OF=0 (7)
30 -> 30 OF=0 (8)      30 -> 30 OF=0 (8)      30 -> ...              30 -> 60 OF=1 (8)      30 -> 18 OF=0 (8)
30 -> 18 OF=0 (9)      30 -> 60 OF=0 (9)      30 ->                  30 -> 30 OF=0 (9)      30 -> 30 OF=0 (9)
30 -> 0c OF=0 (10)     30 -> c0 OF=1 (10)     30 ->                  30 -> 18 OF=0 (10)     30 -> 60 OF=0 (10)
30 -> 06 OF=0 (11)     30 -> 81 OF=0 (11)     30 ->                  30 -> 0c OF=0 (11)     30 -> c0 OF=1 (11)
30 -> 03 OF=0 (12)     30 -> 03 OF=1 (12)     30 ->                  30 -> 06 OF=0 (12)     30 -> 80 OF=0 (12)
30 -> 81 OF=1 (13)     30 -> 06 OF=0 (13)     30 ->                  30 -> 03 OF=0 (13)     30 -> 01 OF=1 (13)
30 -> c0 OF=0 (14)     30 -> 0c OF=0 (14)     30 ->                  30 -> 01 OF=0 (14)     30 -> 03 OF=0 (14)
30 -> 60 OF=1 (15)     30 -> 18 OF=0 (15)     30 ->                  30 -> 80 OF=1 (15)     30 -> 06 OF=0 (15)
30 -> 30 OF=0 (16)     30 -> 30 OF=0 (16)     30 ->                  30 -> c0 OF=0 (16)     30 -> 0c OF=0 (16)
30 -> 18 OF=0 (17)     30 -> 60 OF=0 (17)     30 ->                  30 -> 60 OF=1 (17)     30 -> 18 OF=0 (17)
30 -> 0c OF=0 (18)     30 -> c0 OF=1 (18)     30 ->                  30 -> 30 OF=0 (18)     30 -> 30 OF=0 (18)
30 -> 06 OF=0 (19)     30 -> 81 OF=0 (19)     30 ->                  30 -> 18 OF=0 (19)     30 -> 60 OF=0 (19)
30 -> 03 OF=0 (20)     30 -> 03 OF=1 (20)     30 ->                  30 -> 0c OF=0 (20)     30 -> c0 OF=1 (20)
30 -> 81 OF=1 (21)     30 -> 06 OF=0 (21)     30 ->                  30 -> 06 OF=0 (21)     30 -> 80 OF=0 (21)
30 -> c0 OF=0 (22)     30 -> 0c OF=0 (22)     30 ->                  30 -> 03 OF=0 (22)     30 -> 01 OF=1 (22)
30 -> 60 OF=1 (23)     30 -> 18 OF=0 (23)     30 ->                  30 -> 01 OF=0 (23)     30 -> 03 OF=0 (23)
30 -> 30 OF=0 (24)     30 -> 30 OF=0 (24)     30 ->                  30 -> 80 OF=1 (24)     30 -> 06 OF=0 (24)
30 -> 18 OF=0 (25)     30 -> 60 OF=0 (25)     30 ->                  30 -> c0 OF=0 (25)     30 -> 0c OF=0 (25)
30 -> 0c OF=0 (26)     30 -> c0 OF=1 (26)     30 ->                  30 -> 60 OF=1 (26)     30 -> 18 OF=0 (26)
30 -> 06 OF=0 (27)     30 -> 81 OF=0 (27)     30 ->                  30 -> 30 OF=0 (27)     30 -> 30 OF=0 (27)
30 -> 03 OF=0 (28)     30 -> 03 OF=1 (28)     30 ->                  30 -> 18 OF=0 (28)     30 -> 60 OF=0 (28)
30 -> 81 OF=1 (29)     30 -> 06 OF=0 (29)     30 ->                  30 -> 0c OF=0 (29)     30 -> c0 OF=1 (29)
30 -> c0 OF=0 (30)     30 -> 0c OF=0 (30)     30 ->                  30 -> 06 OF=0 (30)     30 -> 80 OF=0 (30)
30 -> 60 OF=1 (31)     30 -> 18 OF=0 (31)     30 ->                  30 -> 03 OF=0 (31)     30 -> 01 OF=1 (31)

u: unchanged on zero count

SAR: OF=0 in all cases (the msb never changes, that is the point)
SHR: OF works for a single-bit shift, but every further step keeps the sign at zero, so OF=0
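
The ROR and SHL/SAL columns can be reproduced by a model that applies the single-bit rule repeatedly and keeps only the OF of the last step. This is a sketch of the behaviour observed on my CPU, not of documented behaviour; `shift_steps` is a made-up name:

```python
def shift_steps(op, value, count, bits=8):
    """Apply `op` one bit at a time; return (result, OF of the last step).
    OF is None when the masked count is 0 (flags unchanged)."""
    mask = (1 << bits) - 1
    msb = bits - 1
    value &= mask
    of = None
    for _ in range(count & 31):
        if op == "ROR":
            new = (value >> 1) | ((value & 1) << msb)
        elif op == "SHL":
            new = (value << 1) & mask
        else:
            raise ValueError(op)
        of = (value >> msb) ^ (new >> msb)   # did the sign bit change in this step?
        value = new
    return value, of

# A few rows of the table above: AL=30h
print(shift_steps("ROR", 0x30, 5))   # (129, 1), i.e. 81h with OF=1
print(shift_steps("SHL", 0x30, 4))   # (0, 1),   i.e. 00h with OF=1
```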

CPU test for multi-bit double-shift and the OF-flag

Here src=1 and dest is zero. SHRD 0..31 times:

AuthenticAMD 00810F10

SHRD eax, edx, cl

00000000 -> 00000000 OF=u CF=u (0)        u: unchanged
00000000 -> 80000000 OF=1 CF=0 (1)
00000000 -> 40000000 OF=1 CF=0 (2)
00000000 -> 20000000 OF=0 CF=0 (3)
00000000 -> 10000000 OF=0 CF=0 (4)
00000000 -> 08000000 OF=0 CF=0 (5)
00000000 -> 04000000 OF=0 CF=0 (6)
00000000 -> 02000000 OF=0 CF=0 (7)
00000000 -> 01000000 OF=0 CF=0 (8)
00000000 -> 00800000 OF=0 CF=0 (9)
00000000 -> 00400000 OF=0 CF=0 (10)
00000000 -> 00200000 OF=0 CF=0 (11)
00000000 -> 00100000 OF=0 CF=0 (12)
00000000 -> 00080000 OF=0 CF=0 (13)
00000000 -> 00040000 OF=0 CF=0 (14)
00000000 -> 00020000 OF=0 CF=0 (15)
00000000 -> 00010000 OF=0 CF=0 (16)
00000000 -> 00008000 OF=0 CF=0 (17)
00000000 -> 00004000 OF=0 CF=0 (18)
00000000 -> 00002000 OF=0 CF=0 (19)
00000000 -> 00001000 OF=0 CF=0 (20)
00000000 -> 00000800 OF=0 CF=0 (21)
00000000 -> 00000400 OF=0 CF=0 (22)
00000000 -> 00000200 OF=0 CF=0 (23)
00000000 -> 00000100 OF=0 CF=0 (24)
00000000 -> 00000080 OF=0 CF=0 (25)
00000000 -> 00000040 OF=0 CF=0 (26)
00000000 -> 00000020 OF=0 CF=0 (27)
00000000 -> 00000010 OF=0 CF=0 (28)
00000000 -> 00000008 OF=0 CF=0 (29)
00000000 -> 00000004 OF=0 CF=0 (30)
00000000 -> 00000002 OF=0 CF=0 (31)

CPU tests for count >= OperandSize

CPU test for count >= OperandSize: SHRD

Here src=1 and dest is zero. SHRD 0..31 times:

It kind of rotates on my CPU instead of producing something undefined.
A count of 16 is also possible: it moves the whole source into R/M.

AuthenticAMD 00810F10
SHRD ax, dx, cl

00000000 -> 00000000 OF=u CF=u (0)        u: unchanged
00000000 -> 00008000 OF=1 CF=0 (1)
00000000 -> 00004000 OF=1 CF=0 (2)
00000000 -> 00002000 OF=0 CF=0 (3)
00000000 -> 00001000 OF=0 CF=0 (4)
00000000 -> 00000800 OF=0 CF=0 (5)
00000000 -> 00000400 OF=0 CF=0 (6)
00000000 -> 00000200 OF=0 CF=0 (7)
00000000 -> 00000100 OF=0 CF=0 (8)
00000000 -> 00000080 OF=0 CF=0 (9)
00000000 -> 00000040 OF=0 CF=0 (10)
00000000 -> 00000020 OF=0 CF=0 (11)
00000000 -> 00000010 OF=0 CF=0 (12)
00000000 -> 00000008 OF=0 CF=0 (13)
00000000 -> 00000004 OF=0 CF=0 (14)
00000000 -> 00000002 OF=0 CF=0 (15)
00000000 -> 00000001 OF=0 CF=0 (16)
00000000 -> 00008000 OF=1 CF=0 (17)
00000000 -> 00004000 OF=1 CF=0 (18)
00000000 -> 00002000 OF=0 CF=0 (19)
00000000 -> 00001000 OF=0 CF=0 (20)
00000000 -> 00000800 OF=0 CF=0 (21)
00000000 -> 00000400 OF=0 CF=0 (22)
00000000 -> 00000200 OF=0 CF=0 (23)
00000000 -> 00000100 OF=0 CF=0 (24)
00000000 -> 00000080 OF=0 CF=0 (25)
00000000 -> 00000040 OF=0 CF=0 (26)
00000000 -> 00000020 OF=0 CF=0 (27)
00000000 -> 00000010 OF=0 CF=0 (28)
00000000 -> 00000008 OF=0 CF=0 (29)
00000000 -> 00000004 OF=0 CF=0 (30)
00000000 -> 00000002 OF=0 CF=0 (31)
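
One model that happens to match the result and OF columns of this table: shift one bit at a time and take the incoming bit from the source at position (step MOD 16), i.e. the source repeats. This is purely a guess fitted to the output above, not documented behaviour, and CF is deliberately left out because the table does not follow this model for CF. `shrd16_observed` is a name made up for the sketch:

```python
def shrd16_observed(dest, src, count):
    """Fit to the table above: 16-bit SHRD with the source bits repeating.
    Returns (result, OF of the last step); OF is None for a zero count."""
    dest &= 0xFFFF
    of = None
    for i in range(count & 31):
        bit_in = (src >> (i % 16)) & 1       # source wraps around after 16 bits
        new = (dest >> 1) | (bit_in << 15)
        of = (dest >> 15) ^ (new >> 15)      # sign change in this step
        dest = new
    return dest, of
```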

CPU test for count >= OperandSize: ROR

Here ax=0x0010 and rotate 0..31 times:

AuthenticAMD 00810F10
ror ax, cl

00000010 -> 00000010 OF=u CF=u (0)        u: unchanged
00000010 -> 00000008 OF=0 CF=0 (1)
00000010 -> 00000004 OF=0 CF=0 (2)
00000010 -> 00000002 OF=0 CF=0 (3)
00000010 -> 00000001 OF=0 CF=0 (4)
00000010 -> 00008000 OF=1 CF=1 (5)
00000010 -> 00004000 OF=1 CF=0 (6)
00000010 -> 00002000 OF=0 CF=0 (7)
00000010 -> 00001000 OF=0 CF=0 (8)
00000010 -> 00000800 OF=0 CF=0 (9)
00000010 -> 00000400 OF=0 CF=0 (10)
00000010 -> 00000200 OF=0 CF=0 (11)
00000010 -> 00000100 OF=0 CF=0 (12)
00000010 -> 00000080 OF=0 CF=0 (13)
00000010 -> 00000040 OF=0 CF=0 (14)
00000010 -> 00000020 OF=0 CF=0 (15)
00000010 -> 00000010 OF=0 CF=0 (16)
00000010 -> 00000008 OF=0 CF=0 (17)
00000010 -> 00000004 OF=0 CF=0 (18)
00000010 -> 00000002 OF=0 CF=0 (19)
00000010 -> 00000001 OF=0 CF=0 (20)
00000010 -> 00008000 OF=1 CF=1 (21)
00000010 -> 00004000 OF=1 CF=0 (22)
00000010 -> 00002000 OF=0 CF=0 (23)
00000010 -> 00001000 OF=0 CF=0 (24)
00000010 -> 00000800 OF=0 CF=0 (25)
00000010 -> 00000400 OF=0 CF=0 (26)
00000010 -> 00000200 OF=0 CF=0 (27)
00000010 -> 00000100 OF=0 CF=0 (28)
00000010 -> 00000080 OF=0 CF=0 (29)
00000010 -> 00000040 OF=0 CF=0 (30)
00000010 -> 00000020 OF=0 CF=0 (31)

Nicely rotates a 16-bit register, all values and flags properly set.


XOR-ing bits by shift instructions

The OF-flag is set when the operation changes the high-order (sign) bit of the destination operand.

This is an XOR of the old and the new sign bit, and maybe it can be useful in some situations.

Where do the bits come from?

The complete figure showing, for each shift type, where the incoming bit comes from (and so which bit can be XOR-tested with OF):

     (SHRD)   REG          MSB                           LSB         REG  (SHLD)
     (RCR)    CF          +---+---------       ---------+---+        CF   (RCL)
     (ROR)    LSB   --->  |               R/M               |  <---  MSB  (ROL)
     (SHR)    0           +---+---------       ---------+---+        0    (SHL/SAL)
     (SAR)    MSB           |
               |            |
               +---- XOR ---+
                      |
                     OF

Single-bit rotates and the OF-flag for XOR-ing

Full analysis in case some of these might be useful.

Right single-bit rotate/shift

Let's look at the source of the new msb in right rotates/shifts:

                                                                   After operations
                                                                   updating SF

     SHRD:   REG-LSB      MSB                OF = MSB ^ REG-LSB    OF = SF ^ REG-LSB
     RCR:    CF          +---+---------      OF = MSB ^ CF         OF = SF ^ CF
     ROR:    LSB   --->  |                   OF = MSB ^ LSB        OF = SF ^ LSB
     SHR:    0           +---+---------      OF = MSB              OF = SF
     SAR:    MSB           |                 OF = 0                OF = 0
              |            |
              +---- XOR ---+
                     |
                    OF
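
The table boils down to: OF = old MSB XOR the bit that becomes the new MSB. A sketch of that rule for single-bit right shifts; the function name and the `cf`/`reg` parameters are assumptions of this illustration:

```python
def of_after_right_shift1(op, value, cf=0, reg=0, bits=8):
    """OF after a 1-bit right shift/rotate: old MSB XOR the incoming bit."""
    old_msb = (value >> (bits - 1)) & 1
    incoming = {
        "SHRD": reg & 1,      # LSB of the source register
        "RCR": cf,            # carry flag before the instruction
        "ROR": value & 1,     # LSB of the operand itself
        "SHR": 0,             # always zero
        "SAR": old_msb,       # sign bit is replicated
    }[op]
    return old_msb ^ incoming
```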


RCR 1: OF is XOR of original carry and the MSB           ROR 1: We can XOR msb- and lsb-bits

+---+      +---+-----------------+                       +---+-------------+---+
| C | ---> | a                   |                       | a                 b |
+---+      +---+-----------------+                       +---+-------------+---+

           +---+---+-------------+                       +---+-------------+---+       +---+
           | C   a               |                       | b                 a | --->  | b |
           +---+---+-------------+                       +---+-------------+---+       +---+
            |     |                                        |                 |          CF
            + XOR +                                        +------ XOR ------+
               |                                                    |
              OF = CF ^ MSB                                        OF = a ^ b


SHR 1: OF is original MSB                             SAR 1: OF is always zero

     +---+-----------------+                          +---+-----------------+
     | a                   |                          | a                   |
     +---+-----------------+                          +---+-----------------+

     +---+---+-------------+      +---+               +---+---+-------------+      +---+
     | 0   a               | ---> |LSB|               | a   a               | ---> |LSB|
     +---+---+-------------+      +---+               +---+---+-------------+      +---+
      |     |                      CF                  |     |                      CF
      + XOR +                                          + XOR +
         |                                                |
        OF = a                                           OF = 0


SHRD 1: XOR the source's lsb with the destination's msb:


  REG src            R/M dest

      LSB        MSB
 ----+---+      +---+-----------
     | a | ---> | b                 SHRD R/M, REG, 1
 ----+---+      +---+-----------
       |          |
       +--  XOR --+
             |
            OF

Left single-bit rotate/shift

For all left shifts, OF = MSB ^ MSB-1:

                                              After operations
                                              updating SF

All left shifts OF = MSB ^ MSB-1              OF = SF ^ MSB-1

We can XOR the two msb-bits with e.g. SHL 1

             +---+---+-------------+
             | a   b               |
             +---------------------+

  +---+      +---+-----------------+
  | a | <--- | b                   |
  +---+      +---+-----------------+
    |          |
    +--  XOR --+
          |
         OF

The OF flag is affected only on 1-bit shifts. For left shifts, the OF flag is set to 0 if the most-significant bit of the result is the same as the CF flag (that is, the top two bits of the original operand were the same); otherwise, it is set to 1.
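
The left-shift rule in code form, as a quick check that "MSB of result XOR CF" and "top two bits of the original" are the same thing. `shl1` is a name chosen for this sketch:

```python
def shl1(value, bits=8):
    """Model SHL by 1: return (result, CF, OF)."""
    mask = (1 << bits) - 1
    value &= mask
    cf = value >> (bits - 1)                 # original MSB goes to CF
    result = (value << 1) & mask
    of = cf ^ (result >> (bits - 1))         # = original MSB ^ original MSB-1
    return result, cf, of
```

Feeding it 30h, 60h, C0h, 80h in turn gives the OF sequence 0, 1, 0, 1.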

XOR-ing FLAG bits

   7    6    5    4    3    2    1    0
+----+----+----+----+----+----+----+----+
| SF | ZF |    | AF |    | PF |    | CF |
+----+----+----+----+----+----+----+----+

Using lahf and AH:

   lahf             lahf              lahf             lahf
   rcr ah, 1        ror ah, 1         rol ah, 1        shl ah, 1

   OF = SF ^ CF     OF = SF ^ CF      OF = SF ^ ZF     OF = SF ^ ZF
   Preserves CF     Preserves CF      CF = SF          CF = SF

Same result for WORD or DWORD rotates in AX/EAX.

So... maybe ROR is our friend here. Should work with all operand sizes.
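
A quick model of the `lahf` / `ror ah, 1` pair. It assumes the AH layout drawn above, plus bit 1 reading as 1 (that is how LAHF stores it); the function names are hypothetical:

```python
def lahf(sf, zf, af, pf, cf):
    """Build AH as LAHF does: SF ZF 0 AF 0 PF 1 CF."""
    return (sf << 7) | (zf << 6) | (af << 4) | (pf << 2) | 0x02 | cf

def ror_ah_1(ah):
    """Model `ror ah, 1`: return (result, CF, OF)."""
    out = ah & 1                             # old bit 0 (the saved CF)
    result = (ah >> 1) | (out << 7)
    of = (ah >> 7) ^ (result >> 7)           # old SF ^ old CF
    return result, out, of

# SF=1, CF=0: OF comes out as 1 and CF stays 0
print(ror_ah_1(lahf(1, 0, 0, 0, 0))[1:])     # (0, 1)
```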

E.g. to XOR bits 3 and 4 (b and a below):

000ab000
0000ab00 1
00000ab0 2
000000ab 3
b000000a 4
ab000000 5 ← here XOR

So: XOR bits n and n+1 = ROR n+2
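
The rule can be checked with a model that assumes OF follows the last single-bit step of a multi-bit ROR (observed on my CPU, officially undefined). `xor_adjacent_bits` is a name invented for this sketch:

```python
def xor_adjacent_bits(value, n, bits=8):
    """Return bit n XOR bit n+1 of `value`, read from OF after ROR by n+2.
    Assumes OF reflects the last rotate step (observed, not documented)."""
    msb = bits - 1
    value &= (1 << bits) - 1
    of = 0
    for _ in range(n + 2):
        new = (value >> 1) | ((value & 1) << msb)
        of = (value >> msb) ^ (new >> msb)   # sign change in this step
        value = new
    return of

print(xor_adjacent_bits(0b00010000, 3))      # 1: bit 4 set, bit 3 clear
print(xor_adjacent_bits(0b00011000, 3))      # 0: both set
```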

     +---+---+---+---+---+---+---+---+
     |   |   |   |   |   |   |   |   |
     +---+---+---+---+---+---+---+---+

For the SAR instruction, the OF flag is cleared for all 1-bit shifts.

For the SHR instruction, the OF flag is set to the most-significant bit of the original operand.

     +---+-----------------+
     | a       SHR         |    We can test the original msb-bit after SHR
     +---+-----------------+
       |
      OF


     +---+-----+---+-------+      +---+
     | 0 -->     a         | ---> | C |
     +---+-----+---+-------+      +---+



