HALICERY

free-time coding, hardware dev, articles


Last modified: Sun Oct 12 11:40:46 UTC+0200 2025 © A. Tarpai


386 BIT TEST (BTx) AND BIT SCAN (BSx)

386 BIT TEST

New instructions on the 386: BT, BTS, BTR, BTC.

They can target any bit in a register, or any bit in memory within a signed 32-bit bit distance (-2^31 .. 2^31-1 bits, i.e. +/- 256 MB) from bit0 of a base byte. Pretty cool.

Operation:

  1. Store bit of R/M operand into CARRY
  2. Optionally set/clear/complement the bit of R/M

R/M is the destination operand

2nd operand: bit number (0, 1, ..) in a register or in an immediate byte following the ModR/M byte (and any displacement)
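
The two steps can be modelled in plain C. This is only a sketch on a 32-bit destination; the names btx_op and btx32 are made up for the example, and the bit number is assumed to be already reduced to 0..31.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names, chosen for this sketch only. */
enum btx_op { BTX_BT, BTX_BTS, BTX_BTR, BTX_BTC };

/* Models BTx on a 32-bit destination.
   Returns the old value of the selected bit (what ends up in CF)
   and applies the optional set/reset/complement step to *dst. */
int btx32(enum btx_op op, uint32_t *dst, unsigned bit)
{
    assert(bit < 32);                  /* caller has already reduced the bit number */
    uint32_t mask = (uint32_t)1 << bit;
    int cf = (*dst & mask) != 0;       /* step 1: bit -> CF */

    switch (op) {                      /* step 2: optional modification */
    case BTX_BT:                 break;
    case BTX_BTS: *dst |=  mask; break;
    case BTX_BTR: *dst &= ~mask; break;
    case BTX_BTC: *dst ^=  mask; break;
    }
    return cf;
}
```

CF always receives the bit as it was before the modification, which is what makes BTS/BTR usable as test-and-set primitives.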

Opcodes

0F  10TTT011   MODREGR/M         <-- BTx r/m, reg
0F  10111010   MODTTTR/M  IMM8   <-- BTx r/m, imm8

The instruction is selected by TTT, located either in the OPCODE byte (reg form) or in the REG field of the MODRM byte (imm8 form):

TTT
100   BT  - Test Bit
101   BTS - Test Bit and Set
110   BTR - Test Bit and Reset
111   BTC - Test Bit and Complement
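
A decoder can pick out TTT with a couple of masks. The following is a small sketch, not taken from any real disassembler; btx_mnemonic is a hypothetical helper that takes the two bytes following the 0F escape.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns the mnemonic for the two bytes that follow the 0F escape byte,
   or NULL if they do not encode a BTx instruction.
   op = second opcode byte, modrm = the ModR/M byte after it. */
const char *btx_mnemonic(uint8_t op, uint8_t modrm)
{
    static const char *const names[4] = { "BT", "BTS", "BTR", "BTC" };
    unsigned ttt;

    if ((op & 0xC7) == 0x83)           /* 10TTT011: BTx r/m, reg  */
        ttt = (op >> 3) & 7;
    else if (op == 0xBA)               /* 10111010: BTx r/m, imm8 */
        ttt = (modrm >> 3) & 7;        /* TTT sits in the ModR/M reg field */
    else
        return NULL;

    /* Only TTT = 100..111 is a BTx instruction. */
    return (ttt & 4) ? names[ttt & 3] : NULL;
}
```

For example, the bytes A3 16 (after 0F) give "BT", and BA with a ModR/M byte of 3E gives "BTC".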

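BTS, BTR and BTC can be used with the LOCK prefix when the destination is in memory. BT only reads its operand, so LOCK is not valid with it.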

BitBase, BitOffset details

Bits are numbered from low-order to high-order within register and within memory bytes.

BitBase = bit0 of the specified byte address

Honors operand-size.

When BitBase is a memory address m16/m32

   BitOffset REG  : SIGNED 16/32-bit value
   BitOffset IMM8 : MOD 16/32


   - BitOffset <--------------------------   --------------------------> + BitOffset

                                           |
                                  -2   -1  | 0    1
   +----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+
   |..........................b............|.......................................|  memory bytes
   +----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+


                       Note the side-effect of accessing the bit:

                       +-------------------+
                       |                   |  <-- full DWORD access when 32-bit
                       +---------+---------+
                       |         |            <-- full WORD access when 16-bit
                       +---------+
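
Which word/dword gets touched can be computed from the bit offset. A sketch in C, assuming the access unit is selected the same way as in the emulation code of this article (unit index = floor(BitOffset / operand-size)); btx_access_disp is a made-up name.

```c
#include <assert.h>
#include <stdint.h>

/* Byte displacement, relative to the effective address, of the
   word/dword that holds the addressed bit.
   opsize is 16 or 32; bit_offset is the signed register operand. */
int32_t btx_access_disp(int32_t bit_offset, int opsize)
{
    assert(opsize == 16 || opsize == 32);

    /* Floor division: round toward minus infinity, not toward zero. */
    int32_t unit = bit_offset / opsize;
    if (bit_offset % opsize < 0)
        unit--;

    return unit * (opsize / 8);
}
```

So BitOffset -1 reads the dword at EA-4 (or the word at EA-2 with 16-bit operand-size), even though only one bit of it is needed.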


When BitBase is a register: the instruction takes the bit offset operand modulo 16/32

   BitOffset REG  : MOD 16/32
   BitOffset IMM8 : MOD 16/32


   31                  0                    15        0
   +----+----+----+----+                    +----+----+
   |........b..........|  REG32             |.........|  REG16
   +----+----+----+----+                    +----+----+

            <-----------                       <-------
        BitOffset MOD 32               BitOffset MOD 16
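
In C the register case boils down to a mask. A tiny sketch; btx_reg_bit is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/* Bit number actually used when the destination is a register:
   the bit offset operand modulo the operand size (16 or 32). */
unsigned btx_reg_bit(int32_t bit_offset, int opsize)
{
    assert(opsize == 16 || opsize == 32);
    return (uint32_t)bit_offset & (uint32_t)(opsize - 1);
}
```

With this, offset 35 on a 32-bit register selects bit 3, and offset -1 selects bit 31 (bit 15 on a 16-bit register).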

Test: which bit is BitOffset -1?

Is it bit0 or bit7 of the previous byte? Or the previous dword maybe?

Well, it's bit7 of the previous byte.

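The rule is: byte address = EA + floor(BitOffset / 8), bit number = BitOffset mod 8 (always 0..7, also for negative offsets)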

   - BitOffset <--------------------------   --------------------------> + BitOffset

                                             EA
                                             |
  -9 -10 ......         -1 -2 -3      -8     | 7 6 5 4 3 2 1 0        .......   9 8
 +-----------------+   +-----------------+   +-----------------+  +-----------------+
 | 7 6 5 4 3 2 1 0 |   | 7 6 5 4 3 2 1 0 |   | 7 6 5 4 3 2 1 0 |  | 7 6 5 4 3 2 1 0 |
 +-----------------+   +-----------------+   +-----------------+  +-----------------+
       byte -2               byte -1         |     byte 0               byte 1
                                             |
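
The same mapping written out in C, as a sketch. The struct bit_pos and the function btx_locate are names invented for this example.

```c
#include <stdint.h>

struct bit_pos {
    int32_t  byte;   /* byte displacement from EA */
    unsigned bit;    /* bit number inside that byte, 0..7 */
};

/* Maps a signed bit offset to (byte displacement, bit in byte). */
struct bit_pos btx_locate(int32_t bit_offset)
{
    struct bit_pos p;

    p.bit  = (uint32_t)bit_offset & 7u;           /* non-negative modulo 8 */
    p.byte = (bit_offset - (int32_t)p.bit) / 8;   /* exact division, i.e. floor(offset / 8) */
    return p;
}
```

Offset -1 lands on byte -1, bit 7; offset -8 on byte -1, bit 0; offset -9 on byte -2, bit 7; offset 9 on byte 1, bit 1.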

Emulating BTx in memory

The operation above was confirmed with the following code, using random data and BitOffset values in the range -200 .. 200. The full dword read is emulated as well.

VC++ 2005:

  __declspec(naked)
  int BTEMU32(char *BitBase, int BitOffset)
  {
  __asm {

    mov esi, [esp+4]           // BitBase (note: ESI is callee-saved, preserve it in real code)

    mov eax, [esp+8]           // BitOffset
    sar eax, 5                 // div 32 (to dword index) - arithmetic shift for negative BitOffset(!)
    lea esi, [esi + eax * 4]   // esi = dword address to read

    mov ecx, [esp+8]           // BitOffset --> dword bit# in CL
    and cl, 0x1F               // eg. -1 --> bit31 (and that's how BT is working)

    mov eax, [esi]             // dword read

    shr eax, cl                // bit -> bit0
    and eax, 1                 // return 0/1
    ret

    }
  }

Against the BT instruction:

  __declspec(naked)
  int BT32(char *BitBase, int BitOffset)
  {
    __asm {

      mov esi, [esp+4]    // BitBase
      mov edx, [esp+8]    // BitOffset

      xor eax, eax
      bt [esi], edx       // CF = bit
      rcl al, 1           // return 0/1
      ret

    }
  }
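
For compilers without MSVC inline assembly, the same check can be approximated in portable C. This is a sketch only: it does not execute BT itself, it compares the dword-based addressing of BTEMU32 against the byte-based rule from the diagram. All function names are made up, and the test data comes from a simple generator rather than the original random data.

```c
#include <stdint.h>

/* Bit addressing as in the diagram: byte = floor(off / 8), bit = off mod 8. */
int bt_by_byte(const uint8_t *base, int32_t off)
{
    unsigned bit = (uint32_t)off & 7u;
    const uint8_t *p = base + (off - (int32_t)bit) / 8;
    return (*p >> bit) & 1;
}

/* Same idea as BTEMU32: pick the dword, then the bit inside it. */
int bt_by_dword(const uint8_t *base, int32_t off)
{
    unsigned bit = (uint32_t)off & 31u;
    const uint8_t *p = base + (off - (int32_t)bit) / 32 * 4;
    /* Assemble the dword in little-endian order, as x86 would read it. */
    uint32_t d = (uint32_t)p[0] | (uint32_t)p[1] << 8 |
                 (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
    return (int)((d >> bit) & 1u);
}

/* Returns the number of offsets in [-200, 200] where the two disagree. */
int bt_count_mismatches(void)
{
    uint8_t buf[64];
    uint32_t x = 12345u;
    for (int i = 0; i < 64; i++) {     /* arbitrary test data */
        x = x * 1103515245u + 12345u;
        buf[i] = (uint8_t)(x >> 16);
    }

    int bad = 0;
    for (int32_t off = -200; off <= 200; off++)   /* base sits mid-buffer */
        bad += bt_by_byte(buf + 32, off) != bt_by_dword(buf + 32, off);
    return bad;
}
```

Both views must agree on a little-endian machine, so bt_count_mismatches() is expected to return 0.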

16-bit BT emulation

Just for completeness, operand-size = 16. The instruction disassembled by VC++:

VC++ 2005:

66 0F A3 16      bt word ptr [esi],dx

Emulated correctly by:

  __declspec(naked)
  int BTEMU16(char *BitBase, int BitOffset)
  {
  __asm {

    mov esi, [esp+4]           // BitBase

    mov eax, [esp+8]           // BitOffset
    sar eax, 4                 // div 16 (to word index) - arithmetic shift for negative BitOffset(!)
    lea esi, [esi + eax * 2]   // esi = word address to read

    mov ecx, [esp+8]           // BitOffset --> word bit# in CL
    and cl, 0x0F               // eg. -1 --> bit15 (and that's how 66h BT is working)

    mov ax, [esi]              // word read

    shr eax, cl                // bit -> bit0
    and eax, 1                 // return 0/1
    ret

    }
  }

386 BIT SCAN

386 new instructions:

BSF – Bit Scan Forward
BSR – Bit Scan Reverse

NO BitBase, BitOffset here.

0F BC     BSF  reg, r/m       <--- 16/32
0F BD     BSR  reg, r/m       <--- 16/32

ZF = 1  R/M = 0, no bit found; the result in REG is undefined
ZF = 0  success, REG = bit number of the first set bit found (0, 1, 2, ..)


BSF - Bit Scan Forward            BSR - Bit Scan Reverse


operand-size = 32                 operand-size = 32
D=1 or D=0 and 66h                D=1 or D=0 and 66h

31                  0             31                  0
+----+----+----+----+             +----+----+----+----+
|........10000000000| r/m32       |000001.............| r/m32
+----+----+----+----+             +----+----+----+----+
         <-----------             ------>


operand-size = 16                 operand-size = 16
D=0 or D=1 and 66h                D=0 or D=1 and 66h

          15        0             15        0
          +----+----+             +----+----+
          |......100| r/m16       |000001...| r/m16
          +----+----+             +----+----+
                 <---             ------>
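
The two scans can be modelled in C as follows. This is a rough sketch for 32-bit operands; bsf32 and bsr32 are made-up names, and leaving the destination untouched for a zero source is just one possible way to represent "undefined".

```c
#include <stdint.h>

/* Each function returns ZF. When ZF = 0, *index holds the bit number found.
   When ZF = 1 (source is zero), this sketch leaves *index untouched. */
int bsf32(uint32_t src, unsigned *index)
{
    if (src == 0)
        return 1;
    unsigned i = 0;
    while (!(src & 1u)) {              /* scan upward from bit 0 */
        src >>= 1;
        i++;
    }
    *index = i;
    return 0;
}

int bsr32(uint32_t src, unsigned *index)
{
    if (src == 0)
        return 1;
    unsigned i = 31;
    while (!(src & 0x80000000u)) {     /* scan downward from bit 31 */
        src <<= 1;
        i--;
    }
    *index = i;
    return 0;
}
```

For the value 0x00F0, BSF finds bit 4 and BSR finds bit 7; for 0 both only report ZF = 1.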