Intel 8086

From CPCWiki - THE Amstrad CPC encyclopedia!

The Intel 8086 is a landmark 16‑bit microprocessor introduced by Intel in 1978. It was the first in the x86 family and established many architectural conventions that continue to influence modern personal computing.

With a hybrid internal design—featuring 16‑bit registers and a 16‑bit arithmetic logic unit (ALU) paired with a 20‑bit external address bus—the 8086 could directly address 1 megabyte of memory, a significant leap over its 8‑bit predecessors.

Other CPUs were in use in the 1980s, but the vast majority of home and office microcomputers used one of four families: the MOS 6502 (or one of its variants), the Zilog Z80, an early member of the Intel 8086 family, or the Motorola 68000.


History

After the release of the Intel 8080 CPU, Intel began working on the iAPX 432 project. It was an ambitious 32‑bit design—aimed at supporting advanced, high‑level programming features in hardware—which took several years and a large team to develop, partly because it awaited further improvements in chip density per Moore’s Law.

Meanwhile, to quickly counter the competition, Intel rushed a simpler, lower‑risk design: the 8086. This chip, developed as an incremental evolution of the 8080 and managed by a separate team, was ready for mass market in 1978.

The chip’s design was partly influenced by the need to maintain some backward compatibility with 8‑bit software while also providing a richer instruction set for high‑level languages such as Pascal and PL/M.

Although the IBM PC later used the nearly identical 8088 (which featured an 8‑bit external data bus for cost savings), the 8086 itself became the architectural blueprint for the x86 family, directly influencing later processors.

As for the iAPX 432, it turned out to be a commercial failure and was discontinued in 1986. Intel then ventured into RISC CPUs in the late 1980s with the i860 and i960, but that effort was ultimately unsuccessful.


Architecture

Most sources state that the 8086 has about 29,000 transistors, but a count of the die gives only 19,618. The higher figure appears to include unpopulated transistor sites in the ROM and PLA arrays. Source

To put it into perspective, 64KB of DRAM contains 524,288 transistors, as 1 bit of DRAM needs 1 transistor.
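The comparison above is simple arithmetic, sketched here in Python. The variable names are illustrative, and the DRAM figure counts only the storage cells (one transistor per bit), not decoders or sense amplifiers.

```python
# Sketch: compare the DRAM cell count of 64 KB with the 8086 transistor count.
BITS_PER_BYTE = 8

dram_bytes = 64 * 1024                              # 64 KB
dram_cell_transistors = dram_bytes * BITS_PER_BYTE  # one transistor per bit (cells only)
cpu_transistors = 19_618                            # 8086 figure quoted above

ratio = dram_cell_transistors / cpu_transistors

print(dram_cell_transistors)   # 524288
print(f"64 KB of DRAM cells is about {ratio:.1f} times the 8086 count")
```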

Fun fact: The original IBM PC came with 16KB of memory. Source

Microcode

Whereas the Z80 and the 6502 CPUs use a Decode ROM (PLA), the 8086 uses microcode instead.

To execute a machine instruction, the processor internally executes several simpler micro-instructions, specified by the microcode. In other words, microcode forms another layer between the machine instructions and the hardware.

The 8086's microcode ROM holds 512 micro-instructions, each 21 bits wide. The microcode engine is assisted by two smaller ROMs: the "Group Decode ROM" to categorize machine instructions, and the "Translation ROM" to branch to microcode subroutines for address calculation and other roles.
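The idea of a layer between machine instructions and hardware can be illustrated with a toy interpreter. This is only a sketch: the instruction name `INC_MEM`, the micro-operation names, and the `run` function are invented for illustration and do not correspond to the real 8086 microcode or its 21-bit word format.

```python
# Toy model only: the micro-operation names and routines below are invented
# for illustration and are NOT the real 8086 microcode.
MICROCODE_ROM = {
    # machine instruction -> sequence of micro-operations
    "INC_MEM": ["load_tmp_from_mem", "add_one_to_tmp", "store_tmp_to_mem"],
    "NOP": [],
}


def run(instruction, state):
    """Execute one machine instruction by stepping through its micro-ops."""
    for micro_op in MICROCODE_ROM[instruction]:
        if micro_op == "load_tmp_from_mem":
            state["tmp"] = state["mem"][state["addr"]]
        elif micro_op == "add_one_to_tmp":
            state["tmp"] = (state["tmp"] + 1) & 0xFFFF  # 16-bit wraparound
        elif micro_op == "store_tmp_to_mem":
            state["mem"][state["addr"]] = state["tmp"]
    return state


state = {"mem": {0x10: 41}, "addr": 0x10, "tmp": 0}
run("INC_MEM", state)
print(state["mem"][0x10])  # 42

# Capacity of the ROM as described above: 512 words of 21 bits each.
rom_bits = 512 * 21
print(rom_bits)            # 10752
```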

See: Group Decode ROM viewer; How the 8086 processor's microcode engine works; 8086 microcode disassembled

Reverse-engineering the multiplication algorithm, division microcode, string operations, conditional jumps, register codes, ModR/M addressing microcode, instruction length, flags circuitry, interrupt circuitry, HALT circuitry, and ALU circuitry in the Intel 8086 processor

Block Diagrams

Internally, the 8086 is split into two units. The 16‑bit Execution Unit (EU) performs arithmetic, logic, and control functions, while a separate Bus Interface Unit (BIU) handles all data transfers and external communications.

The BIU includes a 6‑byte prefetch queue (4 bytes on the 8088). The EU takes instructions from this queue, not directly from memory. The EU has no direct connection to the external system bus and relies entirely on the BIU for data and instruction access.

Because the EU and BIU operate independently, the BIU fetches additional instruction bytes to keep the queue filled while the EU decodes and executes earlier ones.
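The queue behaviour can be sketched as a small FIFO model. This is a simplification under stated assumptions: the class `ToyBIU` and its methods are hypothetical names, it fetches one byte per call, and it ignores bus timing, word-wide fetches on the 8086, and queue flushes on jumps.

```python
from collections import deque

QUEUE_SIZE = 6  # 6 bytes on the 8086 (4 on the 8088)


class ToyBIU:
    """Toy bus interface unit: fills a FIFO with instruction bytes."""

    def __init__(self, memory, start=0):
        self.memory = memory
        self.fetch_addr = start
        self.queue = deque()

    def prefetch(self):
        """Fetch one byte if the queue has room (simplified: byte at a time)."""
        if len(self.queue) < QUEUE_SIZE and self.fetch_addr < len(self.memory):
            self.queue.append(self.memory[self.fetch_addr])
            self.fetch_addr += 1

    def next_byte(self):
        """Called by the execution unit; returns None if the queue is empty."""
        return self.queue.popleft() if self.queue else None


program = bytes([0x90, 0x90, 0xB8, 0x34, 0x12, 0x90, 0x90, 0x90])
biu = ToyBIU(program)

for _ in range(10):          # the BIU runs ahead of the EU...
    biu.prefetch()
print(len(biu.queue))        # 6: the queue is full, so fetching pauses

first = biu.next_byte()      # the EU consumes one byte...
biu.prefetch()               # ...which frees a slot for the BIU to refill
print(hex(first), len(biu.queue))  # 0x90 6
```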

Only the BIU differs between the 8088 and the 8086. Because the EU is the same in both, the instruction set is identical, and programs written for the 8086 run on the 8088 without any changes (though typically more slowly, given the 8‑bit external data bus).

See: The 8086 processor's microcode pipeline from die analysis; Intel 8088 processor's instruction prefetch circuitry; Inside the Intel 8088 processor's bus interface state machine

[Figure: block diagram of the 8086 (Block-diagram-of-8086.jpg)]

[Figure: functional blocks on the 8086 die (8086 die blocks.png)]

Memory Segmentation

To overcome the 16‑bit limitation of its registers while still addressing 1 MB of memory, the 8086 employs a segmented memory model.

In this scheme, the BIU forms memory addresses by shifting a 16‑bit segment register four bits to the left and then adding a 16‑bit offset. This results in a 20‑bit physical address.
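The address calculation can be written out as a short Python sketch. The function name `physical_address` is illustrative. The final mask models the 20 address lines, so sums past 1 MB wrap around to low memory.

```python
def physical_address(segment: int, offset: int) -> int:
    """Return the 20-bit physical address for a segment:offset pair."""
    segment &= 0xFFFF                            # segment registers are 16 bits
    offset &= 0xFFFF                             # offsets are 16 bits
    return ((segment << 4) + offset) & 0xFFFFF   # only 20 address lines exist


print(hex(physical_address(0x1234, 0x0010)))  # 0x12350
print(hex(physical_address(0x1235, 0x0000)))  # 0x12350 (same byte, different pair)
print(hex(physical_address(0xFFFF, 0x0010)))  # 0x0 (wraps past 1 MB)
```

As the second call shows, many different segment:offset pairs refer to the same physical byte, because segments overlap every 16 bytes.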

Although this model can be seen as complex, it allowed small programs (fitting within a 64KB segment) to be loaded at a fixed offset, simplifying relocation in many cases.

See: Reverse-engineering the 8086 processor's address and data pin circuits

Register File

Register | Size | Description | Notes
AX (Accumulator) | 16-bit | Primary register for arithmetic, logic, I/O. | Can be accessed as two 8-bit registers: AH (high) and AL (low). Often an implied operand.
BX (Base) | 16-bit | General-purpose, often used as a base pointer for memory addressing. | Can be accessed as BH and BL. The only one of the four data registers (AX, BX, CX, DX) usable in memory addressing (e.g., `[BX]`).
CX (Count) | 16-bit | General-purpose, often used as a loop counter (`LOOP` instruction) and for string operations (`REP` prefixes). | Can be accessed as CH and CL.
DX (Data) | 16-bit | General-purpose, used for I/O port addressing (`IN`, `OUT`); holds the high word in 16x16 multiplication and 32/16 division. | Can be accessed as DH and DL.
SP (Stack Pointer) | 16-bit | Points to the top of the current stack (offset within SS). | Used implicitly by `PUSH`, `POP`, `CALL`, `RET`, and interrupts.
BP (Base Pointer) | 16-bit | Points to data within the stack segment (offset within SS). | Often used to access function parameters and local variables on the stack.
SI (Source Index) | 16-bit | Used as a source pointer offset (usually within DS) for string operations. Can be used as a general-purpose index register. | Default segment is DS; it can be overridden.
DI (Destination Index) | 16-bit | Used as a destination pointer offset (within ES) for string operations. Can be used as a general-purpose index register. | For string operations the segment is always ES and cannot be overridden; in ordinary memory addressing the default is DS.
IP (Instruction Pointer) | 16-bit | Holds the offset of the next instruction to be executed within the current Code Segment (CS). Analogous to a Program Counter (PC). | Cannot be read or written directly; it is changed by jumps, calls, returns, and interrupts. Physical address = (CS * 16) + IP.
FLAGS | 16-bit | Status flags: CF (Carry, bit 0), PF (Parity, bit 2), AF (Auxiliary Carry, bit 4), ZF (Zero, bit 6), SF (Sign, bit 7), OF (Overflow, bit 11). Control flags: TF (Trap, bit 8), IF (Interrupt Enable, bit 9), DF (Direction, bit 10). Other bits are undefined/reserved on the 8086. | AF is used for BCD arithmetic. DF controls string operation direction (increment or decrement of SI/DI). TF enables single-stepping. IF enables maskable interrupts.
CS (Code Segment) | 16-bit | Points to the base address of the current code segment. | Used with IP to find the next instruction.
DS (Data Segment) | 16-bit | Points to the base address of the current data segment. | Default segment for most data access.
SS (Stack Segment) | 16-bit | Points to the base address of the current stack segment. | Used with SP and BP.
ES (Extra Segment) | 16-bit | Points to the base address of an extra data segment. | Used as the destination segment for string operations (with DI).
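Two details from the register table lend themselves to a short sketch: splitting a 16-bit data register into its high and low halves, and picking individual flags out of the FLAGS word using the bit positions listed above. The helper names `split_register` and `decode_flags` are illustrative, not part of any standard library.

```python
# Bit positions of the 8086 flags, as listed in the register table.
FLAG_BITS = {
    "CF": 0, "PF": 2, "AF": 4, "ZF": 6, "SF": 7,
    "TF": 8, "IF": 9, "DF": 10, "OF": 11,
}


def split_register(value: int) -> tuple:
    """Split a 16-bit register (e.g. AX) into (high, low) bytes (AH, AL)."""
    return (value >> 8) & 0xFF, value & 0xFF


def decode_flags(flags: int) -> dict:
    """Return each named flag as 0 or 1 from a 16-bit FLAGS value."""
    return {name: (flags >> bit) & 1 for name, bit in FLAG_BITS.items()}


ah, al = split_register(0x1234)
print(hex(ah), hex(al))           # 0x12 0x34

flags = decode_flags(0x0241)      # bits 0, 6 and 9 are set
print([name for name, value in flags.items() if value])  # ['CF', 'ZF', 'IF']
```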


Instruction Set

As a complex instruction set computer (CISC), the 8086 supports a rich array of operations, including multiple addressing modes such as register, immediate, and memory addressing.

The 8086's instruction set was designed with a new concept, the "ModR/M" byte, which usually follows the opcode byte. The ModR/M byte specifies the memory addressing mode and the register (or registers) to use, allowing that information to be moved out of the opcode.
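The field layout of the ModR/M byte can be shown with a small decoder. The byte has three fields: mod (bits 7-6), reg (bits 5-3), and r/m (bits 2-0). The sketch below uses the 16-bit addressing forms of the 8086. The function name `decode_modrm` and the textual output format are illustrative choices, and displacement bytes are only named, not read.

```python
REG16 = ["AX", "CX", "DX", "BX", "SP", "BP", "SI", "DI"]
REG8 = ["AL", "CL", "DL", "BL", "AH", "CH", "DH", "BH"]
RM_BASE = ["BX+SI", "BX+DI", "BP+SI", "BP+DI", "SI", "DI", "BP", "BX"]


def decode_modrm(byte: int, wide: bool = True) -> tuple:
    """Decode a ModR/M byte into (register operand, register-or-memory operand)."""
    mod = (byte >> 6) & 0b11
    reg = (byte >> 3) & 0b111
    rm = byte & 0b111
    regs = REG16 if wide else REG8

    if mod == 0b11:                      # r/m names a register
        operand = regs[rm]
    elif mod == 0b00 and rm == 0b110:    # special case: direct 16-bit address
        operand = "[disp16]"
    else:                                # memory, with optional displacement
        disp = ["", "+disp8", "+disp16"][mod]
        operand = f"[{RM_BASE[rm]}{disp}]"
    return regs[reg], operand


print(decode_modrm(0xD8))              # ('BX', 'AX')
print(decode_modrm(0x07))              # ('AX', '[BX]')
print(decode_modrm(0x46))              # ('AX', '[BP+disp8]')
print(decode_modrm(0x06))              # ('AX', '[disp16]')
print(decode_modrm(0xD8, wide=False))  # ('BL', 'AL')
```

Which of the two operands is the destination is decided by the opcode, not by the ModR/M byte itself.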

Although most operations execute on 16‑bit operands, the chip allows manipulation of 8‑bit data as well—an important feature for compatibility with legacy 8‑bit software.

See: Complete 8086 instruction set; Tracing the roots of the 8086 instruction set to the Datapoint 2200 minicomputer

8086 Instruction Set Summary

Flag key: * = set according to the result, U = undefined, 0 = cleared, 1 = set, - = not affected.

Mnemonic | Description | Operation | OF | SF | ZF | AF | PF | CF
AAA | ASCII Adjust After Addition | Adjust AL after BCD addition | U | U | U | * | U | *
AAD | ASCII Adjust Before Division | Adjust AX before BCD division | U | * | * | U | * | U
AAM | ASCII Adjust After Multiply | Adjust AX after BCD multiplication | U | * | * | U | * | U
AAS | ASCII Adjust After Subtraction | Adjust AL after BCD subtraction | U | U | U | * | U | *
ADC | Add with Carry | Destination + Source + CF → Destination | * | * | * | * | * | *
ADD | Add | Destination + Source → Destination | * | * | * | * | * | *
AND | Logical AND | Destination ∧ Source → Destination | 0 | * | * | U | * | 0
CALL | Call Procedure | Push IP (and CS); Target → IP (and CS) | - | - | - | - | - | -
CBW | Convert Byte to Word | Sign extend AL into AH | - | - | - | - | - | -
CLC | Clear Carry Flag | 0 → CF | - | - | - | - | - | 0
CLD | Clear Direction Flag | 0 → DF | - | - | - | - | - | -
CLI | Clear Interrupt Flag | 0 → IF | - | - | - | - | - | -
CMC | Complement Carry Flag | ¬CF → CF | - | - | - | - | - | *
CMP | Compare | Destination - Source (Flags set, result discarded) | * | * | * | * | * | *
CMPSB | Compare String Byte | Compare byte [DS:SI] with [ES:DI]; Update SI, DI | * | * | * | * | * | *
CMPSW | Compare String Word | Compare word [DS:SI] with [ES:DI]; Update SI, DI | * | * | * | * | * | *
CWD | Convert Word to Double Word | Sign extend AX into DX:AX | - | - | - | - | - | -
DAA | Decimal Adjust After Addition | Adjust AL after packed BCD addition | U | * | * | * | * | *
DAS | Decimal Adjust After Subtraction | Adjust AL after packed BCD subtraction | U | * | * | * | * | *
DEC | Decrement by 1 | Destination - 1 → Destination | * | * | * | * | * | -
DIV | Unsigned Divide | AX / Src(Byte) → AL (Q), AH (R); DX:AX / Src(Word) → AX (Q), DX (R) | U | U | U | U | U | U
ESC | Escape (to coprocessor) | Used for floating-point/coprocessor instructions | - | - | - | - | - | -
HLT | Halt | Halt processor until interrupt or reset | - | - | - | - | - | -
IDIV | Signed Divide | AX / Src(Byte) → AL (Q), AH (R); DX:AX / Src(Word) → AX (Q), DX (R) | U | U | U | U | U | U
IMUL | Signed Multiply | AL * Src(Byte) → AX; AX * Src(Word) → DX:AX | * | U | U | U | U | *
IN | Input from Port | Port → AL or AX | - | - | - | - | - | -
INC | Increment by 1 | Destination + 1 → Destination | * | * | * | * | * | -
INT | Interrupt | Push Flags, CS, IP; Vector → CS:IP (clears TF, IF) | - | - | - | - | - | -
INTO | Interrupt on Overflow | If OF=1 then INT 4 (clears TF, IF if the interrupt is taken) | - | - | - | - | - | -
IRET | Interrupt Return | Pop IP, CS, Flags | * | * | * | * | * | *
Jcc | Conditional Jump (e.g., JE, JNE, JG...) | If condition is met then IP + disp → IP | - | - | - | - | - | -
JMP | Unconditional Jump | Target → IP (and possibly CS) | - | - | - | - | - | -
LAHF | Load AH from Flags | Low byte of Flags → AH | - | - | - | - | - | -
LDS | Load Pointer using DS | mem → reg; mem+2 → DS | - | - | - | - | - | -
LEA | Load Effective Address | Effective Address of Source → Destination Register | - | - | - | - | - | -
LES | Load Pointer using ES | mem → reg; mem+2 → ES | - | - | - | - | - | -
LOCK | Lock Bus Prefix | Assert LOCK# signal during next instruction | - | - | - | - | - | -
LODSB | Load String Byte | [DS:SI] → AL; Update SI | - | - | - | - | - | -
LODSW | Load String Word | [DS:SI] → AX; Update SI | - | - | - | - | - | -
LOOP | Loop | CX - 1 → CX; If CX ≠ 0 then Jump | - | - | - | - | - | -
LOOPE / LOOPZ | Loop while Equal / Zero | CX - 1 → CX; If CX ≠ 0 and ZF=1 then Jump | - | - | - | - | - | -
LOOPNE / LOOPNZ | Loop while Not Equal / Not Zero | CX - 1 → CX; If CX ≠ 0 and ZF=0 then Jump | - | - | - | - | - | -
MOV | Move | Source → Destination | - | - | - | - | - | -
MOVSB | Move String Byte | Move byte [DS:SI] to [ES:DI]; Update SI, DI | - | - | - | - | - | -
MOVSW | Move String Word | Move word [DS:SI] to [ES:DI]; Update SI, DI | - | - | - | - | - | -
MUL | Unsigned Multiply | AL * Src(Byte) → AX; AX * Src(Word) → DX:AX | * | U | U | U | U | *
NEG | Negate (Two's Complement) | 0 - Destination → Destination | * | * | * | * | * | *
NOP | No Operation | No operation | - | - | - | - | - | -
NOT | Logical NOT (One's Complement) | ¬Destination → Destination | - | - | - | - | - | -
OR | Logical OR | Destination ∨ Source → Destination | 0 | * | * | U | * | 0
OUT | Output to Port | AL or AX → Port | - | - | - | - | - | -
POP | Pop Word from Stack | [SS:SP] → Destination; SP + 2 → SP | - | - | - | - | - | -
POPF | Pop Flags from Stack | [SS:SP] → Flags; SP + 2 → SP | * | * | * | * | * | *
PUSH | Push Word onto Stack | SP - 2 → SP; Source → [SS:SP] | - | - | - | - | - | -
PUSHF | Push Flags onto Stack | SP - 2 → SP; Flags → [SS:SP] | - | - | - | - | - | -
RCL | Rotate Left through Carry | Rotate Destination left, CF fills LSB, MSB fills CF | * | - | - | - | - | *
RCR | Rotate Right through Carry | Rotate Destination right, CF fills MSB, LSB fills CF | * | - | - | - | - | *
REP | String Repeat Prefix | Repeat following string op while CX ≠ 0 | - | - | - | - | - | -
REPE / REPZ | Repeat While Equal / Zero Prefix | Repeat following string op while CX ≠ 0 and ZF=1 | - | - | - | - | - | -
REPNE / REPNZ | Repeat While Not Equal / Not Zero Prefix | Repeat following string op while CX ≠ 0 and ZF=0 | - | - | - | - | - | -
RET | Return from Procedure | Pop IP (and CS) from stack | - | - | - | - | - | -
ROL | Rotate Left | Rotate Destination left, MSB fills LSB and CF | * | - | - | - | - | *
ROR | Rotate Right | Rotate Destination right, LSB fills MSB and CF | * | - | - | - | - | *
SAHF | Store AH into Flags | AH → Low byte of Flags | - | * | * | * | * | *
SAL / SHL | Shift Arithmetic/Logical Left | Shift Destination left, 0 fills LSB, MSB fills CF | * | * | * | U | * | *
SAR | Shift Arithmetic Right | Shift Destination right, MSB preserved, LSB fills CF | * | * | * | U | * | *
SBB | Subtract with Borrow | Destination - Source - CF → Destination | * | * | * | * | * | *
SCASB | Scan String Byte | Compare AL with byte [ES:DI]; Update DI | * | * | * | * | * | *
SCASW | Scan String Word | Compare AX with word [ES:DI]; Update DI | * | * | * | * | * | *
SHR | Shift Logical Right | Shift Destination right, 0 fills MSB, LSB fills CF | * | 0 | * | U | * | *
STC | Set Carry Flag | 1 → CF | - | - | - | - | - | 1
STD | Set Direction Flag | 1 → DF | - | - | - | - | - | -
STI | Set Interrupt Flag | 1 → IF | - | - | - | - | - | -
STOSB | Store String Byte | AL → [ES:DI]; Update DI | - | - | - | - | - | -
STOSW | Store String Word | AX → [ES:DI]; Update DI | - | - | - | - | - | -
SUB | Subtract | Destination - Source → Destination | * | * | * | * | * | *
TEST | Logical Compare (AND) | Destination ∧ Source (Flags set, result discarded) | 0 | * | * | U | * | 0
WAIT | Wait | Wait for TEST# pin active (for coprocessor sync) | - | - | - | - | - | -
XCHG | Exchange | Source ↔ Destination | - | - | - | - | - | -
XLAT / XLATB | Translate Byte | [DS:BX + AL] → AL | - | - | - | - | - | -
XOR | Logical Exclusive OR | Destination ⊕ Source → Destination | 0 | * | * | U | * | 0

Note: Some instructions like LOOPE and LOOPZ are mnemonics for the same opcode. They are provided to match different programming contexts: LOOPE when thinking in terms of equality (e.g., a comparison was equal), LOOPZ when thinking in terms of zero (e.g., result was zero).
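To make the flag columns of the summary concrete, here is a minimal model of how an 8-bit ADD (or ADC, with a carry-in) derives the six status flags from its operands. The helper name `add8` is hypothetical, and the sketch covers only byte operands.

```python
def add8(dest: int, src: int, carry_in: int = 0) -> tuple:
    """Model an 8-bit ADD/ADC; return (result, flags dict)."""
    dest &= 0xFF
    src &= 0xFF
    total = dest + src + carry_in
    result = total & 0xFF
    flags = {
        "CF": int(total > 0xFF),                              # carry out of bit 7
        "PF": int(bin(result).count("1") % 2 == 0),           # even number of 1 bits
        "AF": int((dest & 0x0F) + (src & 0x0F) + carry_in > 0x0F),  # carry out of bit 3
        "ZF": int(result == 0),
        "SF": result >> 7,                                    # copy of the sign bit
        # Signed overflow: both operands differ in sign from the result.
        "OF": int(((dest ^ result) & (src ^ result) & 0x80) != 0),
    }
    return result, flags


result, flags = add8(0x7F, 0x01)
print(hex(result), flags)
# 0x80 {'CF': 0, 'PF': 0, 'AF': 1, 'ZF': 0, 'SF': 1, 'OF': 1}

result2, flags2 = add8(0xFF, 0x01)
print(hex(result2), flags2)
# 0x0 {'CF': 1, 'PF': 1, 'AF': 1, 'ZF': 1, 'SF': 0, 'OF': 0}
```

The two calls show the difference between signed overflow (OF, 127 + 1) and unsigned carry (CF, 255 + 1).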

Secret Instruction

The secret instruction is SALC (Set AL register to Carry), opcode 0xD6. It sets AL to 0xFF if the carry flag is set and to 0x00 otherwise. Intel reportedly put this in its x86 processors but didn't document it, using it as a trap: if a manufacturer cloned an Intel processor, the presence of the SALC instruction would indicate that the clone had copied Intel's microcode.

Intel sued NEC for making 8086 clones, claiming that NEC had copied Intel's microcode. NEC claimed they wrote their own microcode. NEC's chip didn't have the secret SALC instruction, and Intel lost the case.
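For reference, the effect of SALC is small enough to model in a few lines: AL becomes 0xFF when the carry flag is set and 0x00 when it is clear, while AH and the flags are left alone. The function name `salc` and the choice to pass AX and CF as plain integers are illustrative.

```python
def salc(ax: int, cf: int) -> int:
    """Return AX after SALC: AL = 0xFF if CF is set, else 0x00; AH unchanged."""
    al = 0xFF if cf else 0x00
    return (ax & 0xFF00) | al


print(hex(salc(0x1234, 1)))  # 0x12ff
print(hex(salc(0x1234, 0)))  # 0x1200
```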

See: Undocumented 8086 instructions, explained by the microcode


8087 Floating Point Unit

Intel introduced the 8087 chip in 1980 to improve floating-point performance on 8086/8088 computers.

Since early microprocessors were designed to operate on integers, arithmetic on floating-point numbers was slow, and transcendental operations such as trigonometric functions or logarithms were even slower. The 8087 co-processor made floating-point operations up to 100 times faster.

The benefits of floating-point hardware were so great that Intel began integrating the floating-point unit into the processor itself, starting with the 80486DX in 1989.

See: Inside the die; High-density ROM; Extracting ROM constants; Fast bit shifter; 8087 FPU reverse engineered


Links