ILP and Register Renaming

Prerequisites: 02-Pipelining-and-Hazards

Learning Goals: Understand ILP as an upper bound on processor parallelism, how register renaming removes false dependencies, and how ILP relates to actual IPC.


Instruction-Level Parallelism (ILP)

ILP = the number of instructions per cycle a program achieves on an ideal processor, one with unlimited resources that is constrained only by the program's true dependencies.

$$\text{ILP} = \frac{\text{\# Instructions}}{\text{\# Cycles required (ideal)}}$$

ILP ≥ IPC always: ILP is an upper bound, and the constraints of a real processor can only keep IPC at or below it.


The Execute Stage and Forwarding


Types of Dependencies Recap

| Type | True/False | Impact on CPI |
|------|------------|---------------|
| RAW | True | Directly limits CPI |
| WAW | False (Name) | Can cause OOO issues |
| WAR | False (Name) | Can cause stalls |

Removing False Dependencies

Why False Dependencies Exist

Two instructions use the same register to hold different, unrelated values. The processor sees a “conflict” even though there is no actual data flow between them.

Method: Register Renaming

Architectural registers: registers visible to the programmer (e.g., R0–R31)

Physical registers: all storage locations available in hardware (more than architectural)

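The processor maps architectural registers to physical registers dynamically, giving every register write a fresh physical register. Because no two writes share a physical destination, and a later write can never overwrite a value an earlier instruction still needs to read, WAW and WAR conflicts disappear.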


Register Allocation Table (RAT)

The RAT tracks which physical register currently holds the value for each architectural register.

How RAT Works
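The lookup-then-allocate behavior of the RAT can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the function name `rename`, the tuple encoding `(dest, src1, src2)`, the register counts, and the initial identity mapping. The sketch never returns physical registers to the free list, which real hardware must do.

```python
from collections import deque

def rename(program, num_arch=4, num_phys=16):
    """Rename (dest, src1, src2) tuples of architectural register indices.

    Hypothetical sketch: physical registers are never freed, so a long
    program would exhaust the free list (popleft raises IndexError).
    """
    # Assumption: architectural register i starts out in physical register i.
    rat = list(range(num_arch))
    free = deque(range(num_arch, num_phys))
    renamed = []
    for dest, src1, src2 in program:
        # Read sources first, so they see the mapping from before this write.
        p1, p2 = rat[src1], rat[src2]
        # Every write gets a fresh physical register, then the RAT is updated.
        pd = free.popleft()
        rat[dest] = pd
        renamed.append((pd, p1, p2))
    return renamed, rat

program = [
    (1, 2, 3),  # R1 = R2 op R3
    (2, 1, 3),  # R2 = R1 op R3  (RAW on R1, WAR on R2)
    (1, 0, 3),  # R1 = R0 op R3  (WAW with the first, WAR with the second)
]
renamed, rat = rename(program)
for pd, p1, p2 in renamed:
    print(f"P{pd} = P{p1} op P{p2}")
```

After renaming, the three writes target three different physical registers, so only the RAW dependence between the first two instructions remains.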

Effect of Renaming

Renaming lowers CPI (and so raises IPC) by removing the artificial stalls caused by false dependencies. It does nothing for true (RAW) dependencies, which still limit performance.


Steps to Compute ILP

  1. Rename registers as they are written, tracking which physical registers are free
  2. “Execute” the program assuming infinite resources and no false dependencies
  3. Determine the earliest cycle each instruction can execute given only true (RAW) dependencies
  4. ILP = Instructions / Cycles
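The steps above can be sketched in a few lines of Python. This is an illustrative model, not a reference implementation: the function name `compute_ilp`, the `(dest, sources)` encoding, and the assumption that every instruction takes one cycle and its result is usable in the following cycle are all introduced here. Tracking only the most recent writer of each register, in program order, has the same effect as renaming, since WAW and WAR conflicts are never consulted.

```python
def compute_ilp(program):
    """program: list of (dest, sources) with architectural register names.

    Assumes unit latency and unlimited resources. Returns (ilp, cycles),
    where cycles[i] is the earliest cycle instruction i can execute.
    """
    if not program:
        return 0.0, []
    producer_cycle = {}  # register -> cycle of its most recent writer
    cycles = []
    for dest, sources in program:
        # Only RAW matters: wait one cycle past the latest source producer.
        # Registers with no earlier writer in the program count as ready.
        cycle = 1 + max((producer_cycle.get(s, 0) for s in sources), default=0)
        cycles.append(cycle)
        producer_cycle[dest] = cycle
    return len(program) / max(cycles), cycles

program = [
    ("R1", ["R2", "R3"]),  # I1
    ("R4", ["R1", "R5"]),  # I2: RAW on R1 from I1
    ("R1", ["R6", "R7"]),  # I3: WAW with I1 and WAR with I2, both false
    ("R8", ["R1", "R4"]),  # I4: RAW on R1 from I3 and on R4 from I2
]
ilp, cycles = compute_ilp(program)
print(cycles, ilp)
```

In this example I3 can execute in cycle 1 alongside I1 because its conflicts are false dependencies only. The four instructions need three cycles, giving ILP = 4/3.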

ILP with Control Dependencies


ILP vs. IPC

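| Metric | Processor Model | Branch Prediction | Issue Width |
|--------|-----------------|-------------------|-------------|
| ILP | Ideal, infinite | Perfect | Infinite |
| IPC | Real, limited | Real predictor | Finite N |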

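ILP is always ≥ IPC. The gap between them comes from the limits of a real processor: finite issue width, imperfect branch prediction, and limited hardware resources.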

In-Order vs. Out-of-Order

| Processor Type | Limiting Factor |
|----------------|-----------------|
| In-order, narrow issue | Issue width (narrow issue is more limiting than in-order) |
| In-order, wide issue | In-order constraint (more limiting than width) |

If a processor can issue 4+ instructions per cycle (wide-issue), it should also be out-of-order to fully exploit the width. It also needs register renaming to eliminate false dependencies.
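A toy issue simulator shows why width and ordering interact this way. It is a sketch under stated assumptions: the name `simulate_ipc`, the `deps` encoding (each instruction lists the earlier instructions it truly depends on), unit latency, and already-renamed registers are all hypothetical choices made for illustration, not a model of any real machine.

```python
def simulate_ipc(deps, width, in_order):
    """deps[i]: indices of earlier instructions that instruction i reads from (RAW).

    Assumes unit latency, false dependencies already removed by renaming,
    and width >= 1. Returns instructions per cycle.
    """
    n = len(deps)
    if n == 0:
        return 0.0
    issue_cycle = [None] * n
    issued, cycle = 0, 0
    while issued < n:
        cycle += 1
        slots = width
        for i in range(n):
            if slots == 0:
                break
            if issue_cycle[i] is not None:
                continue
            # Ready only if every producer issued in an earlier cycle.
            ready = all(issue_cycle[d] is not None and issue_cycle[d] < cycle
                        for d in deps[i])
            if ready:
                issue_cycle[i] = cycle
                issued += 1
                slots -= 1
            elif in_order:
                break  # a stalled instruction blocks everything younger
    return n / cycle

# Two independent three-instruction chains: I0 -> I1 -> I2 and I3 -> I4 -> I5.
deps = [[], [0], [1], [], [3], [4]]
for width in (1, 4):
    for in_order in (True, False):
        kind = "in-order" if in_order else "out-of-order"
        print(width, kind, simulate_ipc(deps, width, in_order))
```

On this example, a width of 1 yields the same IPC in-order and out-of-order, so the width is the limit. At width 4 the in-order machine barely improves, while the out-of-order machine interleaves the two chains and reaches the program's ILP of 2.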


Summary

Key Takeaways:

Common Patterns:

See Also: 02-Pipelining-and-Hazards, 05-Tomasulo-and-OOO-Execution

Next: 05-Tomasulo-and-OOO-Execution