source

02 Network Layer And Routing

Network Layer and Routing

Prerequisites: 01-Fundamentals-and-Architecture Learning Goals: After reading this, you will understand IP addressing and routing, distinguish intradomain vs interdomain routing, explain BGP policy routing, and understand router internals and optimization.

Introduction

The Network Layer is responsible for moving datagrams from source host to destination host across the Internet. This involves two critical functions:

  1. Forwarding: Moving packets from input to output within a router (data plane)
  2. Routing: Determining the path packets take through the network (control plane)

This file covers routing algorithms, protocols (OSPF, RIP, BGP), the business side of the Internet, and router architecture.


Routing vs. Forwarding

Key Distinction

Forwarding (Data Plane):

Routing (Control Plane):

Analogy:


Intradomain Routing

Definition: Routing within a single administrative domain (e.g., within one ISP, one university network, one company)

Also called: Interior Gateway Protocols (IGPs)

Goal: Find the shortest/fastest path within the domain

Two Main Algorithm Classes:

  1. Link State (Global knowledge)
  2. Distance Vector (Local knowledge)

Concept

How It Works:

Process:

  1. Discover neighbors: Each router identifies its directly connected neighbors
  2. Measure link costs: Determine the cost to each neighbor (delay, bandwidth, etc.)
  3. Flood topology info: Broadcast Link State Advertisements (LSAs) to all routers
  4. Build topology map: Each router constructs a complete graph of the network
  5. Compute paths: Run Dijkstra’s algorithm to find shortest paths to all destinations

Dijkstra’s Algorithm

Input: Graph with nodes (routers) and weighted edges (link costs)

Output: Shortest path tree from source to all other nodes

Algorithm:

Initialize:
  - Set distance to source = 0
  - Set distance to all others = ∞
  - Mark all nodes as unvisited

Repeat until all nodes visited:
  1. Select unvisited node u with minimum distance
  2. Mark u as visited
  3. For each unvisited neighbor v of u:
       if distance[u] + cost(u,v) < distance[v]:
          distance[v] = distance[u] + cost(u,v)
          predecessor[v] = u

Complexity: O(n²) where n = number of nodes

Example:

Network:
    A ---2--- B
    |         |
    1         3
    |         |
    C ---1--- D

From A:
  Step 1: Visit A (dist=0)
          Update: C=1, B=2
  Step 2: Visit C (dist=1)
          Update: D=2
  Step 3: Visit B (dist=2) or D (dist=2)
          ...

Result: Shortest paths from A:
  A→C: 1
  A→B: 2
  A→D: 2
  A→B→D: 5 (not chosen, longer than A→C→D)

OSPF (Open Shortest Path First)

Type: Link State protocol

Key Features:

1. Hierarchical Structure:

2. Link State Advertisements (LSAs):

3. Cost Metric:

4. Fast Convergence:

Advantages:

Disadvantages:


Distance Vector Routing

Concept

How It Works:

Process:

  1. Each router maintains a distance vector: Distances to all destinations
  2. Periodically, routers exchange distance vectors with neighbors
  3. Each router updates its table using the Bellman-Ford equation
  4. Process repeats until convergence (no more changes)

Bellman-Ford Equation

Formula:

D_x(y) = min over all neighbors v { cost(x,v) + D_v(y) }

Meaning:

Example:

Router A wants to reach Router D

A has neighbors: B (cost 2), C (cost 1)
B says: D is 3 hops away
C says: D is 1 hop away

D_A(D) = min {
  cost(A,B) + D_B(D) = 2 + 3 = 5,
  cost(A,C) + D_C(D) = 1 + 1 = 2  ← Choose this
}

Result: A routes to D via C, total distance = 2

RIP (Routing Information Protocol)

Type: Distance Vector protocol

Key Features:

1. Metric:

2. Updates:

3. Triggered Updates:

Advantages:

Disadvantages:


Count-to-Infinity Problem

Problem: When a link cost increases or a link fails, Distance Vector protocols can loop indefinitely while incrementing distance.

Example:

Initial state:
  A → B (cost 1)
  B → C (cost 1)
  A knows: C is 2 hops away (via B)
  B knows: C is 1 hop away (direct)

Link B-C fails:
  B no longer has direct route to C
  B receives update from A: "C is 2 hops away"
  B thinks: I can reach C via A (cost = 1 + 2 = 3)
  B updates: C is 3 hops away

  A receives update from B: "C is 3 hops away"
  A thinks: I can reach C via B (cost = 1 + 3 = 4)
  A updates: C is 4 hops away

  This continues: 3 → 4 → 5 → 6 → ... → ∞

Why It Happens:

Mitigation: Poison Reverse:

Example with Poison Reverse:

A routes to C via B
  A tells B: "My distance to C = ∞" (poison)

When B-C link fails:
  B receives from A: "C = ∞"
  B cannot use A as path to C
  B immediately marks C as unreachable

No count-to-infinity!

Limitation: Poison Reverse only prevents 2-node loops, not loops involving 3+ routers.


AspectLink State (OSPF)Distance Vector (RIP)
KnowledgeComplete topologyOnly neighbors
AlgorithmDijkstraBellman-Ford
UpdatesTriggered by changesPeriodic (30s)
ConvergenceFast (seconds)Slow (minutes)
LoopsNo loopsPossible (count-to-infinity)
MemoryHigh (full topology)Low (distance table)
ComplexityO(n²) or O(n log n)O(n × neighbors)
ScalabilityHierarchical (areas)Limited (max 15 hops)

Interdomain Routing

The Internet Ecosystem

Hierarchy:

Tier-1 ISPs (Global Backbone):

Tier-2 ISPs (Regional):

Tier-3 ISPs (Access):

Other Key Players:


Autonomous Systems (AS)

Definition: A group of routers under the same administrative authority with a unified routing policy

AS Number (ASN):

Routing Domains:


BGP (Border Gateway Protocol)

Purpose: The standard protocol for exchanging routing information between Autonomous Systems

Type: Path Vector protocol (variant of Distance Vector)

Key Features:

Two Flavors:

1. eBGP (External BGP):

2. iBGP (Internal BGP):

BGP Message:

BGP Advertisement = {
  Prefix: 192.168.1.0/24
  AS Path: [AS100, AS200, AS300]
  Next Hop: 10.0.0.1
  Attributes: LocalPref, MED, ...
}

Business Relationships

1. Customer-Provider (Transit):

Example:

Tier-3 ISP (customer) ← pays → Tier-1 ISP (provider)

2. Peering:

Example:

Tier-1 ISP A ←→ Tier-1 ISP B (peers)

Why Peer?


BGP Route Selection

Problem: Router receives multiple BGP advertisements for the same prefix. Which to choose?

BGP Decision Process (in order):

  1. Highest LocalPref (Local Preference)

    • Operator-defined preference
    • Higher value = more preferred
    • Used to control outbound traffic (which exit point to use)
  2. Shortest AS Path

    • Fewer ASes traversed = shorter path
    • Can be manipulated via AS path prepending
  3. Lowest Origin Type

    • IGP < EGP < Incomplete
  4. Lowest MED (Multi-Exit Discriminator)

    • Used to control inbound traffic (preferred entrance point)
    • Lower value = more preferred
    • Only compared between routes from the same neighboring AS
  5. eBGP over iBGP

    • Prefer routes learned externally
  6. Lowest IGP Cost to Next Hop

    • Hot Potato Routing: Get rid of traffic ASAP
    • Choose exit point closest to current router
  7. Lowest Router ID

    • Tiebreaker

Key Insight: BGP is policy-driven, not distance-driven. Business relationships dictate routing decisions.


BGP Policies and Traffic Control

Import Preference (Which routes to accept and prefer):

Rule: Customer > Peer > Provider

Reasoning:

Example:

AS 100 receives route to 192.168.1.0/24 from:
  - Customer AS 200: LocalPref = 200
  - Peer AS 300: LocalPref = 100
  - Provider AS 400: LocalPref = 50

AS 100 chooses Customer route (highest LocalPref)

Export Rules (Which routes to advertise to whom):

Golden Rule: Do NOT provide free transit

Learn fromExport to Customer?Export to Peer?Export to Provider?
Customer✅ Yes✅ Yes✅ Yes
Peer✅ Yes❌ No❌ No
Provider✅ Yes❌ No❌ No

Reasoning:

Example:

AS 100 learns route from Peer AS 300
  → AS 100 does NOT advertise to Peer AS 400 or Provider AS 500
  → Otherwise AS 100 would carry transit traffic between AS 300 and AS 400/500 for free

Hot Potato Routing

Definition: Choose the closest exit point from your network to hand off traffic, regardless of the total path length.

Goal: Minimize cost by reducing distance traffic travels within your network

How It Works:

Example:

AS 100 has two routers that can reach AS 200:
  - Router A: IGP cost from source = 5
  - Router B: IGP cost from source = 10

AS 100 chooses Router A (hot potato: get rid of traffic ASAP)

Even if Router B → AS 200 is faster externally, Router A is chosen!

Trade-off: Optimizes your costs, but may not optimize end-to-end performance.


IXPs (Internet Exchange Points)

Definition: Physical locations where multiple networks interconnect to exchange traffic

Examples: DE-CIX (Frankfurt), AMS-IX (Amsterdam), Equinix (global)

Benefits:

How They Work:

Modern Trend: Topological Flattening:


Router Architecture

Control Plane vs Forwarding Plane

Control Plane (The Brain):

Forwarding Plane (The Muscle):

Key Separation: Allows innovation in control plane (software-defined networking) without changing hardware.


Router Components

1. Input Ports:

2. Switching Fabric:

3. Output Ports:

4. Routing Processor:


The Lookup Problem: Longest Prefix Match (LPM)

Challenge: IP forwarding uses CIDR (Classless Inter-Domain Routing), not exact matches.

Example Forwarding Table:

Prefix               Next Hop
192.168.1.0/24       Port 1
192.168.0.0/16       Port 2
0.0.0.0/0 (default)  Port 3

Packet arrives: Destination = 192.168.1.50

Matches:

Rule: Choose the longest prefix match → 192.168.1.0/24 → Port 1

Why Not Caching?

Need: Fast algorithm to handle millions of lookups per second


Trie-Based Lookup Algorithms

Goal: Efficiently find the longest matching prefix

1. Unibit Trie:

Structure:

Example:

Prefixes:
  1*    (all addresses starting with 1)
  11*   (all starting with 11)
  101*  (all starting with 101)

Trie:
        Root
       /    \
     0*      1*   ← Match "1*"
            /  \
          10*  11* ← Match "11*"
          /
        101*       ← Match "101*"

Lookup for IP 10110…:

Advantages:

Disadvantages:


2. Multibit Trie:

Idea: Instead of checking 1 bit at a time, check k bits at a time (stride = k)

Example with stride = 2:

Check 2 bits per step:
  00, 01, 10, 11 (4 branches per node)

For 32-bit address:
  32 / 2 = 16 steps (instead of 32)

Trade-off: More memory for fewer lookups

Problem: Prefixes might not align with stride boundaries

Solution: Controlled Prefix Expansion:

Example:

Original Prefixes:
  1*   → Port A
  11*  → Port B

Stride = 2 expansion:
  10*  → Port A  (expansion of 1*)
  11*  → Port B  (already length 2)

Memory Cost:

Typical Design:


Traffic Engineering

Definition: Optimizing network resource utilization by controlling how traffic flows through the network

Framework:

1. Measure:

2. Model:

3. Control:

Techniques:


Summary

Key Takeaways

  1. Routing vs Forwarding: Routing computes paths (control plane), forwarding moves packets (data plane)

  2. Intradomain Routing:

    • Link State (OSPF): Global knowledge, Dijkstra, fast convergence, hierarchical
    • Distance Vector (RIP): Local knowledge, Bellman-Ford, slow convergence, count-to-infinity problem
  3. Interdomain Routing (BGP):

    • Policy-driven, not distance-driven
    • Business relationships dictate routing (Customer > Peer > Provider)
    • Export rules prevent free transit
    • Hot potato routing minimizes costs
  4. Router Architecture:

    • Control plane (software) vs Forwarding plane (hardware)
    • Longest Prefix Match for IP lookup
    • Multibit tries trade memory for speed
  5. Internet Ecosystem:

    • Hierarchical (Tier-1, Tier-2, Tier-3)
    • IXPs enable local interconnection
    • Topological flattening due to CDNs

Common Patterns

Protocol Selection:

Scalability:

Trade-offs:


See Also

Next: 03-Transport-Layer