Lecture 1: System Overview
Series: X Recommendation Algorithm Deep Dive
Audience: MLE Intern
Prerequisites: Basic ML knowledge, Python
What Does the “For You” Feed Do?
When you open X (formerly Twitter) and tap the “For You” tab, the system needs to answer one question:
“Out of hundreds of millions of posts, which ~50 should this user see right now?”
This happens in under 2 seconds, for hundreds of millions of users, constantly. The algorithm we’re studying is the engine behind this.
The 4 Components
The codebase has 4 major components. Think of them as a restaurant:
x-algorithm/
+-- candidate-pipeline/ # The recipe framework (reusable patterns)
+-- home-mixer/ # The head chef (orchestrates everything)
+-- thunder/ # The pantry (stores fresh ingredients)
+-- phoenix/ # The secret sauce (ML models) <-- YOUR FOCUS
| Component | Language | What It Does | Restaurant Analogy |
|---|---|---|---|
| candidate-pipeline | Rust | Reusable pipeline framework with traits (Source, Filter, Scorer, etc.) | The recipe book |
| home-mixer | Rust | Orchestration layer: assembles the feed using the pipeline framework | The head chef |
| thunder | Rust | In-memory post store for real-time posts from followed accounts | The fresh ingredients pantry |
| phoenix | Python/JAX | ML models for retrieval and ranking | The secret sauce |
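To make the Source/Filter/Scorer vocabulary concrete before we meet it again in Lecture 6, here is a minimal Python sketch of the pipeline pattern. The real framework is Rust traits; every name and signature below is an illustrative assumption, not the actual API:

# Illustrative Python analogue of candidate-pipeline's trait-based design.
# Assumption: the real Rust traits roughly correspond to these Protocols.
from typing import Protocol

class Source(Protocol):
    def fetch(self, query: dict) -> list[dict]: ...        # produce candidate posts

class Filter(Protocol):
    def keep(self, query: dict, post: dict) -> bool: ...   # True = candidate survives

class Scorer(Protocol):
    def score(self, query: dict, post: dict) -> float: ... # contribute to the score

def run_pipeline(query: dict, sources: list[Source], filters: list[Filter],
                 scorers: list[Scorer], k: int = 50) -> list[dict]:
    candidates = [post for src in sources for post in src.fetch(query)]
    for f in filters:
        candidates = [post for post in candidates if f.keep(query, post)]
    for post in candidates:
        post["score"] = sum(s.score(query, post) for s in scorers)
    return sorted(candidates, key=lambda p: p["score"], reverse=True)[:k]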
What You (MLE Intern) Care About
candidate-pipeline/ --> Skip (Rust infrastructure)
home-mixer/ --> Understand conceptually (Lecture 6)
thunder/ --> Understand conceptually (Lecture 6)
phoenix/ --> READ EVERY LINE (Lectures 2-5)
The End-to-End Flow
Here’s what happens when you open the “For You” tab:
User opens "For You"
|
v
+------ HOME MIXER (gRPC Server) ------+
| |
| 1. QUERY HYDRATION |
| +-- Fetch user engagement history |
| +-- Fetch following list |
| |
| 2. CANDIDATE SOURCING (parallel) |
| +-- Thunder: recent posts from |
| | accounts you follow |
| +-- Phoenix Retrieval: posts from |
| accounts you DON'T follow |
| (ML two-tower model) |
| |
| 3. CANDIDATE HYDRATION (parallel) |
| +-- Fetch post text, media |
| +-- Fetch author info |
| +-- Fetch video duration |
| +-- Check subscription status |
| |
| 4. PRE-SCORING FILTERS (sequential) |
| +-- Remove duplicates |
| +-- Remove old posts |
| +-- Remove your own posts |
| +-- Remove blocked/muted authors |
| +-- Remove already-seen posts |
| +-- Remove muted keywords |
| +-- ...6 more filters |
| |
| 5. SCORING (sequential) |
| +-- Phoenix Scorer: ML predicts |
| | 19 engagement probabilities |
| +-- Weighted Scorer: combine |
| | 19 probs into 1 score |
| +-- Author Diversity: penalize |
| | repeated authors |
| +-- OON Scorer: boost in-network |
| |
| 6. SELECTION |
| +-- Sort by score, pick top K |
| |
| 7. POST-SELECTION FILTERS |
| +-- Visibility filtering |
| +-- Conversation dedup |
| |
| 8. SIDE EFFECTS (async) |
| +-- Cache results |
| |
+---------------------------------------+
|
v
Ranked Feed Response
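Stitched together in code, the eight steps above are just a staged function. Here is a hedged Python sketch of the orchestration; the real home-mixer is Rust, and every helper name below is invented for illustration:

# Pseudocode sketch of the home-mixer flow. All helpers (hydrate_query,
# thunder_recent_posts, phoenix_retrieve, ...) are hypothetical names.
def build_for_you_feed(user_id: str, k: int = 50):
    # 1. Query hydration: engagement history, following list
    query = hydrate_query(user_id)

    # 2. Candidate sourcing (parallel in production)
    candidates = thunder_recent_posts(query) + phoenix_retrieve(query)

    # 3. Candidate hydration: text, media, author info, video duration, ...
    candidates = hydrate_candidates(candidates)

    # 4. Pre-scoring filters: dedup, age, own posts, blocks/mutes, seen, keywords, ...
    for keep in PRE_SCORING_FILTERS:
        candidates = [c for c in candidates if keep(query, c)]

    # 5. Scoring: ML probabilities -> weighted combination -> diversity/OON adjustments
    for scorer in (phoenix_score, weighted_score, author_diversity, oon_score):
        candidates = scorer(query, candidates)

    # 6. Selection: sort by final score, keep top K
    selected = sorted(candidates, key=lambda c: c["score"], reverse=True)[:k]

    # 7. Post-selection filters: visibility, conversation dedup
    selected = post_selection_filters(query, selected)

    # 8. Side effects (async in production): cache the result
    cache_results(user_id, selected)
    return selected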
The ML Pipeline (Your Focus)
Within Phoenix, there are two ML stages:
Stage 1: Retrieval (Two-Tower Model)
Problem: Hundreds of millions of posts exist; we can't score them all.
Solution: Use a cheap model to narrow the pool down to ~1000 candidates.
User Tower (Transformer) Candidate Tower (MLP)
| |
v v
[user embedding] [post embeddings] (pre-computed)
| |
+---> dot product similarity <----+
|
v
Top 1000 posts
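The point of the two-tower split is that the candidate tower's outputs are pre-computed for every post, so serving costs one user-tower forward pass plus a similarity lookup. A minimal sketch, assuming simple dense embeddings (not the actual phoenix code):

import jax
import jax.numpy as jnp

def retrieve_top_k(user_embedding, post_embeddings, k=1000):
    # user_embedding:  shape (d,)   -- computed at request time by the user tower
    # post_embeddings: shape (N, d) -- pre-computed offline by the candidate tower
    scores = post_embeddings @ user_embedding   # dot-product similarity, shape (N,)
    values, indices = jax.lax.top_k(scores, k)  # the k highest-scoring posts
    return indices, values

At production scale the brute-force matmul over every post is typically replaced by an approximate nearest-neighbor index, but the similarity math is the same.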
Stage 2: Ranking (Transformer with Candidate Isolation)
Problem: The surviving ~1000 candidates need careful scoring.
Solution: Use a powerful transformer that predicts 19 engagement types.
[User + History + Candidates] --> Transformer --> 19 probabilities per candidate
|
v
P(like) = 0.85
P(reply) = 0.12
P(repost) = 0.34
P(block) = 0.001
...15 more actions
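Structurally this is a multi-task head: one shared transformer representation per candidate, projected to 19 outputs. A hedged Haiku sketch (dimensions and names are assumptions):

import jax
import haiku as hk

NUM_ACTIONS = 19  # the engagement targets listed later in this lecture

def ranking_head(candidate_repr):
    # candidate_repr: (num_candidates, d) -- per-candidate transformer output.
    # Must run inside hk.transform, like any Haiku module.
    logits = hk.Linear(NUM_ACTIONS)(candidate_repr)   # (num_candidates, 19)
    return jax.nn.sigmoid(logits)  # independent probability per action

Sigmoid rather than softmax is the natural choice here because the actions are not mutually exclusive (a user can both like and reply to the same post). The one continuous target, dwell time, would use a regression head instead; this sketch glosses over that.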
Key Design Insight: The model predicts 19 different actions (like, reply, repost, click, block, mute, report, etc.) rather than a single “relevance” score. These are combined with configurable weights downstream. This means X can tune the feed behavior (e.g., prioritize replies over likes) without retraining the model.
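Concretely, the downstream combination is just a weighted sum in which negative actions carry negative weights. A minimal sketch; the weight values below are invented, and the real ones live in serving config:

# Illustrative weights only -- NOT the production values.
WEIGHTS = {
    "favorite_score": 1.0,
    "reply_score": 13.5,
    "repost_score": 2.0,
    "block_author_score": -100.0,  # negative engagement pulls the score down
    "report_score": -400.0,
    # ... one weight per predicted action
}

def combined_score(probs: dict[str, float]) -> float:
    return sum(weight * probs[name] for name, weight in WEIGHTS.items())

Changing feed behavior (say, prioritizing replies over likes) means editing WEIGHTS, with no retraining.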
The Phoenix Directory (10 Python Files)
phoenix/
+-- grok.py # Transformer architecture (from xAI's Grok-1)
+-- recsys_model.py # Ranking model + data structures
+-- recsys_retrieval_model.py # Two-tower retrieval model
+-- runners.py # Training/inference runners
+-- run_ranker.py # Demo: run ranking inference
+-- run_retrieval.py # Demo: run retrieval inference
+-- test_recsys_model.py # Tests for ranking
+-- test_recsys_retrieval_model.py # Tests for retrieval
+-- pyproject.toml # Python dependencies
+-- README.md # Documentation
Reading order for this lecture series:
1. recsys_model.py (data structures, hash embeddings) --> Lecture 2
2. recsys_retrieval_model.py (retrieval) --> Lecture 3
3. recsys_model.py (ranking model) --> Lecture 4
4. grok.py (transformer internals) --> Lecture 5
5. runners.py (inference pipeline) --> Lecture 6
Tech Stack
| Component | Technology | Why |
|---|---|---|
| Phoenix | Python + JAX + Haiku | ML models, TPU training |
| Home Mixer | Rust + Tonic (gRPC) | High-throughput serving |
| Thunder | Rust + Kafka | Sub-ms in-memory lookups |
| Candidate Pipeline | Rust | Composable, type-safe framework |
| Communication | gRPC + Protobuf | Cross-service RPC |
JAX vs PyTorch (For Your Reference)
This codebase uses JAX/Haiku. You’re used to PyTorch. The core difference:
PyTorch:
    model = MyModel()                          # the model object carries its weights inside
    output = model(x)                          # call it like an object
JAX:
    forward_fn = hk.transform(forward)         # wrap a pure function
    params = forward_fn.init(rng, x)           # initialize weights as separate data
    output = forward_fn.apply(params, rng, x)  # pass weights explicitly (rng may be None)
The math is identical. Throughout this series, we’ll provide PyTorch translations for every piece of JAX code.
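To make that concrete, here's a tiny runnable Haiku example with a toy linear layer (not phoenix code):

import jax
import jax.numpy as jnp
import haiku as hk

def forward(x):
    # Haiku modules may only be instantiated inside a transformed function.
    return hk.Linear(4)(x)

forward_fn = hk.transform(forward)
rng = jax.random.PRNGKey(0)
x = jnp.ones((2, 8))

params = forward_fn.init(rng, x)      # a pytree of arrays, e.g. params["linear"]["w"]
y = forward_fn.apply(params, rng, x)  # pure: same params + x always give the same y

Because params is plain data rather than hidden object state, JAX can transform the whole forward pass with jax.jit, jax.grad, or jax.vmap.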
The 19 Predicted Actions
The ML model predicts 19 engagement targets per post: 18 action probabilities plus one continuous value (dwell time):
# From runners.py:202-222
POSITIVE ENGAGEMENT:
1. favorite_score # Will they like it?
2. reply_score # Will they reply?
3. repost_score # Will they repost?
4. photo_expand_score # Will they expand the photo?
5. click_score # Will they click the link?
6. profile_click_score # Will they visit the author's profile?
7. vqv_score # Will they watch the video (quality view)?
8. share_score # Will they share it?
9. share_via_dm_score # Will they share via DM?
10. share_via_copy_link_score # Will they copy the link?
11. dwell_score # Will they spend time reading?
12. quote_score # Will they quote-tweet?
13. quoted_click_score # Will they click a quoted tweet?
14. follow_author_score # Will they follow the author?
NEGATIVE ENGAGEMENT:
15. not_interested_score # Will they mark "not interested"?
16. block_author_score # Will they block the author?
17. mute_author_score # Will they mute the author?
18. report_score # Will they report the post?
CONTINUOUS:
19. dwell_time # How long will they spend on it?
These are combined into a single score by the Weighted Scorer (Lecture 6).
Quick Check
Before moving to Lecture 2, make sure you can answer:
- What are the two ML stages in the recommendation pipeline?
- Why does the model predict 19 actions instead of 1 relevance score?
- Which directory should you focus on as an MLE intern?
- What’s the difference between Thunder (in-network) and Phoenix Retrieval (out-of-network)?