Understanding computer architecture makes you a better programmer, a better AI engineer, and a better systems designer. This course covers CPU design, memory hierarchy, caches, pipelines, and GPU architecture — the hardware foundation that every software system runs on.
This is a text-first course that links out to the best supporting material on the internet instead of trying to replace it. The goal is to make this the best course on computer architecture you can find — even without producing a single minute of custom video.
Day 5 focuses on GPU architecture — why GPUs are used for AI training, how CUDA works, and what VRAM limitations mean for model size.
Every architecture concept is explained with a diagram before the technical details. The mental model comes first.
Computer Organization and Design by Patterson and Hennessy is the canonical textbook. This course links to relevant sections.
Each day is designed to finish in about an hour of focused reading plus worked examples. No live classes, no quizzes.
Each day stands alone. Read them in order for the full picture, or jump straight to the day that answers the question you have today.
The fetch-decode-execute cycle, ALU, registers, control unit. How a CPU executes instructions. The architectural decisions that determine performance.
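To make the cycle concrete, here is a toy simulator sketch in Python. The opcode names, encoding, and four-register file are invented for illustration; real ISAs encode instructions as bits, but the fetch/decode/execute loop has the same shape.

```python
# A toy fetch-decode-execute loop. The instruction format (opcode names,
# register count) is invented for illustration only.

def run(program):
    """Execute a list of (opcode, *operands) tuples and return the registers."""
    regs = [0] * 4              # register file: r0..r3
    pc = 0                      # program counter
    while pc < len(program):
        instr = program[pc]     # FETCH: read the instruction at PC
        op, *args = instr       # DECODE: split opcode from operands
        if op == "LOADI":       # EXECUTE: control unit dispatches on opcode
            regs[args[0]] = args[1]                        # load immediate
        elif op == "ADD":
            regs[args[0]] = regs[args[1]] + regs[args[2]]  # ALU add
        elif op == "HALT":
            break
        pc += 1                 # advance PC to the next instruction
    return regs

program = [
    ("LOADI", 0, 2),    # r0 = 2
    ("LOADI", 1, 3),    # r1 = 3
    ("ADD", 2, 0, 1),   # r2 = r0 + r1
    ("HALT",),
]
print(run(program))  # [2, 3, 5, 0]
```

Everything a real CPU does is an elaboration of this loop — pipelining, branch prediction, and out-of-order execution (Day 4) are ways of running many iterations of it at once.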
Registers, L1/L2/L3 cache, RAM, storage. Latency numbers every programmer should know. Why memory hierarchy design determines real-world program performance.
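The widely circulated "latency numbers every programmer should know" can be sketched as a table in code. The figures below are order-of-magnitude approximations — exact values vary by hardware generation:

```python
# Approximate access latencies in nanoseconds (order-of-magnitude figures;
# exact numbers vary by CPU generation and memory technology).
LATENCY_NS = {
    "register":      0.3,         # ~1 CPU cycle
    "L1 cache":      1,
    "L2 cache":      4,
    "L3 cache":      30,
    "RAM":           100,
    "NVMe SSD read": 20_000,      # 20 microseconds
    "HDD seek":      10_000_000,  # 10 milliseconds
}

# The point of the hierarchy: each level is roughly an order of magnitude
# slower than the one above it. A miss that falls through to RAM costs
# ~100x an L1 hit, which is why cache behavior dominates performance.
for name, ns in LATENCY_NS.items():
    print(f"{name:>14}: {ns:>14,.1f} ns")
```

Run it once and the ratios stick with you far better than the raw numbers do.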
Cache hit/miss, direct-mapped vs set-associative, replacement policies, cache coherence. Writing cache-friendly code that can run 10x faster.
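The classic cache-friendliness example is traversal order over a 2D array. A pure-Python sketch (in CPython the speedup is muted by interpreter overhead, but in C or NumPy this loop-order change is where the large wins come from):

```python
# Row-major vs column-major traversal of a 2D array. Each row is stored
# contiguously, so walking row by row touches memory sequentially and
# hits the cache; walking column by column strides across rows and
# misses far more often.
N = 1024
grid = [[1] * N for _ in range(N)]

def sum_row_major(g):
    total = 0
    for row in g:                # outer loop over rows
        for x in row:            # inner loop walks contiguous memory
            total += x
    return total

def sum_col_major(g):
    total = 0
    for j in range(len(g[0])):   # outer loop over columns
        for i in range(len(g)):  # inner loop jumps between rows: cache-hostile
            total += g[i][j]
    return total

# Identical result, very different memory access pattern.
assert sum_row_major(grid) == sum_col_major(grid) == N * N
```

Same arithmetic, same answer — the only difference is which loop is on the inside. That is the whole lesson of cache-friendly code in one swap.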
Pipeline stages, hazards (data, control, structural), branch prediction, out-of-order execution. Why modern CPUs are so fast despite clock speed limits.
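Branch prediction is easiest to feel with the famous sorted-vs-shuffled experiment. A Python sketch of the setup (the timing effect is dramatic in C, where the branch runs on bare hardware; in CPython it is largely hidden by interpreter overhead):

```python
# Counting elements above a threshold runs the exact same instructions on
# sorted and shuffled data -- but on sorted data the branch is taken in one
# long run and then never again, so the predictor is almost always right.
# On shuffled data it's a coin flip, and every mispredict flushes the pipeline.
import random

data = list(range(100_000))
shuffled = data[:]
random.shuffle(shuffled)

def count_big(xs, threshold=50_000):
    n = 0
    for x in xs:
        if x >= threshold:   # the branch the hardware tries to predict
            n += 1
    return n

# Same answer either way; only the branch *pattern* differs.
assert count_big(data) == count_big(shuffled) == 50_000
```

Port this to C with a hot inner loop and the sorted version runs several times faster — a pipeline hazard you can measure from user space.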
SIMD parallelism, GPU vs CPU design philosophy, CUDA cores, VRAM, tensor cores. Why GPUs are used for AI training and what hardware limits model size.
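The VRAM limit on model size comes down to back-of-envelope arithmetic. A sketch using one common rule of thumb (bytes per parameter, plus gradients and Adam optimizer states for training; it deliberately ignores activations and KV cache, which add more):

```python
# Back-of-envelope VRAM estimate. Assumed formula: parameters x bytes per
# parameter, plus gradients and two fp32 Adam moments when training.
# Activation memory and KV cache are ignored, so real usage is higher.
def vram_gb(n_params, bytes_per_param=2, training=False):
    weights = n_params * bytes_per_param        # fp16 = 2 bytes/param
    if training:
        weights += n_params * bytes_per_param   # gradients
        weights += n_params * 8                 # Adam: two fp32 moments
    return weights / 1e9

# A 7B-parameter model in fp16 needs ~14 GB just for weights at inference --
# already more than most consumer GPUs -- and far more to train.
print(f"{vram_gb(7e9):.0f} GB inference, {vram_gb(7e9, training=True):.0f} GB training")
```

This is why quantization (1 byte or less per parameter) and multi-GPU training exist: the model has to fit in VRAM before a single FLOP runs.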
Instead of shooting our own videos, we link to the best deep-dives already on YouTube. Watch them alongside the course. All external, all free, all from builders who ship this stuff.
Visual explanations of CPU architecture — the fetch-decode-execute cycle, ALU, and how instructions run.
How memory hierarchy works — L1/L2/L3 cache, RAM, and the latency numbers that explain modern program performance.
How instruction pipelining works, the types of hazards, and how modern CPUs achieve high throughput through parallelism.
How GPUs differ from CPUs architecturally, why they're used for AI training, and what VRAM limits mean for model size.
How CPUs predict branches to avoid pipeline stalls — and the Spectre/Meltdown implications of speculative execution.
Walkthrough content based on Patterson and Hennessy's Computer Organization and Design — the canonical textbook.
The best way to go deeper on any topic is to read canonical open-source implementations. These repositories implement the core patterns covered in this course.
Curated computer science resources including computer architecture, operating systems, and hardware references.
NVIDIA's official CUDA sample code — the canonical reference for GPU programming and the architecture Day 5 covers.
Curated list of hardware design tools, simulators, and learning resources for computer architecture study.
The RISC-V instruction set architecture manual. The cleanest ISA design for understanding computer architecture principles.
You took architecture in school but it felt abstract. This course connects the concepts to the hardware that runs your code every day.
You train models and want to understand why GPU architecture matters, what VRAM limits mean, and how to write faster GPU code.
You optimize software and want to understand the hardware constraints you're optimizing against — cache hierarchy, pipeline hazards, memory bandwidth.
The 2-day in-person Precision AI Academy bootcamp covers hardware, systems programming, and AI infrastructure — hands-on with Bo. 5 U.S. cities. $1,490. 40 seats max. June–October 2026 (Thu–Fri).
Reserve Your Seat