Deep Dive: Building GPT from scratch - part 6
learning from Andrej Karpathy
Hello and welcome back to the series on Starter AI. It’s Miko again, writing from rainy London.
Today, we’re going on a side quest to understand backpropagation better. This is the most maths-y part of the series so far - you’ve been warned! You’ll need at least a couple of hours for this one, so brace yourself. Let’s go!
The roadmap
The goal of this series is to implement a GPT from scratch, and to actually understand everything needed to do that. We’re following Andrej’s Zero To Hero videos. If you missed a previous part, catch up here:
Neural Networks & Backpropagation part 1 - 2024/02/09
Neural Networks & Backpropagation part 2 - 2024/02/16
Generative language model - bigrams - 2024/02/23
Generative language model - MLP - 2024/03/01
Generative language model - activations & gradients - 2024/03/08
To follow along, subscribe to the newsletter at starterai.dev. You can also follow me on LinkedIn.
Generative language model - backpropagation
Today’s lecture is called “Building makemore Part 4: Becoming a Backprop Ninja”. As the name implies, we’re taking one more detour before finishing up makemore to build a better understanding of how backpropagation works - instead of calling loss.backward() and letting PyTorch do the work, we derive the gradients by hand and check them against what autograd computes.
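To give a taste of what that means before you watch, here’s a minimal sketch in the spirit of the exercise - not the lecture’s actual code. The toy expression and variable names are my own illustration: we push one value through a tiny tanh neuron, derive the gradients manually via the chain rule, and compare them against PyTorch’s autograd.

```python
import torch

# Toy neuron: out = tanh(x*w + b), loss = out**2 (illustrative example only)
x = torch.tensor(0.5)
w = torch.tensor(-1.2, requires_grad=True)
b = torch.tensor(0.3, requires_grad=True)

out = torch.tanh(x * w + b)
loss = out ** 2
loss.backward()  # let autograd compute w.grad and b.grad for reference

# Manual backprop with the chain rule:
# dloss/dout = 2 * out
# dout/dpre  = 1 - tanh(pre)**2 = 1 - out**2
# dpre/dw    = x,  dpre/db = 1
dout = 2 * out.detach()
dpre = dout * (1 - out.detach() ** 2)
dw = dpre * x
db = dpre

# Our hand-derived gradients should match autograd's
print(torch.allclose(dw, w.grad), torch.allclose(db, b.grad))  # True True
```

The lecture applies this same idea at full scale: it backpropagates by hand through the entire makemore MLP, layer by layer, comparing every hand-derived gradient tensor against the one PyTorch computes.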