Starter AI
Posts
Deep Dive: Building GPT from scratch - part 6

Deep Dive: Building GPT from scratch - part 6

learning from Andrej Karpathy

Miko
March 15, 2024

Hello and welcome back to the series on Starter AI. It’s Miko again, writing from the rainy London.

Today, we’re going on a side quest to understand the backpropagation better. This is the most maths-y part of the series so far - you’ve been warned! You’re going to need at least a couple of hours for this one, so brace yourself. Let’s go!

The roadmap

The goal of this series is to implement a GPT from scratch, and to actually understand everything needed to do that. We’re following Andrej’s Zero To Hero videos. If you missed a previous part, catch up here:

Neural Networks & Backpropagation part 1 - 2024/02/09
Neural Networks & Backpropagation part 2 - 2024/02/16
Generative language model - bigrams - 2024/02/23
Generative language model - MLP - 2024/03/01
Generative language model - activations & gradients - 2024/01/08

To follow along, subscribe to the newsletter at starterai.dev. You can also follow me on LinkedIn.

Generative language model - backpropagation

Today’s lecture is called “Building makemore Part 4: Becoming a Backprop Ninja”. Just like the name implies, we’re taking one more detour before finishing up makemore, to build a better understanding of how backpropagation works.

Subscribe to keep reading

This content is free, but you must be subscribed to Starter AI to continue reading.

Already a subscriber?Sign in.Not now