
Deep Dive: Building GPT from scratch - part 5

learning from Andrej Karpathy

Hello and welcome back to the series on Starter AI. I’m Miko, this time writing from Tokyo.

Today, we’re picking up where we left off last week: we’re stabilising the neural network using batch normalization, and learning some helpful visualizations in the process.

The roadmap

The goal of this series is to implement a GPT from scratch, and to actually understand everything needed to do that. We’re following Andrej’s Zero To Hero videos. If you missed a previous part, catch up here:

  1. Neural Networks & Backpropagation part 1 - 2024/02/09

  2. Neural Networks & Backpropagation part 2 - 2024/02/16

  3. Generative language model - bigrams - 2024/02/23

  4. Generative language model - MLP - 2024/03/01

  5. Today: Generative language model - activations & gradients

To follow along, subscribe to the newsletter at starterai.dev. You can also follow me on LinkedIn.

Generative language model - activations & gradients

Today’s lecture is called “Building makemore Part 3: Activations & Gradients, BatchNorm”, and it builds on where we left off last week.
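To give a flavour of the main technique in this lecture: batch normalization standardises each feature's pre-activations across the batch, then applies a learnable scale and shift. Here's a minimal illustrative NumPy sketch (my own, not the lecture's notebook code):

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    """Normalize each feature over the batch dimension, then scale and shift.
    x: (batch, features); gamma, beta: (features,) learnable parameters."""
    mean = x.mean(axis=0)                    # per-feature batch mean
    var = x.var(axis=0)                      # per-feature batch variance
    x_hat = (x - mean) / np.sqrt(var + eps)  # zero mean, unit variance
    return gamma * x_hat + beta              # learnable scale and shift

# Pre-activations with an arbitrary offset and spread:
rng = np.random.default_rng(0)
x = rng.standard_normal((32, 4)) * 10 + 5
out = batch_norm(x, gamma=np.ones(4), beta=np.zeros(4))
print(out.mean(axis=0))  # close to 0 for each feature
print(out.std(axis=0))   # close to 1 for each feature
```

With `gamma=1` and `beta=0` the output is simply standardised, which keeps activations in a range where tanh-like nonlinearities don't saturate; the network can then learn to undo the normalization via `gamma` and `beta` if that helps.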
