Episode #490 from 1:04:12

How AI is trained: Pre-training, Mid-training, and Post-training

I think this might be a good place to define pre-training, mid-training, and post-training. So, pre-training is the classic training, one next-token prediction at a time: you have a big corpus of data. Nathan probably also has very interesting insights there because of OLMo 3; a big portion of that paper focuses on getting the data mix right. So, pre-training is essentially training with a cross-entropy loss on next-token prediction over a vast corpus of internet data, books, papers, and so forth. It has changed a little bit over the years: people used to throw in everything they could, whereas now it's not just raw data, it's also synthetic data, where people rephrase certain things. So synthetic data doesn't necessarily mean purely AI-made-up data.
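To make the objective concrete, here is a minimal sketch of next-token prediction with a cross-entropy loss, as described above. The model, sizes, and data are toy placeholders (a plain embedding plus linear head standing in for a real Transformer), not the setup used by OLMo 3 or any production system:

```python
import torch
import torch.nn.functional as F

vocab_size, d_model, seq_len, batch = 1000, 64, 32, 8

# Toy "language model": embedding + linear head.
# Real pre-training uses a Transformer that attends over the full context.
embed = torch.nn.Embedding(vocab_size, d_model)
head = torch.nn.Linear(d_model, vocab_size)
opt = torch.optim.AdamW(
    list(embed.parameters()) + list(head.parameters()), lr=1e-3
)

# Stand-in for one batch of tokenized corpus text (random ids here;
# in practice these come from web pages, books, papers, etc.).
tokens = torch.randint(0, vocab_size, (batch, seq_len))

# Shift by one position: the model predicts token t+1 from token t.
inputs, targets = tokens[:, :-1], tokens[:, 1:]

logits = head(embed(inputs))          # (batch, seq_len - 1, vocab_size)
loss = F.cross_entropy(               # cross-entropy over next tokens
    logits.reshape(-1, vocab_size),
    targets.reshape(-1),
)

opt.zero_grad()
loss.backward()
opt.step()
```

Everything in pre-training reduces to repeating this step, trillions of tokens at a time; the data-mix questions the speakers mention are about what goes into that `tokens` batch.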
