Episode #490 from 1:58:11
Advice for beginners on how to get into AI development & research
I was wondering if we could take a bit of a tangent and talk about education and learning. If you're somebody listening to this who's a smart person interested in programming and interested in AI, I presume building something from scratch is a good beginning. Can you just take me through what you would recommend people do?
I would personally start, like you said, by implementing a simple model from scratch that you can run on your computer. The goal of building a model from scratch is not to have something you use every day for your personal projects. It's not going to be your personal assistant replacing an existing open-weight model or ChatGPT. It's to see exactly what goes into the LLM, what exactly comes out of the LLM, and how pre-training works on your own computer. And then you learn about pre-training, supervised fine-tuning, and the attention mechanism.
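To make "what goes into the LLM" and "what comes out" concrete, here is a minimal sketch that is not from the episode. It assumes a character-level vocabulary and swaps the neural network for a table of bigram counts, so it is only a stand-in for a real model. The names `build_vocab`, `encode`, and `train_bigram` are hypothetical helpers chosen for illustration. The shape of the problem is the same as in pre-training: token IDs go in, and a probability for every token in the vocabulary comes out.

```python
from collections import Counter, defaultdict


def build_vocab(text):
    """Map each distinct character to an integer token ID (hypothetical helper)."""
    chars = sorted(set(text))
    stoi = {ch: i for i, ch in enumerate(chars)}
    itos = {i: ch for ch, i in stoi.items()}
    return stoi, itos


def encode(text, stoi):
    """Turn a string into the list of token IDs a model would receive."""
    return [stoi[ch] for ch in text]


def train_bigram(ids, vocab_size):
    """Count which token follows which, then normalize counts to probabilities.

    Assumption: this counting step stands in for gradient-based pre-training.
    A real LLM learns the same kind of next-token distribution with a network.
    """
    counts = defaultdict(Counter)
    for prev, nxt in zip(ids, ids[1:]):
        counts[prev][nxt] += 1
    probs = {}
    for prev, ctr in counts.items():
        total = sum(ctr.values())
        probs[prev] = [ctr[t] / total for t in range(vocab_size)]
    return probs


corpus = "hello hello help"
stoi, itos = build_vocab(corpus)
ids = encode(corpus, stoi)
probs = train_bigram(ids, len(stoi))

# Input: the token ID for "l". Output: one probability per vocabulary entry.
dist = probs[stoi["l"]]
for token_id, p in enumerate(dist):
    print(repr(itos[token_id]), round(p, 2))
```

Replacing the count table with a small neural network trained on the same next-token objective is the step where the attention mechanism and actual pre-training enter.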