Episode #416: Yann LeCun: Meta AI, Open Source, Limits of LLMs, AGI & the Future of AI (from 1:11:30)

Reasoning in AI

The type of reasoning that takes place in LLMs is very, very primitive, and the reason you can tell it's primitive is because the amount of computation that is spent per token produced is constant. So if you ask a question and that question has an answer in a given number of tokens, the amount of computation devoted to computing that answer can be exactly estimated. It's the size of the prediction network, with its 36 layers or 92 layers or whatever it is, multiplied by the number of tokens. That's it. And so essentially, it doesn't matter if the question being asked is simple to answer, complicated to answer, or impossible to answer because it's undecidable or something; the amount of computation the system will be able to devote to the answer is constant, or is proportional to the number of tokens produced in the answer. This is not the way we work. The way we reason is that when we're faced with a complex problem or a complex question, we spend more time trying to solve it and answer it because it's more difficult. There's a prediction element, there's an iterative element where you're adjusting your understanding of a thing by going over and over and over, there's a hierarchical element, and so on. Does this mean it's a fundamental flaw of LLMs, or does it mean that-
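A rough sketch of the fixed-compute-per-token point made above: the 36-layer figure comes from the quote, but the hidden size (d_model = 4096), the 4x MLP expansion, and the 100-token answer length are illustrative assumptions, not numbers from the episode.

```python
# Back-of-the-envelope: in a decoder-only LLM, compute per generated token
# is fixed by the architecture, not by how hard the question is.

def forward_flops_per_token(n_layers: int, d_model: int) -> int:
    """Rough FLOPs for one forward pass per token.

    Per layer: attention projections (~4 * d^2 weights) plus an MLP block
    with a 4x hidden expansion (~8 * d^2 weights), 2 FLOPs per multiply-add.
    Ignores attention over the context, whose cost grows with sequence
    length but still not with question difficulty.
    """
    return n_layers * 2 * (4 * d_model**2 + 8 * d_model**2)

def answer_flops(n_layers: int, d_model: int, answer_tokens: int) -> int:
    # Total compute is strictly (per-token cost) x (tokens produced):
    # identical for an easy, hard, or undecidable question if the
    # answers have the same length.
    return forward_flops_per_token(n_layers, d_model) * answer_tokens

if __name__ == "__main__":
    for question in ("trivial", "hard", "undecidable"):
        # 36-layer model with assumed d_model = 4096, 100-token answer
        flops = answer_flops(n_layers=36, d_model=4096, answer_tokens=100)
        print(f"{question:>12} question, 100-token answer: {flops:.2e} FLOPs")
```

Running this prints the same FLOP count for all three questions, which is the point of the argument: nothing in the forward pass scales with problem difficulty, only with answer length and model size.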
