Episode #416 from 2:18
Limits of LLMs
At this moment of rapid AI development, this happens to be a somewhat controversial position, and so it's been fun seeing Yann get into a lot of intense and fascinating discussions online, as we do in this very conversation. This is the Lex Fridman Podcast. To support it, please check out our sponsors in the description. And now, dear friends, here's Yann LeCun.

You've had some strong statements, technical statements, about the future of artificial intelligence throughout your career, actually, but recently as well. You've said that autoregressive LLMs are not the way we're going to make progress towards superhuman intelligence. These are the large language models like GPT-4, like Llama 2 and 3 soon, and so on. How do they work, and why are they not going to take us all the way?

For a number of reasons. The first is that there are a number of characteristics of intelligent behavior: for example, the capacity to understand the world, understand the physical world; the ability to remember and retrieve things, persistent memory; the ability to reason; and the ability to plan. Those are four essential characteristics of intelligent systems or entities, humans, animals. LLMs can do none of those, or they can only do them in a very primitive way. They don't really understand the physical world. They don't really have persistent memory. They can't really reason, and they certainly can't plan. And so if you expect the system to become intelligent without having the possibility of doing those things, you're making a mistake. That is not to say that autoregressive LLMs are not useful. They're certainly useful. It's not to say that they're not interesting, or that we can't build a whole ecosystem of applications around them. Of course we can. But as a path towards human-level intelligence, they're missing essential components.