Episode #459 from 4:30:21
Programming and AI
Well, what do you think about the programming context? So software engineering, that's where I personally, and I know a lot of people, interact with AI the most. There's a lot of fear and angst too from current CS students, but that is the area where probably the most AI revenue and productivity gains have come, right? Whether it be Copilots or Cursor or what have you, or just standard ChatGPT. I know very few programmers who don't have ChatGPT, and actually many of them have the $200 tier because that's what it's so good for. I think that in that world, we already see it, like SWE-bench. And if you've looked at the benchmark made by some Stanford students, I wouldn't say it's really hard, but I wouldn't say it's easy either. I think it takes someone who's been through at least a few years of CS or a couple years of programming to do SWE-bench well, and the models went from 4% to 60% in a year, and where are they going to go to next year? It's going to be higher. It probably won't be a hundred percent because again, that nines is really hard to do, but we're going to get to some point where that's... and then we're going to need harder software engineering benchmarks and so on and so forth.