LIVE CLIPS
Episode 8-8-2025
Does them and it does it in like kind of a scarily good way. A good example is like, we've been hiring people. I think we have some hires in the line. We make them do a case study that's like, not as case study is a great way. Okay. And I was so pissed and annoyed by some of the case study qualities, I made, like, each of the models do the case study. Dude, you know what did the best case study of them all? Claude Code did. Okay. Claude Code over OpenAI agent. And I think that that's kind of what we're, like, those extremely long contexts, autonomous ability to do stuff on its own is going to be the, like, you know, the nirvana that, like, changes everything or makes the consumption a lot bigger. And I think you can kind of see the glimpse of what an agentic future looks like via Claude Code. And then you just assume that instead of them RL-ing the shit out of, like, software and making, you know, the best commit or something like that, they're going to RL the hell out of, like, advertising or creating the best, you know, advertising media, or they're going to RL the crap out of, like, finance or making the best financial. Well, yeah, I mean, the thing that stands out to me is, like, this compounding advantage of, like, you're a lab, you're competing with other labs, and you have access to the best coding product and you can use it as much as you want forever. And it gets better and then you guys get better and higher output. And I don't think we've seen this. You know, you didn't see the same dynamic with, like, Microsoft having a better version of Excel and, like. Yeah. Or Word. Or the example of, like, it's not like Mark Zuckerberg with Facebook was like, I get to use Facebook more than my competitors, so I'm going to be better. It's like. Well, they actually tried that. They were using Facebook for internal comms. They still did well. Yeah. But whether or not that gave them a compounding advantage over teams or. Well, let's.
Yeah, like, I think there's this, like, project, or it's like AI 2027 or something. Yeah, yeah. I think that that actually feels like a little bit more grounded than, like, some of the, like, the really, like, you know, AI doomerism. But I think the, like, the recursive ability or, like, whatever, like p(doom), whatever you want to talk about, but, like, the recursive ability to make your product better by continuing to invest is, like, you know, it is a flywheel. And that's definitely, I think, the Anthropic bet that they're going really hard at. And so, you know, the thought process is, like, well, okay, if we can get the AI agent to do, you know, AI experiments, not just coding experiments, then, like, boom, we are off to the races. And that really will be, I think, the, you know, the, like, where the curve bends back in on itself. TBD. Right. Yet to be seen. I don't, like, from the outside looking in, there isn't exactly any kind of special ability for me to say that it will or will not be that way. But I think so far, from what I understand from the researchers, I don't think everyone's, like, doomed or beared up on some of the stuff. I definitely think ChatGPT 5 is a little disappointing. But maybe, maybe OpenAI just isn't cooking like it used to. Right. Like, I think. And I don't know if, I don't know if it's a. Like, I don't know what actually happened at 5, but, like, I want to, like, we don't actually know what's the, like, the ratio. Well, you have to reasoning to like. Just totally just remember. You know, you can debate on the people that have left OpenAI in the last couple months, like, how good they were. Were they the best people? Were they mercenaries? But any company that has been gearing up for a massive product launch, or has gone through a massive product launch, imagine going through that again. But you lost, like, 40% of some of your most elite team members. Like, that could be.
That's, like, should obviously have been a factor here. If it wasn't, then Zuck is cooked because he just hired a bunch of people that just weren't that great. Someone's cooked. So I think that, yeah, it's worth asking the question: what would Chat, what would GPT5 have looked like if Ilya was still at the company? What would it have looked like if Mira was still at the company? What would it have looked like if the long tail of researchers that left were still there? What would it look like if they just didn't have the distraction of the talent war? Right. Yeah, I think that's a valid question, because my understanding of, like, why Gemini 2 randomly was so good is, like, Noam Shazeer was back. Yeah, that's it. Like, that's literally it. As far as I understand, like, all of a sudden, Gemini starts cooking again. It's like, yeah, because, like, the guy who invented half of everything is back. I'm not surprised that that's, like, a real dynamic, but, yeah, we're gonna have to see. We're gonna have to see. I do think you're right. Vibes are interesting. I do think probably attracting that cohort, like, if you think about it, just, like, an incremental slosh, like, you know, that's either the biggest, best investment of all time or going to be the worst investment of all time. And, like, tracking that cohort and how that works out is going to be, like, a really interesting, like, case study. Can you give us an overview of what's happening in private credit? Headline this week, obviously, that Meta had tapped Blue Owl and Pimco.
Models through memorization in domains where we can generate lots of synthetic but real data, and then non-zero fluid intelligence emerges from the resulting chain of thought knowledge recomposition system that sits on top of the foundation model. But it still reminds me of pre-training scaling, where we were making AI systems better through imitation learning and stuffing more into them, versus an AI system that is capable of cold starting itself in some new domain it's never seen before. And that's why ARC AGI is so important, because the final evals are hidden behind that secret test set. The AI systems need to be able to cold start themselves when they run into a game that they've never seen before in the test environment, in the ARC AGI private eval set. And that's why they fall flat on their face consistently, and they're sitting around 16%, 15.8% success rate on ARC AGI 2, which can be solved by any human pretty easily. But Elon says Grok 5 will be out before the end of the year and it will be crushingly good. He's benchmaxing. He knows ARC AGI is the one to go for, and so he's going to be RL-ing on this pretty, he's going to have a whole team on ARC AGI. I'm excited to see what he does. I wonder how he'll do on V3. That will be very, very interesting. Anyway, well, in other news, we have a post here that we can pull up from Bhanu Kohli. It is in the chat if you can pull it up. He says: Yesterday Rail Financial signed a definitive agreement to be acquired by Ripple for $200 million. Four years ago I set out on a mission to speed up business to business global payments using USDC. Over the last six months we grew to become 10% of B2B global stablecoin settlement volume. Air horn for that. With Ripple we will further accelerate our shared mission. Thank you to our employees, clients, investors and partners for taking an early bet on us. A few that Tarun and I want to call out: the entire Rail team for their relentlessness and hard work.
Avlok, of course, the CEO of AngelList, our first lead investor and part of the founding team, and Gokul Rajaram, immensely helpful during some of our early crucible moments. And of course Mike over at Galaxy for taking the bet on us in the Series A. We are excited to start our new chapter with Ripple once all regulatory approvals go through. Hit that gong, John. Great contact, great contact. I was lucky to angel invest in the seed round of Rail and this is a fantastic outcome for the team. And yeah, this one was, so I first met Bhanu, I think, back in 2021 or 2022. We were both working on stablecoins at the time. Loved his vision. Haven't stayed super close since then, but he's been absolutely cooking and I was very pleasantly surprised when I got the news a couple days ago. So incredible work to the whole Rail team and a great pickup for Ripple. Amazing. Let me tell you about customer relationship magic. Attio is the AI native CRM that builds, scales and grows your company to the next level. Get started for free. Sam Altman posted: GPT OSS is out. We made an open model that performs at the o4-mini level. We create our own pronunciation for this. GPT, oos, gptos, GPTs. It runs on a high end laptop. Smaller one runs on a phone. Super proud of the team. Big triumph of technology. Has a community note on it. I don't know what's in there, but that's very funny. Okay. Anyway, but this Donald Boat. Donald Boat. Donald, one of the greatest to ever do it. Donald Boat responds: Sam, you, me, the Amalfi Coast. Me: double Fernet on the rocks, club soda to taste. You: one delightfully sweet, bitter Negroni, stirred. 99.1 billion 900 million revolutions counterclockwise. One for each hertz of the Nvidia 5090 in the gaming PC you will buy and ship to my house. Actually he sent it. I love it. It popped up yesterday. This is a timeline victory hop on Fort Am. Yeah. And yeah, Sam said, okay, this was funny. Send me your address and I'll send you a 5090. And he did it. And I love this.
This is the type of, like, you know, small ground game that we identified earlier. You know, Sam, he didn't have to drop the big long post. He was vague posting. The vague posting was a little bit of a mixed result. But this is a win. This is a fantastic win. This just builds the team, builds a lifelong fan. This is hand to hand combat on the timeline and I love to see it. So great, great to be doing this type of stuff even the day before GPT5 launch day. And Donald, very cool. Donald Boat is really an account to watch. Laser boat 999, get in early. Get in early. I mean, Bitcoin in 1990. John, he's on 100k still. Like buying Bitcoin in 1994. That's right, that's right. Or Solana in the 80s. Yep.
He's a grinder. He's a grinder. Even though he. Anyway, you want to stay out of trouble, you want to stay compliant, you got to get on Vanta. Automate compliance, manage risk, prove trust continuously. Vanta's trust management platform takes the manual work out of your security compliance process and replaces it with continuous automation, whether you're pursuing your first framework or managing a complex program. Anyway, yesterday GPT5 launched. It was going to be quiet from Gemini, but Sundar Pichai put up a 10k banger on the timeline. Excited to make our best tools free for college students in the United States. Google Gemini is free for students. A one year Pro plan, offer ends October 6th. They know, getting back in school, this is the time to get people in the ecosystem. Unlimited image uploads, 2.5 Pro model, NotebookLM, Deep Research, 2 terabytes of storage. They are pushing people to onboard onto Google Gemini. They're not, they don't think that chat the game's over. They're going to go head to head with ChatGPT. Yeah. So pull up this post, Ben and crew, because the chart that people have been sharing. Yes, yes, yes, around over the last few days. And I was asking Greg about this, I was asking some of the other people on the team, did you feel like you got a breather over summer? The GPUs. Oh yeah, yeah, that chart. A little bit, that chart. Because basically you can see right when summer ended, if you scroll down, or sorry, right when summer started, usage fell dramatically. Just overall tokens processed, and I expect that to tick up pretty dramatically. I don't think that has to do with school though. That drop off, that's the European vacation season. These are VCs who use ChatGPT when they're at work, but then in summer they're off. Of course. John, it's the VCs where you're. They're the primary driver of ChatGPT. They have to ask, like, what is a foundation model? What is a company?
How do I invest? Like, how can I be helpful? Just dropping the deck in and saying, should I invest, yes or no? Yes or no? Exactly. Give me a one word answer. Exactly. But if you're in St. Barts or Saint Tropez, you don't need to be using ChatGPT. You're off. You have a vacation responder on. And that vacation responder, it's not generating tokens, it's not hitting the ChatGPT API. It's just a form, it's just a template. It's deterministic computing. It's not stochastic. It's a little throwback. Yeah. Anyway, so the Gemini news is significant because clearly students are incredibly price sensitive. Right. Do you remember being a student? We didn't have Gemini or ChatGPT back in our day, but I remember, what were the different websites that would just help you study for courses? I don't think I ever paid for a single one. But I'd always be using the free tier. And I think generally students are going to continue to be. Even though these tools are so powerful, it's very possible that Gemini can really compete here. Well, you know what else has a free tier? Graphite.dev, code review for the age of AI. Graphite helps teams on GitHub ship higher quality software. And you can get started for free at graphite.dev. And the Graphite CEO is coming on the show later, breaking it down, his take on GPT5. Speaking of charts, there was a chart burning up the timeline yesterday. The SWE-bench Verified, software engineering, with thinking, without thinking. People were very upset about this because of the original chart, not this one. Four slides later, the initial, it's from Timo Springer. Timo said this is the correct one. So people are saying, like, it was a chart crime. And the one that went on the live stream was showing that they were at 74% up here, and then the second bar was 69.1% and it was much, much lower. And it made no sense because 52% is of course lower than 69%, and the chart just seemed really botched.
What was weird is that this chart that we're showing here is not a chart crime. It, you know, you could maybe say it doesn't show exponential takeoff, but it shows that with thinking, GPT5 beat OpenAI's o3 on SWE-bench. And, like, maybe that doesn't matter to you, whatever. But their point is that GPT5 with thinking, if it triggers the thinking functionality, is better at SWE-bench than o3 and 4o, which is a good claim to make. Right. But people were upset about the chart crime. What's weird is that, like, it really seemed like it was some sort of translation problem, because this exact image went up on the website at the same time as the livestream. The chart was correct on the website, but wrong in the live stream.