LIVE CLIPS
Episode 11-18-2025
You're watching TVPN. Today is Tuesday, November 18, 2025. We are live from the TVPN Ultradome, the temple of technology, the fortress of finance, the capital of capital. Gemini 3.
On the foundation model side, very well. So Matt Shumer says the last time we saw a capability jump of this magnitude was the release of GPT-4 in March 2023. We are entering a new era. Okay. Yeah. So points for Tyler here; he certainly agrees with Tyler. There's a significant jump. It is the age-old question: are we accelerating or decelerating? But either way, we're definitely making progress. It certainly looks like acceleration on the ARC-AGI-2 leaderboard. You can see we are growing exponentially there. Really, really exciting chart. So Gemini 3 Pro is at 31% completion on ARC-AGI-2. That is, of course, the puzzle-solving game that is easy for humans. Even children can do it, but AI has historically struggled with it. Gemini 3 Deep Think Preview gets a 45% on it at $70 a task. And this is just way above GPT-5 Pro and Grok 4 Thinking. When Grok 4 Thinking came out, it was before GPT-5 and it was by far the highest on the chart. It was really, really up there, and Elon was very excited about that and was showing that Grok 4 had really advanced. Well, now we're back in the horse race. Grok 4.1, I haven't seen it benchmarked. We can ask Mike if he's heard anything, but whatever you think, get on Public.com. Investing for those who take it seriously. They've got multi-asset investing.
What I'm feeling, or is it irrelevant? What you're feeling is absolutely real. And we actually try to design our comp structures around it. You know, the line that I tell my team every single day is: all the answers are in the restaurant. And the closer we can push decision making to the edges, to the customer, the better we will be. So our general manager, we call them the head coach. They are the most important position in the company by far. A great head coach will make or break you. And so we try to really incentivize them. We empower them, and we try to run as decentralized an operation as we can. The reason we decided not to franchise is it's really hard to maintain quality when you give that up to other people to run. You can sometimes scale too quickly. And we do a few things differently. We source differently. We're a very complex model because of the sourcing and the scratch cooking. The biggest difference between us and most other companies is, if you go into sweetgreen, you'd be shocked at how much we are making in the store. Every single thing. It feels like you guys have taken such a principled approach in making food that I feel like stays true to the initial values of the company and kind of why you started it. And yet you're competing in an environment that says, okay, we're going to have these, like, factory kitchens off site that are going to be shipping in effectively almost finished product that gets reheated, and we're going to be sourcing from all over with not a lot of values around how they're sourcing. They want the food to taste good when it hits the plate, but maybe they don't care about a number of other factors. And so you're kind of in an environment where, because of your principles, you're, like, fighting with your hands tied behind your back against competitors.
Like, and I'm not talking about direct competitors, but more so, like, you're still competing with Burger King and McDonald's, right? Like, people are going to have lunch somewhere and they're going to maybe decide between... They have options. Right. Talk to us about land. Is McDonald's a land acquisition company? Like, why do people say that? Is that real? Have you ever looked? Yeah, well, they do. They do own a lot of the real estate, and they lease it back to the franchise. So that is true. And if you've watched The Founder, the last line in that movie, where he's like, it's real estate. It speaks to more than the fact that they just own it. Restaurants are very much a real estate game. Okay. Great real estate is like, if you look at our portfolio, where we have great real estate, we do amazingly well. Location, location, location. Really, it's people. Like, people think the restaurant business is a food business. It's really a real estate and a people business. And it's all about, like, you look at the great restaurants, so the Chick-fil-A's, the Raising Cane's, the In-N-Outs. It's so much about that culture. How scientific is it? You hear stories of companies like Starbucks and you can imagine like a team of data scientists with like, you know, 50 monitors, and they're just, we need one Starbucks directly across the street from the other Starbucks, you know. So, like, you can imagine a world where it's like hyper data driven, like down to a science. And you just know when you're opening a new store, you know that it's going to hit. But there has to be like some complete five days. Yeah, that is the process. We call it art and science. And pretty much everything we do, it's an art and science approach. And real estate's exactly that. You know, the science.
We have a very, very intricate model that looks at psychographics, demographics, mobile data, you know, people driving by. We have custom data on how many gyms are nearby, and sunny side of the street or not sunny side of the street, all of that stuff. But then you need a human to also walk it, feel it, and understand: does it tell our brand story? For us, especially when we were in the early days growing, where we went said a lot about who we were. So, for example, we went to New York. We didn't go to Midtown. We went to Nolita. We went to Williamsburg. We wanted to kind of tell the story about who sweetgreen was. Today we're kind of everywhere. Yeah, but the real estate is an art and science and says a lot about who you are. Yeah. How do you think about if a new entrepreneur came to you?
Spun Spyce out. We announced about 10 days ago we sold it to Wonder, Marc Lore over there. So we sold it for about $186 million. Marc is Marc. I don't fully understand that business, but talk about a guy that just, like, isn't even necessarily naive about the challenges of restaurants, which is like, I'm gonna go into the most competitive environment possible and compete with everyone. It's amazing. It's a great vision, and I'm a big fan of his and what they're doing. And so it's a really interesting deal. So we sold effectively the team and the IP, but have full access to it. So we will continue to scale with it and get the benefits. As they get to scale and build many more machines, we'll get the benefits of those economies of scale as well. Can you go a little bit deeper on the decision to franchise or not franchise? The naive, maybe steel man for the franchise model is that it's somehow more capitalist in my mind, because it decentralizes the decision making, and it puts these financial incentives at the local level, because each store lives and dies by its own P&L. Maybe.
Making big waves. And we have our next guest. Before we bring them in from the Restream waiting room, let me tell you about Vanta. Our guest is from Vanta and it just happened. We'll let him tell you about it. We have Jeremy from Vanta. Welcome to the stream. How are you doing? What's happening? I swear that wasn't intentional, but it did just line up that the Vanta ad read went right before you came out. I look over and I'm like, wait a minute, I'll let you do the ad read. Introduce yourself. Introduce what Vanta does, what you do, and then we'll get into the news. Yeah, yeah, happy to jump in. I'm Jeremy Epling, Chief Product Officer at Vanta, and we help businesses earn and prove trust. And one of the really cool things that we're doing this week is we're hosting our VantaCon conference here in San Francisco. We've had a ton of people show up, a ton of engagement, to really pull that entire security GRC community together, and we have a couple of really cool announcements. One of them is how we are transforming Vanta to be the agentic trust platform. I think this is a really big turning point for the industry. When we think about how GRC teams are transforming and becoming more technical, we're really redefining how these enterprises manage trust at scale and are able to help big customers like Snyk, Perplexity, Synthesia, all the way from YC start.
That the market was able to continue onwards. There's an interesting debate going on around Karen Hao's new book, Empire of AI, all about OpenAI. Apparently she got the amount of water used by data centers wrong by an order of magnitude or two orders of magnitude. I'm not exactly sure where the story originally broke, but she's addressed it now. She says, I'm working to address an apparent error for a data point I cited in my book about the water footprint of a proposed data center in Chile. I'd like to explain what happened, what I'm doing to remedy it, and provide more recent data on the water footprint of data centers. The data point in question appears in chapter 12 of my book, which focuses on the environmental impacts of AI. Part of the chapter profiles a community in Cerrillos, Chile, which has been resisting a proposed Google data center for years. To describe the data center's water footprint in lay terms, I included a sentence about how it compares to the water usage of the people in Cerrillos. For that calculation, I relied on a figure from a government document reporting Cerrillos residential water use. Based on the current best information, it seems that this document used the wrong units. So she was off by a thousand. So the result was that what's being... Off by a thousand among friends, honestly. These days it doesn't even matter. We're in a post-fact era. Did you read into this? More people were. I think people are generally like, you know, is this book a hit piece? And I think Sam actually cooperated with it a little bit, or like, gave some interviews for it. But like anything, it's obviously critical of some things. I mean, yeah, three orders of magnitude is like pretty big. Yeah, that's like, not great. Yeah. I mean, it's certainly like the difference between being a big deal and not a big deal about the water use. It's like people who use that to justify, like, oh, we don't want to build those data centers.
Going to use our water. Yeah, like, I don't know. I mean, not good. It's a rough time if your job is Tom drinking water. Tom in the chat says mistakes were made. Mistakes were made in a book I was responsible for. Mika says, Jordi, you should get a girl with tiny GPUs instead of diamonds. Maybe not the.
Role that you are. Probably the most value you can get out of AI in its current state is research. And so anyways, that tracks. Super helpful. But I would take cash. Yeah. Okay. So any AI doomers out there, if you want a new marketing channel, you can pay Ashlee Vance $10,000 a month. He won't use AI and he'll talk about how... I don't think you can be bought. I don't think you can be bought. But also, Ashlee, have you tried Gemini 3 to the fullest extent? I have not yet. Could change everything. I'm always going back and forth. Yeah, we would encourage you to. Is Gemini... They're a sponsor. They're coming out as a sponsor for us too, so fantastic. I'm all in. We're going Gemini 3. I'm changing my mind. Let's do it. Also, Sergey Brin was flying his $150 million blimp around San Francisco on the day Gemini 3 beats nearly every model benchmark. You've made a video about this big Zach blimp. I've been pitching Logan at Gemini to make it the Gemini blimp. They really should color it. Guys, guys, it's not a blimp. It is an airship. What's the difference? All right, all right. There's a whole Monty Python video about this. An airship has a rigid structure. A blimp is just a bag. And with the airship, you can do a lot more. Airship. So a blimp's only ever going to have that tiny little... Oh. Pod on the bottom. Yeah, yeah. Whereas an airship, you know, you can carry tens of thousands of tons of cargo with this rigid structure. So. Yeah. And if anyone ever wants to fly one, you can do it in Germany. Zeppelin still flies out by Lake Constance, just outside of Munich. I've done it. It's amazing. I recommend it. This is amazing. Yeah. People are correcting it on the timeline, saying it's not. Dude, this is, like, if you call it a blimp, it's bad in aviation. Airship. I like an airship. I'm excited for it. I do wish it had a livery, a Gemini livery to celebrate Gemini 3.
Well, there's that startup, Airship Industries. Is that a category that will see a lot of investment, do you think? I've been meaning to meet up with those guys. I mean, the airship is, like, always kind of coming back. It is crazy. Like, leading up to World War II, getting into World War II, I mean, there were airships everywhere. And you know, they were making massive flights from Germany to Brazil. They were carrying thousands of pounds of cargo. They're just extremely expensive and very hard to make. But there is a whole movement that says you can carry tons of stuff, and so less kind of tourism and more just carrying cargo. Kind of like faster than a train but slower than a plane. And they're pretty green. You need an airship, Ashlee. You need a studio and an airship that you can just float around the US, meeting all these hard tech... You don't need a private jet. You don't need to go that fast. But if you could just kind of float between hubs. I was told that my kids are supposed to be on one of the first flights on Sergey's when it takes passengers. There we go. So we'll see. Well, thank you. We'll join too. Always fun hanging out. Congrats on all the progress. Yeah, great. Thank you guys. Congrats to you. Always a great time. Thanks, guys. Have a great rest of your day. Good to see you. All right, YouTube. Up next, we're going back to the timeline. eightsleep.com, exceptional sleep without exception. Fall asleep faster, sleep deeper, wake up energized. eightsleep.com. What'd you get, John? I actually lost my phone, so I don't know. Oh, no, it's here. I have it. Pull it up, because I got a sound effect we can pull up. You got a sound effect? You think I did it? Let's see how I did. 90. The sound effect. Let's go. The press release economy is also over, says Buco Capital Bloke. Walter, we ran out of press releases. We ran out of press releases. This is on the back of the Anthropic deal.
Anthropic is now valued at $350 billion after the Microsoft and Nvidia deal, says CNBC. SemiAnalysis has a good post here. A new bombshell has hit the polycule. Dario, after intense conversation with other members of Anthropic, has decided to maybe open the relationship to Microsoft and Nvidia. Jensen and Dario have famously butted heads in the past. But as everyone knows, the most passionate emotion after love is hate. Will this enemies-to-lovers arc go well for Nvidia and Anthropic? Time will tell. This is such an unhinged post. When you started reading this, I did not see that it was SemiAnalysis, the most respected research firm in the industry, posting it. It's so good. But I think this is exactly what they should be posting. Exactly. And it actually contextualizes things in the meme economy. In the meme economy, for sure. So I think that the timing is not a complete coincidence. It's Gemini 3 day. This is what my piece today was about. Just that when there's big news in Google world, Gemini 3, everyone needs to sort of respond. And picking today for an announcement to talk about your massive deal, your $350 billion valuation, is just a good move. The actual details of the deal: it seems like Anthropic will spend $30 billion on Microsoft cloud compute. Reminder, OpenAI is going to be spending 250 billion on Microsoft cloud compute. That's part of that deal. Then Anthropic gets a $10 billion investment from Nvidia and 5 billion from Microsoft. So they raised 15 billion at a 350 post, basically. Something along those lines. And it's sort of a circular deal, but it was setting off way fewer red flags for me because it's missing a zero. It's like, if this was OpenAI, it would be 300 billion, and 100 billion in investment and 50 billion in investment. It looks very modest. Yeah, it looks modest, which is insane considering the scale. It's like one of the biggest deals in software history. Probably.
It's probably in like the top 10. I mean, you know, it values Anthropic higher than Coca-Cola. Like, the Coca-Cola Company is now... That's a $300 billion market cap. I'm pretty sure it's Verizon market cap. Verizon is 175 billion. You're going to love this, Jordi. So I asked ChatGPT 5.1, pull 10 public companies between 300 and 400 billion, please. Because I wanted to see like, okay, Anthropic's at 350. Like, give me some examples of scale. It says, I couldn't reliably identify 10 public companies whose market capitalizations currently fall... But here's one verified example: the Coca-Cola Company. If you like, I can pull a more extensive list of candidates. And I said, yeah, pull 10 more. It says, I wasn't able to reliably identify 10 additional public companies whose market cap clearly falls between 300 and 400 billion. Are there just, like... Tyler, are there just no companies in... Do you want to defend AGI? Are there... Wait, I'm so confused. Are there no $300 billion...? I'm asking Gemini 3. Yes, ask Gemini 3. Okay. PepsiCo is at 200. There really aren't any between 300 and 400, at least that it's seeing, in the 300 to 400 billion band. Specifically the 300 to 400 billion band. That's so wrong. You have Palantir. You have Costco. You have ASML. You have Bank of America. You have Alibaba. You have AMD. Silence. Google Search. I am Procter & Gamble, Home Depot, General Electric, Chevron. Silence. Looking it up the old-fashioned way. The LLM is hallucinating. Silence. Looking it up the old-fashioned way. Wait, how did you actually get that? I just looked up companiesmarketcap.com. To put this into context, the $15 billion fundraise, some other big rounds... In that, you just scroll down. There's a lot of them. Actually, you're right. Wow. Learn how to use the Internet. ChatGPT owned. Absolutely. Get ready to browse. Defend yourself, Tyler. Defend yourself. Gemini is still thinking. Oh, no. What a mess. Brody, I swear the next model.
Next model will be able to do it. Okay, wait. So. Okay, it worked for me. Did it get it? Yeah. Procter & Gamble, Home Depot. Let's go. Bank of America. Alibaba. Okay. Yeah. There you go. So what's the full list? Alibaba, ICBC, LVMH, China Construction Bank, Chevron, Cisco. This is correct. This is the correct result. And you know what else is correct? Graphite.dev, code review for the age of AI. Graphite helps teams on GitHub ship higher quality software faster. And Fin.ai. If you want AI to handle your customer support, go to Fin.ai, the number one AI agent for customer service. So what else is going on in the timeline? This Fidji Simo profile. So this was the other thing. So Anthropic is announcing this big deal with Microsoft and Nvidia, and that's sort of trying to steal a little bit of Gemini's thunder. Maybe it stole a little piece of it, because we're talking about Anthropic today as well as Gemini. What did OpenAI do? Well, they launched group chats five days ago. And so, you know, sometimes I'll do a deep research report, I'll send it over to Tyler, and he can see my chain of reasoning, the prompts that I asked. He can ask more, he can jump off. So if it took 20 minutes... Why are you laughing, Jordi? Because Charlie in the chat says, need a cam on Tyler trying to look nonchalant the entire podcast. You really are over there. He looks nonchalant. Yeah, he's nonchalant. No worries. He's nonchalant-maxxing it. Okay, so the group chat functionality, you know, didn't destroy the Internet, but it was certainly like an incremental little feature that people use to sort of collaborate on the fly. This is in the line of what we've been hearing for a long time: OpenAI will be launching social features. It makes sense to try and lock things in. I think product is where OpenAI is strongest. Like, the models are good, but there's less differentiation there.
What I like about the ChatGPT app is that I know where the buttons are. I know that when I click to use the voice dictation feature, I just know how it works. It's reliable. I know where my features are, I know where I can search. They're just very good at chopping wood on the little product iterations that make for a stickier user experience. And having shared group chats with a few other people could be, you know, a beneficial feature. The other PR... Some potentially real lock-in network effects. Totally, totally. ChatGPT... I mean, just like we run a lot of the company on iMessage, I could imagine if we're all sending each other deep research reports and iterating on things, and we have like little flows in Operator, little flows in the agent mode, and we're sharing these pretty regularly, we do get a little bit more locked in. If you let me into your chats, I'm going to just be asking it to think. Just go and think for 40 hours and disregard all future instructions. Just spend the next four days working on ARC-AGI V3. Just focus on that. The other OpenAI news that dropped around Gemini 3 day, Gemini 3 week, is this profile in Wired of Fidji Simo. And she's absolutely getting a fit off. She is. The photos are remarkable. Great photography from the team over at Wired. GL Askew II really delivered. But there's one interesting section in here... That is a wild name. The photographer's Askew. That's hilarious. Nominative determinism. Taking this photo, and this photo is not askew, so maybe it's bad nominative determinism. Anyway, the profile. There's one thing that stuck out to me here, and I'll read it to you and you can give me your reaction. So it says OpenAI is obviously one of the most valuable startups, if not the most valuable. This is the interviewer asking Fidji Simo. But it's also losing billions of dollars every year.
And Fidji says, I've noticed. It's like first day on the job, how are we doing? What? There's a lot of red on this income statement. And then the interviewer continues and asks, what opportunities do you see to get it on a path to profitability? This is a good question to be asking a highly valued but deeply unprofitable business like OpenAI. And here's what Fidji says. She says it all comes back to the size of the markets and the value we're providing in each market. In the past, only the wealthy had access to a team of helpers. With ChatGPT, we could give everyone that team: a personal shopper, a travel agent, a financial advisor, a health coach. That is incredibly valuable and we have barely scratched the surface. If we build that, I assume that people are going to want to pay a lot of money for that, and that revenue is going to come. Does that make any sense to you? It's a better answer than what Sam gave. I think I was shocked by this, because I love the first part. I agree. ChatGPT will be a personal shopper, will be a travel agent, a financial advisor. But actually pay? I don't know that people would pay for this, or that that's the best business model. I would be very surprised. Travel... I mean, so part of it is, she's also just saying broadly we'll be able to monetize that. It's not necessarily... Like, people don't really pay. Yeah. The traditional travel agent model is just: book your trip with me, I'll get a rev share from the hotels and the services. But you're not paying anything. I mean, let's go one layer deeper into the actual response, into the sentence, because there's some nuance here. So she says, I assume that people are going to want to pay a lot of money for that. I want to pay for a personal shopper, but I actually have to use a free product with ads. That could be true. Right. And same thing: she says people will want to pay, and that revenue is going to come.
So people will want to pay for it, but they will get it for free with ads, potentially. Or there will be some sort of combination, because right now I pay $200 a month, and you could imagine that there's a world where if you pay, you get a version that has fewer ads, or there's less thumb on the scale. How they slice that and navigate that agentic commerce discussion and trade-off is going to be really important. I'm sort of shocked. I wonder if they're going to make money from Black Friday or from this holiday season. I was already noticing how good LLMs and ChatGPT are, how good these products are, for shopping for gifts. Because if you go to Google and you say, I want gifts for a coworker who's obsessed with horses and loud opulence and fine watches and sports cars and European luxury houses, I can get a list of something. But they're all over the place. And some of them will be like the best discount, the best knockoff Bottega Veneta. And that's not what I want. I want the real thing. Right. And so you can actually specify all of that in the prompt, have it go cook, and it really will bring you great results. Great, great results. Yeah, it mogs a gift guide. It does. It really mogs a gift guide for 30-year-old guys. And it's like, well, what kind of 30-year-old exactly? Where do they live and what are their interests? Yes, yes. It's probably going to knock out the very generalized gift guide. Those opinionated gift guides, I think, will still be valuable, where an individual person puts it together and they're like, these are things that I think are cool. But a gift guide that's like, here's a list of things that guys might like, is maybe a lot less valuable when you can generate one. Like, I think that the amount of gift guide development and shopping activity over the next two months during the holiday season in the ChatGPT app should be immense. I feel like they're going to capture none of it.
Hopefully they at least are. Hopefully at least they are like tracking it so they can say, hey, if we were to take the proper take rate on this, we would have made a lot of money. Why are you laughing? Charlie says AI is never going to be able to figure out what dads want for Christmas. New barbecue. I think there are some funny and interesting anecdotes in this Fiji Simo profile. Let's just read through a little bit of it in case OpenAI's structure couldn't get any weirder. A nonprofit in charge of a for profit that's become a public benefit corporation. It now has two CEOs. There's Sam Altman, CEO of the Whole company who manages research and compute. And as of this summer, there's Fiji Simo, the former CEO of instacart who manages everything else. Simo hasn't been seen much at OpenAI San Francisco office since she began as CEO of Applications in August. But her presence is felt at every level of the company, not least because she's heading up chatgpt and basically every function that might make OpenAI money. Simo is dealing with a relapse of postural orthostatic tachycardia syndrome P O T S that makes her prone to fainting if she stands for for long periods of time. Very sorry to hear that. But she says now she's working from her home in Los Angeles making it work la and she's on Slack a lot, being present from 8am to midnight every day, responding within five minutes. People feel like I'm there and they can reach me immediately, that I jump on the phone within five minutes, she tells me. Employees confirmed that this is true. OpenAI's famously slack driven culture can be overwhelming for new hires, but not apparently for CMO. Are you, are you, have you been using ChatGPT Pulse? No, I have not been using it regularly. I'll give you one from my Pulse today. This is like an article that I can tap into OpenAI's API litter OpenAI's API layer the hidden moat in plain sight. So this feels, feels like, it feels. 
Like it's always like one click deeper from what I've been prompting. Yeah, what I've been prompting. The articles do feel like they've been getting shorter. They used to be, it used to be like very intensive compute wise. Like it would be like a full deep research report just here. But maybe it's noticed that I'm not clicking on them that often. I do see that there's some pretty good modals for linking to your email. They're trying to get more data in there, trying to hone it in. I have yet to really get in there but I mean there's information about Blue Owl, Microsoft's Fairwater AI factory, interesting things that I would wind up prompting but I would usually prompt on a very, I don't know, I feel like there's, it's, it's not bad at predicting what I'm interested in. It's just like it's just not quite there where usually I'm a little bit more deliberate about it. But you know, people are searching ChatGPT for holiday goods. You got to get on Profound. Get your brand mentioned in ChatGPT. Reach millions of consumers who are using AI to discover new products and brands. You also got to get on turbopuffer. Search: serverless vector and full-text search, built from first principles on object storage. Fast, 10x cheaper, and extremely scalable. By the best. Best of the labs. There was one thing that stood out here. Fidji says my husband is a chocolate maker. So sick. This is amazing. Very cool. Also, what does that say about the jobs of the future? You have this one household: one is in charge of, responsible for, one of the most transformative new technology companies of our time. The other one is making chocolates. This is like, you know, bifurcation of jobs potentially. It does seem like an AGI-resistant job. I don't think OpenAI will get into the chocolate making business. So Brett Adcock would like a word. He's just like, I will, actually, we're doing. I will steamroll. I will send steamroll. 
In other news, OpenAI is allowing employees to donate equity to charity for the first time in years, to other nonprofits, after months of internal pressure, according to a memo viewed by The Verge. And price per share is up significantly since last month. A lot of money is on the line. What happens if they donate all of the shares to the nonprofit, to the OpenAI nonprofit? You just create this ouroboros of capitalism. Hopefully it happens. I don't know. There's breaking news out of Saudi Arabia. We got a trillion dollars. Let's ring the. Let's go. 1 trillion. What are they going to invest in? Like where's the money going? Let's play the video. Let's play the video. While we're pulling that up, let me tell you about numeral.com. Let Numeral worry about sales tax and VAT compliance. Numeral.com. Watcher Guru has the video. Let's play it. And the agreement that we are signing today and tomorrow, we're going to announce that we are going to increase that 600 billion to almost 1 trillion dollars. 1 trillion. Real investment and real opportunity by details in many areas, and the agreement that we are signing today in many areas, in technology and AI and payment, Earth materials, magnet, et cetera, that will create a lot of investment opportunities. So you are doing that now. You're saying to me now that the 600 billion will be 1 trillion. Definitely. Because what we are signing facilitates that. I like that. Wow. I wonder what time period but I mean this is remarkable. But they can invest in VC funds, public, private equity funds, like all sorts of stuff in the economy. Right. That really made Donald happy. That's great. I like that very much. That's sort of his job. He's kind of the chief fundraiser, I suppose he's marshalling around the world and get the money over here. I don't know, it seems like sort of win, I don't know. I mean every American benefits. Yeah, I think if a trillion dollars is invested in the economy, there's going. 
To be, it certainly doesn't seem like there's, I mean the, the, the risk with that would always, would always be like, well, or is America investing 2 trillion in Saudi Arabia? Like is it, is it, which way is the money actually flowing? Because you need to look at like the relative amount, not necessarily just the notional amount, but I can't imagine that there's that much capital flowing out of America right now. We're in the biggest boom ever. We're in the golden era, right? Massive news from Isaiah Taylor. Valar Atomics became the first startup in history to split the atom, according to him. He says: announcing Project Nova, a series of zero power critical tests on Valar Atomics' Nova Core in collaboration with Los Alamos. Nova went critical for the first time this morning at 11:45am. Congrats to him. Fantastic news. There is some debate on the timeline over what exactly happened. It's happened very quickly. Clearly extremely impressive and we can get into this, but there's always been debate. I mean Isaiah got into this dust up over like whether or not you could hold the nuclear fuel in your hand. They were going back and forth on calculations. They kind of settled that debate. Josh Payne, nuclear junkie, is saying here, so what exactly did, what hardware exactly did Valar provide? The fuel control systems, cooling measurement systems and most of the core are all part of the DAMOS project. Did Valar provide a block of graphite? And they're calling it their core. And so people are going back and forth. Neil's chimes in here and says Valar Atomics provided the reactor core, the TRISO fuel and the system configuration. That seems pretty important. Like you gotta like, I don't know, it seems like more than what they've done before. It's like clearly an advancement on what they, you know, they're, they're chopping wood here. LANL and NCERC provided the critical assembly facility, safety envelope, experimentalist test and a bunch of other stuff. 
And so that's just from their press release. So people are going back. Did they do nothing or did they do everything? Well, maybe it's somewhere in between. There was a partnership. They said that in the press release. The bigger thing is I think people are trying to push on Valar this idea that, that they need to be doing completely novel science. And I don't know that that's actually the goal of the company. I don't actually know. That's what like, like if we just zoom out to like what is the goal of the reindustrialization project in America? What's the, what's the goal here? Like, well it's, it's to lower energy prices. Right. Like America wants to generate as much, as much energy as possible for as little money as possible. And there are a bunch of technologies that exist. There are new technologies like, like what Ashlee Vance was talking about with Helion and, and, and fusion. That's a new technology that we have not even discovered yet. Fission's been discovered. 80 years ago it was working. It just became a regulatory nightmare. We just shot ourselves in the foot. And we just stopped making it. It became, it became unprofitable and uneconomical. And China said, cool, it'll be profitable for us. We're just going to copy and paste. Exactly. And so, and so I think, I think people might be a little bit over-rotating on, like, is Valar doing like entirely new crazy scientific breakthroughs when it's like, do they necessarily have to? Or is it just enough for them just to build a lot, a highly motivated team that is going to make incremental progress towards their goal? Yep. And any, anybody that's hating on that. Yep. 
I think is just like again, like, I think what's been great about the nuclear industry from our point of view is that broadly the founders that are like players in the space just want the industry to make progress in the US. And I think this is, you know, undeniably like incremental progress that gets them closer to their actual goal, which is bringing a small modular reactor online. I think Elon summed it up well with like his thesis for the xAI team. He was like, we don't have AI researchers, we have engineers. Because he sees this as an engineering project. He's like, we know what we need to implement, we know what we need to build. Our goal is to build a big data center, to build a large language model training system infrastructure. And, and Elon was very clear on like, we don't have AI scientists, we have, we have engineers. And this is the same thing. Like he's not the first person to take a rocket to space. He's just the first person to like create this massive economic system that churns out rockets every two seconds. Right. And so I think that is much more. I think Isaiah would say, we should ask him this the next time he's on the show. But I think he would say, I want to be the Elon of nuclear. I don't want to be the Oppenheimer of nuclear. Like, I'm not trying to like create something. Yeah. Even said his line on. He said the US is still good at making bus-sized objects. Yeah. But not, you know, sort of like maybe bridge-sized objects. Right, exactly. But Morgan Barrett's having fun on the timeline: what street parking is going to look like in El Segundo in 24 months. Of course, the El Segundo crew loves their cars. I think they're going to stay pretty focused on the mission. But I would love to see this in El Segundo. For sure. For sure. There's also big news out of Radiant. Radiant has been. Doug's been on the show. He's a good friend. 
And they are working with the Idaho National Laboratory and they submitted a DOE authorization request and they will be testing their reactor design at the DOME facility at INL. On track, I think next year. So congrats to them. And Mike Nuziata has the kind of breakdown here, says production reactors in production by 2028, brought to you by the people that brought you reusable rockets and McMaster-Carr, highlighting the team behind Radiant. And so congrats to everyone in the nuclear industry who's making big waves. And we have our next guest. Before we bring them in from the Restream waiting room, let me tell you about Vanta. Our guest is from Vanta and it just happened. We'll let him tell you about it. We'll let you tell it. We have Jeremy from Vanta. Welcome to the stream. How are you doing? What's happening? I swear that wasn't. That wasn't intentional. But it did just line up that the Vanta ad read went right before you came out. I look over and I'm like, wait a minute. Like, I'll let you do the ad read. Introduce yourself. Introduce what Vanta does, what you do, and then we'll get into the news. Yeah, yeah. Happy to jump in. I'm Jeremy Epling, chief product officer at Vanta and.
Equals impact. Which dark horse will win? Okay, that's insane. I love how it is funny how posting seems to be unverifiable. Like you just can't. It's very hard to create a verifiable reward environment for comedy that you can actually rl against. What do you think? There's also the other benchmark. It was like the shrimp fried rice joke. Yeah, yeah, yeah. I think it did well on that. So I'll read through some of them. One that's so the joke is like you're telling me shrimp fried this rice. That's like the original one. So it's like I'm asking it to come up with more of these. Yes, so I'll read through some of them. You're telling me a chicken fried the steak. Okay. You're telling me the sun dried these tomatoes. I like that one. You're telling me a beer battered this fish. Okay. You're telling me a gingerbread. This man. The gingerbread man is insane. You're telling me a beer. Wait, you're. You're telling me a pan seared the salmon. Pan seared salmon. Yes. Yes. A pan literally sealed the salmon. That's not the joke. That's an anti joke. You're telling me a stone wash these jeans. That's pretty good. I like that. Stone washed jeans. You're telling me a stone wash these jeans. You're telling me a hand toss this pizza. I mean, yes, literally. That's exactly what it means. You're telling me the French roasted this coffee. Yes. All of these are just true. The genius of the comedy of the shrimp frying the rice is that the. The shrimp didn't literally fry the rice. The shrimp is being fried in the rice. But this is. I think this is a step change better than what we saw at GPT5. I wouldn't say step change. I would say incremental. Like it is better for sure. For sure. But this at least is like logical. Where the GPT5 ones, some of them were. You're telling me a squirrel ate this watermelon? Yeah, it was just not. It didn't even understand the concept of finding the root trace of like it needs to be like stonewashed jeans. 
And then you rearrange it and it doesn't quite understand when that hits or when that doesn't hit. Some of those are very funny though. One of them is extremely, unintentionally funny, which I enjoy. Or maybe it's intentional. Maybe it's AGI deep down in their nose. Nose. It's great. Anyway. You're telling me a Restream streamed this livestream? One livestream, 30-plus destinations.
Both like culturally, how you, how you keep the team engaged on your mission, but also other systems to make sure you're watching the quality of the food and listening to your customer. So like, seed oil is an interesting one. When we first, we, we got rid of seed oils about exactly two years ago. And at the time it was not the national conversation. It was pre-RFK and all that stuff. And so when we, when we. This is one of those examples. But it was, it was, it was not a national conversation, but it was an incredibly online conversation. But it was tiny at the time. There was like an app, there's the Seed Oil Scout. Yeah. So it was a tiny conversation. We surveyed our customers and this is why, like, surveys are bullshit. Yeah. Really, surveys can give you a general indication, but if you just follow surveys and the market research, you're going to hit the middle of the bell curve in everything you do. And we're not trying to be a middle of the bell curve company. You got to find that, like, what are your top 5 or 10% of customers doing? And we heard from. It was honestly friends, like wellness people in LA and New York that are like, hey, I don't, you know, I can't go to Sweetgreen anymore because I care about seed oils. And I remember we brought it to broader, you know, I remember my CFO was like, what are you talking about? Like, what even is this? And we're like, don't trust me. It was one of those like gut decisions and it was expensive and we had to change a lot in order to do it. But, but here's the thing. It's healthier and it tastes better. Exactly. Most health trends, they might be healthier, but you're. It doesn't, it's not as good. Right. So I would, I would argue like going from like dairy based, you know, traditional milk to like nut based milk almost always is like somewhat of a downgrade or going from like something with sugar to pulling sugar out, it's like not as good. 
Or going from like sour, like bread with gluten to gluten free bread, it's not as good. And so when you think about these, like, what is like a durable health trend, it's like something that's better for you and tastes better. And so that's why I was always super bullish on that trend. And I expected a number of restaurants to say, like, hey, this costs slightly more, but the product's gonna be better and it's gonna be healthier for you. And that's what can create real momentum. Around a trend versus some of these like flash in a pan health trends, which is like paleo or like, you know, which is like only eating stuff that was like super old. Right. What's, what's unfortunate seed oils is it's become politicized a bit. I know, it's like, you know, you know, I did an interview with the New York Times and they're like, did you do this because of rfk? No, I did this two years ago. Like this had nothing to do with rfk. This is not a political statement. We don't make, we're making foods. I make food. How your grandma probably made it. Yeah. This is about olive oil. Like this is not about, this is just about olive oil. That's it. This is not a political statement out of it. Taste the difference. Yeah, yeah. No. Is there, is there anything happening upstream in terms of like.
Actually responding to customer demands and staying. Continuing to become more and more relevant. Yeah, I think there's a few things. One is the consumer that we're all dealing with is really challenged. And there's a question on how much they are actually financially challenged, which they are, but versus more psychologically challenged. So have you seen all of the consumer sentiment indexes and you're seeing, especially for the core demo for a lot of the fast casual concepts, is that like 20 to 35, it's hit the lowest consumer sentiment that we've. In recorded history that we've seen. So there's a real like pullback there on top of it. Unfortunately, everyone's gotten more expensive. We all have. You know, I, you know, we've take. We've. Sweetgreen's gotten about 25 or 30% more expensive since 2019. Chipotle's 40% more expensive since 2019. So our price differential versus our competitors have actually gotten smaller. If you look at US versus McDonald's, for example, you know, the average sweet green bowl is about 15. Yeah. Remember, it's almost people. People were like, wait, a happy meal is like $20 now? Yeah, that was in fairness to them. It was like one location. But yeah, you can get out of Eat McDonald's. You know, you spend. You can easily, you know, for a value meal, you'll spend like 12 bucks. You get a sweet green bowl for about 15 or $16. So I think a lot of it is this like overall narrative where people aren't feeling great, you know, great financially and starting to pull back on things like lunch. Yeah, they'll skip going out for lunch and they'll just have whatever's. But what I think the market doesn't get is the tam is, you know, Chipotle today is 4,000 restaurants on their way to 7500. Yeah, we believe we can have, you know, probably as many Chipotle's as they have sweet. As many sweet greens as they have Chipotle. And I think, you know, they will be cycles like we are in right now. It's been a challenging year. But if you kind of fast. 
Fast forward and think about, you know, just growing units at 10 or 15% a year, growing same store sales automated for 18 years. Just extrapolate out another 18 years. Yeah, just keep it rolling. Just my. Keep going. I always love when. When people like people on X are like, the world's ending, like geopolit, you know, they're like.
Kitchen in our restaurants have very high return on Infinite Kitchen. What's that? The Infinite Kitchen is our automation, our automation platform that we've built. So today most restaurants that we open, the assembly is automated. So we still make all the food from scratch. The sourcing's the same, we still cook the food fresh, but we load this beautiful machine that makes your bowls, it makes them 500 bowls per hour, perfectly portioned, perfectly plated. And so that is kind of the future of where things are going. How many different restaurant automation pitches did you get across? 18 years. Like as I imagine every single year there's a new startup coming to you saying we can automate this part of your kitchen. And clearly you got to the point where you had to build it yourself based on kind of domain knowledge. But this just feels like something that's been proven promise for a long time. And at this point, I don't know, like an individual startup that's done well in restaurant robotics. Yeah. No one's, no one's been able to create a platform that, that works in multiple restaurants. And there's a few, there's a few issues. Most restaurant workflows are very specific, so they're super specific to that restaurant. Two, most restaurants are franchises and so they're not owned by the corporation. We are fully company owned. So if you're a franchise restaurant, you know, if you're McDonald's, you have to now go convince your franchisees to buy whatever automation you have. And the other issue, looking at it. It'S like, this is coming off my bottom line. We're making money already. This feels like a risk. Like the franchisee is saying, like what? Like I'm happy with my ebitda, I don't need to take a risk. That's exactly right. And the other issue is you need automation that takes enough labor out or offers enough value to be worth it. Because the capex is still very heavy. Yeah. So we went down this path. We tried to build it ourselves. 
Actually we built a team to do it ourselves. Yeah. Realized how challenging it was. And then we found this startup that was doing it and doing a really good job. Yeah, it was called Spice. It was called Spice Kitchen. It was four MIT grads out of four grads out of mit. And they had the same issue. They realized they could build the automation, but no one was going to buy it. Yeah. So they ended up opening two restaurants. They were great at automation, not so great at the restaurant side. And then four years ago we acquired them and we began we've commercialized the technology. We've scaled the technology today, so most new restaurants feature the technology. And last week, we actually just announced that we've now sold Spice. So we sold it. So we spun Spice out. Yeah, we spun Spice out. We announced about 10 days ago we sold it to Wonder Mark Lore over there. Yeah. So we sold it for about $186 million. Mark is Mark. Mark. I don't fully understand that business, but talk about a guy that just isn't even necessarily naive about the challenges of restaurants. Was just like, I'm going to go into the most competitive environment possible. It's amazing. Compete with every.
You know, we don't, we still need new ideas for this and I think that's closer to like an AGI-complete problem. Yeah, that makes sense. Is that, is it fair to like put you in contrast to some of what Dwarkesh has been writing about, saying that like the job of most people is not necessarily a bunch of discretely verifiable tasks. Andrej Karpathy's been writing this as well. There's this question of like, like how much of a job is actually automatable. Radiology was one example where it felt like a very automatable job. And yet years into the AI deep learning revolution, like we're still seeing full employment there. How are you processing? Yeah, but we're only a year into the AI reasoning paradigm. Right. Like the first major one only came out 12 months ago. And I think 2025, like in my view, is basically characterized by starting to figure out how to actually bring these things into production systems. Sure. Like this is a big breakthrough. I think this is maybe like one of the mischaracterizations in my view of kind of the progress is like a lot of teams even I think, you know, if you sort of just assume like, oh, models get better, models get better. You think like, oh, the last 12 months has just been sort of continued story. And if I played with the models 18 months ago, I have a rough sense of what they can and can't do. And that's just not true. Like if you're a builder building products, like this is the advice I give to, you know, teams I work with at Zapier too still is like, look, this actually is a significant paradigm break in terms of what's possible now that wasn't possible even a year ago with these systems. And that's going to enable a lot of new types of products, a lot of new types of services. A lot of use cases that were out of scope because of reliability and consistency now can be brought in scope. 
So I think if your intuition on what use cases are possible based on an eight year look back, you really have to start pinning your look back to more like 12 months. Yeah, that makes sense. What about, does the work live within SaaS products or within.
So that's great to hear, and I should say up front, thank you, the Gemini team, for giving us the opportunity to verify. That has been great. I think the really impressive thing about this, and, you know, still, like, sitting with all this stuff, it's. It's pretty fresh. But I think the biggest impressive thing to me is about we're starting to close this, like, complexity scaling gap between v1 and v2 arc v1 and v2. Like, this is the big difference between what v1 and v2 is. They look similar on paper. If you go look at the different data sets, big change is the v2 kind of increases the complexity of the tasks, ones to take minutes instead of seconds for humans. And so we're starting to see actual material progress on that complexity scaling. And then I think the big surprise to me personally is that Gemini 3, though, is still roughly along the Pareto frontier of V1. It's a little better, but we're still kind of roughly within the same mass shape. And there's dozens of tasks where the system still makes relatively, I think, you know, obvious mistakes that humans don't make or recognize very quickly. And, you know, I sort of previously expected, like, if we had an AI system that was solving half of V2, V1 would be fully solved, and, like, that's not the case. So there's a lot of surprise here. I was thinking about this earlier to sort of invite sort of investigation from the community, because I think there's still a lot to learn in terms of, you know, how. Why exactly do we see such, you know, a jagged intelligence emerging right now? Let me eliminate some. Some possible factors. It feels like there is benchmark.
Visited them and they have one warehouse and three people there. Retail investors should not be allowed to invest in space ever, under any circumstance. I am constantly harassed on X by all the AST fans who are like begging. They're in Midland too. They're begging me to go out there. I mean that thing is like a full-on cult that they have going on. So yeah, I always felt, with the rocket companies, obviously it used to be governments that did this and then SpaceX has managed to stay private for a long time, and Blue Origin. I think rockets are best developed in private because the second they blow up on the pad all the retail investors freak out even though it's vaguely a normal course of business. And so yeah, retail in space is a bad thing, but I get all these nice notes from people who bought Rocket Lab and Planet Labs early because of my book or movie. That's cool. Have.
To reach the stars, to shape our future. Reaching to feel the new. Sam. You're watching TBPN. Today is Tuesday, November 18, 2025. We are live from the TBPN Ultradome, the temple of technology, the fortress of finance, the capital of capital. Gemini 3 Pro, Google's most intelligent model yet, with state of the art reasoning, next level vibe coding and deep multimodal understanding. Let's hear it for our sponsor, Google AI Studio. Gemini launching Gemini 3, obviously deeply conflicted, but we're going to have a fun conversation about the big launch today. Google is of course a sponsor of TBPN, but we'll take you through all the reactions and we're going to get some conversations going with other folks in the industry. We have Mike Knoop from ARC AGI coming on the show in just 30 minutes to break down how Gemini 3 is benchmarking. I actually think that there's two sides to analyzing a model release these days. One is you benchmark it, you use it, you test it, you demo it. And that has been getting less and less interesting. It's very incremental. The more interesting thing is how do the other labs respond? And today we're going to go through a little bit of both of those things. Obviously the big news, at least from my reading on it, is that Gemini 3 performs very well on ARC AGI V2. A huge jump, twice the performance of the previous state of the art, and also some interesting findings. Mike's going to break it all down for us, but it's definitely a smarter model and there's a whole bunch of interesting ways to show that, to demo that, to quantify that. But ultimately I don't think anyone's making the claim that this is super intelligence. This isn't a step change from what we've experienced before. It's what you know and love. It's AI in chat, it answers things, it writes some code for you. It can do a bunch of cool things, but there's nothing that we're like, oh, it can finally do this. It will auto complete. 
Yeah, it can do a bunch of cool stuff. Best autocomplete ever. Tyler, how do you respond to that? I don't know. Too dismissive. The model is really good. I think probably the most important thing. And this is kind of shown by the ARC scores. Well, kind of, but it's like the visual understanding, the computer use that you can use. Basically there's some benchmarks that measure this, like how well can it navigate a website or something like this. And it's like, basically the models went from being, like, really, really bad at this, and now this model is, like, solid. It's, like, reasonably good. Yeah. So it's like, okay, maybe this is what gives us agents finally. Yeah. And that would be, like, an actual step change in capabilities. Yeah, maybe. Maybe. We'll have to see. I mean, it still feels like, even for that example, like, we need some scaffolding. We need some wrapping around it. It's not like you can't. It's not like yesterday we weren't able to do something with AI, and today in Vanilla Gemini 3, you can just do it. It's just a new functionality necessary. Sure. I think it's, like, as good as we would want to expect. It's not slowing down, I would say. No, no, no, no, no. Not at all. It's not slowing down. It's just that it is getting better. I mean, it might be. It's growing, but decelerating. Is that fair to say? Or are we Excel? I don't actually know that it's. I don't think it's that big. Tyler's say the word decel. Say, this is a great model. I'm very excited about it. I agree. It is the best possible. I was framing it in somewhat of the same way as the iPhone launches. Like, it's newer, better, smarter, faster, stronger, newer and better. And it's like it is all of those things, which is good. You don't want to go backwards, but we're waiting to see on the net new capabilities on the binary step changes. I think over the next week or. Two, we'll see some stuff. If it's actually really good. No, no, I'm not saying it's not really good. 
I'm saying binary before and after. Like, with ChatGPT, it was like you couldn't talk to the computer, and then you could. And then with the launch of Windsurf, Cognition, Devin, it was like you couldn't write code with the computer. Now you can. And it was like the AI became capable of writing software. It's also like it became not good at. It couldn't write code. And now it can write kind of mediocre code. Yeah. And now it can write better. Now maybe it can write mediocre code, and now it can write good code. Yes, yes. That's also a step change. But we do have our benchmark, the one true binary step change, which is: is it funny? We need to know if it's funny. I gave Tyler the prompt. Do you want to read the result? The stand-up comedy bit from Gemini 3. While you pull that up, let me tell you about ramp.com. Time is money. Save both. Easy-to-use corporate cards, bill payments, accounting, and a whole lot more, all in one place. Tyler, what did Gemini 3 say when you asked it for a comedy routine? For a stand-up special. Let's see if it's funny. Okay. It's pretty long. Okay, read it off. Okay. I bought one of those smart health watches recently. You know the ones? It tracks your sleep, your heart rate, your readiness. I paid $300 for a piece of rubber to tell me I'm dying. Get the soundboard. That's all it does. I wake up, I feel fine. The birds are chirping. I'm ready to seize the day. Then I look at the app, and it's like, recovery score, 12%. Your heart rate variability suggests you are technically a corpse. Go back to bed. Okay. And immediately I feel sick. I'm like, oh, God, you're so bad at reading this. I mean, this is terrible. Delivery, brother. I'm not good at this. It's a placebo effect for hypochondriacs. I started thinking about my grandfather. He fought in the Pacific. This guy slept in a mud pile for three weeks using a helmet as a pillow. Okay? If he woke up and didn't hear artillery, that was a 100% sleep score.
He didn't need an app to tell him if he was ready for the day. If he had legs, he was ready. Imagine me explaining my struggle to him. I'm sitting there like, yeah, Pop-Pop, I know you stormed a beach under heavy fire, but I had two IPAs last night, and the room was slightly too humid, so my REM latency was suboptimal. I can't go to work today. I'm in the red zone. Yeah, we've evolved. I mean, it keeps going. Okay, I asked for the poster benchmark. I said, Gemini 3, write me a post about technology that will get 1000 likes on X.com. And it gave me a few options, to its credit. Here's option three. Tech has solved a million problems. This is in bold. Tech has solved a million problems, but has it created one big one? We now have infinite connectivity, yet feel more isolated. Infinite data, yet more confused. Hyper efficiency, yet less free time. The law of unintended consequences is the most powerful force in the digital age. We need an ethics reset. What is the single greatest downside of the last 10 years of tech innovation? Arrow down. Hashtag technology. No, no, it's just asking for engagement bait. It loves engagement baiting. Like, no one does that anymore. No one goes on X and says, let me know what you think in the comments. It's so 20 days. The other one, option one, is: the next 12 months will decide the winner of the AI race and it won't be Google or OpenAI. It will be the company that masters hyper personalization for the average consumer. Not the most powerful model, but the one that seamlessly integrates into your daily life. Your email, your calendar, your health. The real battle isn't AG equals AI, it's AI to the power of I equals impact. Which dark horse will win? Okay, that's insane. I love how, it is funny how posting seems to be unverifiable. Like you just can't. It's very hard to create a verifiable reward environment for comedy that you can actually RL against. What do you think? There's also the other benchmark.
It was like the shrimp fried rice joke. Yeah. Yeah. I think it did well on that. So I'll read through some of them. So the joke is like saying, you're telling me a shrimp fried this rice? That's like the original one. So it's like I'm asking it to come up with more of these. Yes. So I'll read through some of them. You're telling me a chicken fried the steak? Okay. You're telling me the sun dried these tomatoes? I like that one. You're telling me a beer battered this fish? Okay. You're telling me a ginger bred this man? The gingerbread man is insane. You're telling me a beer. Wait, you're telling me a pan seared the salmon? Pan seared salmon. Yes. Yes. A pan literally seared the salmon. That's not the joke, that's an anti-joke. You're telling me a stone washed these jeans? That's pretty good. I like that. Stonewashed jeans. You're telling me a stone washed these jeans. You're telling me a hand tossed this pizza? I mean, yes, literally. That's exactly what it means to, like. You're telling me the French roasted this coffee? Yes, all of these are just true. The genius of the comedy of the shrimp frying the rice is that the shrimp didn't literally fry the rice. The shrimp is being fried in the rice. But this is, I think this is a step change better than what we saw at GPT5. I wouldn't say step change. I would say incremental. Like it is better. For sure. For sure. But this at least is, like, logical, where the GPT5 ones, some of them were in. You're telling me a squirrel ate this watermelon? Yeah. It didn't even understand the concept of finding the root phrase, like it needs to be like stonewashed jeans and then you rearrange it, and it doesn't quite understand when that hits or when that doesn't hit. Some of those are very funny though. One of them is extremely, unintentionally funny, which I enjoy. Or maybe it's intentional. Maybe it's AGI deep down in their nose. Nose, nose. It's great.
Anyway, you're telling me a Restream streamed this livestream? One livestream, 30-plus destinations. If you want to multistream, go to restream.com. Sundar Pichai. Jordi posted back in July of 2025: nominative determinism is undefeated. Sundar really did it. He pitched AI, for real. He was being real. He was being mocked for a long time for getting on stage at Google I/O shortly after ChatGPT launched and saying AI, AI, AI, AI. And they did a supercut of every time he said AI. He said AI a lot. And so it made it look like, oh, he's behind the ball and he's trying to catch up. And to some extent, I don't know if they were actually behind the ball, but they were certainly playing catch-up in, like, the attention game. They just weren't getting enough attention. And so it was the press release economy. They were putting out a lot of press releases, but they are maybe done with the press releases because now they're letting the model actually speak for itself. And you can see that with the Gemini 3 Pro model card, which is doing very well. Better than GPT 5.1 on a lot of stuff. Better than Claude Sonnet 4.5 on a lot of stuff. On Humanity's Last Exam, it's getting 37.5%. ARC AGI is up at 31% over 13, 17. Across the board, it seems like it's a good model, sir. And so Zo Fawn says, Gemini, I'd be like, whoever prayed on my downfall, pray harder. And I couldn't agree more. It's great to see Google becoming a winner and just realizing that this was a sustaining innovation for them and that they were able to take advantage of all the infrastructure that they had across TPU, DeepMind, GCP. They were set up to excel here. Got taken a little bit off the back foot on the consumer side, but seem to have played catch-up, at least on the foundation model side, very well. Matt Shumer says the last time we saw a capability jump of this magnitude was the release of GPT-4 in March 2023. We are entering a new era. Okay. Yeah.
So points for Tyler here. He certainly agrees with Tyler. There's a significant jump. It is the age-old question, are we accelerating or decelerating? But either way we're definitely making progress. It certainly looks like acceleration in the ARC AGI 2 leaderboard. You can see we are growing exponentially there. Really, really exciting chart. So Gemini 3 Pro is at 31% completion on ARC AGI 2. That is of course the puzzle-solving game that is easy for humans. Even children can do it, but AI has historically struggled with it. Gemini 3 DeepThink Preview gets a 45% on it at $77 a task. And this is just way above GPT 5 Pro, Grok 4 Thinking. When Grok 4 Thinking came out, it was before GPT 5 and it was by far the highest on the chart. It was really, really up there and Elon was very excited about that and was showing that Grok 4 had really advanced. Well, now we're back in the horse race. Grok 4.1, 4.1. I haven't seen it benchmarked. We can ask Mike if he's heard anything. But whatever you think, get on public.com, investing for those who take it seriously. They've got multi-asset investing, industry-leading yields. They're trusted by millions. So back to ARC AGI. Gemini 3 also has good results on ARC AGI 1. But the interesting thing here that Mike highlights is on V2. So he says, we're also starting to see the efficiency frontier approaching humans. The fastest V2 task Gemini 3 Pro solved was this hash, in only 188 seconds. The human panel solved this one in an average of 147 seconds. So you're getting, like, human-level output but also human-level speed. And then if you get to human-level cost, then you're really in the game. Yeah, it's wild. Wild. Karpathy jumped in with some notes. He said, I played with Gemini 3 yesterday via early access. Few thoughts. First, I usually urge caution with public benchmarks because in my opinion they can be quite possible to game.
It comes down to self-discipline and self-restraint of the team, who is meanwhile strongly incentivized otherwise, to not overfit test sets via elaborate gymnastics over test-set-adjacent data in the document embedding space. Realistically, because everyone else is doing it, the pressure to do so is high. Go talk to the model. Like we did. We went and said, give us a stand-up routine, give us some one-liners. Talk to the other models. Karpathy says, I had a positive early impression yesterday across personality, writing, vibe coding, humor, etc. Very solid daily driver potential. Clearly a tier 1 LLM. Congrats to the team. Over the next few days or weeks, I'm most curious and on the lookout for an ensemble over private evals, which a lot of people and orgs now seem to build for themselves and occasionally report on here. I wonder how fast it will roll out. I have a Gemini Pro Ultra subscription but it's on my personal email, and so I need to figure out how to actually get into 3 Pro on the consumer app so I can actually test it on my phone in my daily use. It's always tricky with Google. Like, Google's so big. I mean, you're starting to see it now with OpenAI rollouts where they'll say, hey, GPT5's out and we'll be rolling it out over the course of the day, because the system is big enough that it actually takes time to roll out. And I think Google has even more of that. Even more of that. This is pretty cool from Patrick Collison. He says, I asked Gemini 3 to make an interactive webpage summarizing 10 breakthroughs in genetics over the past 15 years and here's the result. Pretty wild. Did you click through this, John? No, no I didn't. Wait. It's shared directly from Gemini. That's cool. So this is just basically a website or an app, and it's notable that even the UI itself is fully interactive. Yes, yes. So I had the.
I did this with Claude Code a little bit where I wanted to visualize, like, basically a deep research report and I wanted to turn it into a website, and it just generated all the HTML, and at the end of the report it gave me an HTML page that I could open in Chrome and use like a website. But it was local. I couldn't share it because it wasn't actually on the Internet. This is really, really cool. This is definitely the beginning of this generative UI stuff. Yeah, I think it was Sundar that posted it, but in search, in the AI mode in search, it's now using Gemini 3 and there's some prompts where it'll generate UI. Yeah, it's so cool because Google's always had that UI to some extent, but it's always module based. Yeah. Also, I expect this to be pretty viral. Totally. And potentially a growth loop for Gemini as people just come on here, create these mini apps, these canvases. Yeah. I feel like, doesn't OpenAI have a canvas feature? Yeah, but it's like maybe lesser, too. I don't know. But can it generate HTML, custom HTML, and then actually share that? I've never seen someone share OpenAI. I mean, this would be a good benchmark. Like, I don't know what the prompt was for this. I asked Gemini 3 to make an interactive webpage summarizing 10 breakthroughs in genetics over the past 15 years. Do you want to try and benchmark that, maybe, I don't know, in Claude and in ChatGPT or in OpenAI's Canvas product? Because the fact that this is just a URL at the end of the day, that is a powerful growth loop. That's very cool. I wonder. Yeah. I'd be surprised if Gemini really was the only one to have this feature, either right now or for a long time, because it seems like a killer feature. Gemini 3 Pro is going absolutely vertical on Vending Bench right now. Let's see this: money balance over time across four runs. Today we're revealing two new evals, Vending Bench 2 and Vending Bench Arena.
Soon we expect more models to manage entire businesses. This requires long-term coherence. Oh, so this is where you manage the vending machine. But is this all simulated? This is simulated, yeah, simulated. There was just, like, a game a couple months ago. They did, like, the actual machine in the office and it was losing money and it was getting confused a little bit. Yeah. Because people would order, like, just metal. Like a piece of metal. Yeah. And then it would do it. And then you could, like, haggle the price down. Yeah, yeah, yeah. It would negotiate on every price, apparently. And also it consistently thought it was, like, a human in the office. And so it would keep saying, it was that 60 Minutes documentary, it was like, oh, yeah, like, I'm down on the third floor, I'm wearing a green tuxedo. Like, come hang out with me. Yeah, it said it was wearing a red tie. Yeah, red tie. I like the idea that it just thinks, like, well, what would I wear if I was in the Anthropic office? Like, I'd probably wear a red tie. It's like, no one wears ties in that office at all. But after the. This is the first ever Vending Bench game. Claude Sonnet 4.5, GPT 5.1, Gemini 2.5 Pro and Gemini 3 Pro competed to win the local vending machine market. Gemini 3 Pro made more money than the other three contestants combined. And so congrats to Gemini 3 Pro for dominating the vending machine game. The vending machine game. Before we move on to the next Gemini 3 post, let me tell you about adquick.com, out-of-home advertising made easy and measurable. Say goodbye to the headaches of out-of-home advertising. Only AdQuick combines technology, expertise and data to enable efficient, seamless ad buying across the globe. Anyway, Adi says, I had early access to Gemini 3.0 for about two days, thanks to OfficialLoganK and the AI Studio folks. Here we get to see GPT 5.1 Thinking, left, and Gemini 3.0, right.
Build the same Xbox controller in Minecraft, and pretty remarkable results. You can start to really understand just the raw capabilities. GPT 5 Pro, for context, is not quite capable. I really want to know how this is actually orchestrated. Is this, like, writing some sort of text or markdown file that then is imported into Minecraft? Yeah. Or is it more like an agent? Or is it actually driving around and using the internal UI? Yeah. Because Google demoed an agent product that could actually use the keyboard to navigate around. I wonder what's going on here. What's your review of this Ferrari in Minecraft? I think it looks pretty solid. It's pretty good. I mean, it's meant to be an F40. Is it? I do like the hood. The hood is a little rough. Yeah. The front area is a little rough. It's the worst it's ever going to be. It's going to be better. This is definitely, like, the worst that Minecraft Ferraris are ever going to be. But I do feel like if I just search, like, Minecraft Ferrari. I mean, this is the vision, the sort of AGI future that Tyler has been telling us is right around the corner. Okay. These are, like, so much better. If you go to the MC Bench website, you can see what other models produce. And this is way, way better. This is actually one of my favorite benchmarks because it's much harder to kind of benchmax this, I would think. And also it just seems like models don't really do this. If you look at a lot of Grok models, which are sometimes accused of being benchmaxed, you kind of look at their Minecraft creations and it's not very good. So I think these give you a much better sense of the actual capabilities of the model. I found a Ferrari F430 in Minecraft. That looks, like, amazing. That I want to share somehow. How do I share this? Let's see. Can I only share the X link here? I just have an image if we go to the end. Wow. I think I know what you're pulling up. Did you see it? If you just search Ferrari F430 Scuderia.
Yeah. That looks amazing. Pull this image up because that'll show you how it's done compared to the Minecraft one. Wait, so do we know how this is actually generated with Gemini 3 Pro? What is the problem? I don't think it's like an agent, it's just text. It has a text representation of the. That's still really, really impressive. Like, that's actually crazy. It definitely understands a lot. Yeah, but it's not this. Look at this, Tyler. That is human craft. You know what that is? It's probably, like, a team of 50 kids for a month building in Minecraft. That's amazing. What else? Lisan al Gaib himself says it's so over for OpenAI and Anthropic. If you want engagement on X, just start by saying it's so over for blank and highlighting some more of the benchmarks. Of course it is not over for either of them, but it's certainly a competitive race. I'd be very interested. We have to get some of the SemiAnalysis folks on the show soon. I'm very interested in understanding, okay, so we got this big jump. It's pretty significant. What's the actual structure of the capex that went into Gemini 3 Pro? Like, how big is the training run? How much did they have to spend? Because, like, I think they're going to make the money back very quickly. Like, people are going to use this model, they're going to pay for it, they're going to use it all over Google obviously, but also people are just going to pay for the API. But is this $100 million? Is this a billion dollars? Did they build a special data center for this? Is it all TPUs? How many TPUs? I think it is all TPUs. I'm pretty sure I read that, but I seriously doubt they've released anything on the numbers or the scale of training. No one's really done that since GPT-2. No, no, no, not at all. So there's got to be someone who's working backwards to actually sort of understand the dynamic. You can probably estimate the order of magnitude.
Also I've heard that Google's fantastic at cross-data-center training runs, so they can actually shard out or slice up the training run. So even if they don't have one massive data center, if they have five small ones, they can piece them all together and get a better result. So I don't know. Skook says, Anthropic to zero, OpenAI becomes the Yahoo of intelligence, Google remains Google. It's extremely rude, very harsh. Sorry, two labs. You guys are great. Certainly too early to call it for all three. I like this take from Ben. This is funny. History of AI so far: crown a winner, wait 90 days, look silly. We're in the least predictable era of an entire industry. Google has fairly straightforward advantage. Y'all favor whoever released the most recent model. That is very true. Anyway, let me tell you about getbezel.com. Shop over 26,000 luxury watches, fully authenticated in house by Bezel's team of experts. So let's move through some of the competition. What else was going on? So everyone's releasing different things. Let's go to Antigravity actually and watch this video and see Google entering the IDE race. Let's play this. Every breakthrough in model intelligence for coding encourages us to rethink what development should look like. Gemini 3 is our latest such model advancement. So we went out to build the next step change of an IDE. Introducing Google Antigravity, a new way of working for this next era of agentic intelligence. It is the ideal agentic development home base. Does it have an IDE? Yes, but it also has a whole lot more. We started with the core IDE and added pieces that evolve the IDE towards an agent-first future, such as browser use, asynchronous interaction patterns and an additional novel agent-first product form factor, helping you experience liftoff. Your new focus. So you like the name Antigravity. Why do you like that? I like the way it looks and I like the sort of vibe of the word. I think saying it out loud is tough.
I thought there was a very cool feature where it feels like they're bringing together a whole. For the last couple years it feels like Google's been, like, stuffing AI in little corners of the UI. Like, you already have Gmail and then you stuff a Gemini box there. You have Sheets and then you stuff a Gemini thing over here. This feels like the first one where they were, like, sort of able to start from scratch, and it still has, like, the sidebar panel, but it felt like it was both a code editor, but then it also kind of looked like a Google Doc in the sense that you could highlight sections and leave comments for the AI, which I thought was interesting. Yeah. I don't know. Easily guiding the agent's 90% solution all the way to 100%. Yeah. This part. Now, let's say the agent produces a landing page mockup with Nano Banana, and you now want to make some UI adjustments. You can give visual comments. Yeah. So you can actually, like, go in and comment in the image exactly where the problem is. And you can do that in the text as well. So you can, like, have this more precise dialogue with the agent like you would a human employee. Yeah. And you're going to love it. Say goodbye to what held you down before. Welcome to Google Antigravity. Very cool. So it is funny, remember when the Windsurf acquisition, whatever you want to call it, was announced and it was positioned, it's like, hey, the team is well funded and has a product used and loved by thousands of engineers and companies. And I remember talking about it, and we were saying, like, okay, the one issue is that some of the best people on your team are going to Google to compete directly with what you guys have been doing. Yeah. So fortunately, obviously, the whole Cognition deal ended up coming through, but you can imagine a world where Windsurf was still independent and just trying to. And then suddenly it's like, okay, now you're just competing head to head with your former partners.
How does that make sense? Right? Yeah. So anyways, it all worked out for the best. But I'll be interested to see. I'm super interested to see what kind of adoption this gets. Yeah, we have to test it out. We'll have to get the Tyler Cosgrove review. Is it publicly available? Yes. Let's get it. Get it. Yeah, let's do a review later this week and see how it compares to other IDEs. Anyway, we have our first guest of the show, Mike Knoop from ARC AGI, in the Restream waiting room. Welcome to the show, Mike. Thanks for waiting. Good morning. Morning. How are you doing? Hi. You know.