Differentiated Understanding

Nathan Lambert Reflects on China’s AI Labs: DeepSeek, Open Models, and the 'Race' with the U.S.

1 h 3 min · May 19, 2026

Description

Joining me today is Nathan Lambert [https://substack.com/profile/10472909-nathan-lambert], author of Interconnects AI and a post-training lead at the Allen Institute for AI. Nathan recently returned from a major tour of China’s leading AI labs, where he met with researchers and teams building some of the most impressive open models in the world. In this conversation, we discuss what Nathan saw on the ground: how Chinese AI labs differ from their U.S. counterparts, why open models have become such an important part of China’s AI strategy, and how labs like DeepSeek, Alibaba, ByteDance, Kimi, Z.ai, MiniMax, and others are navigating compute constraints, data access, and commercialization.

We also dig into some of the most debated questions in AI today: Are Chinese labs really 6-9 months behind U.S. frontier labs? How meaningful are distillation accusations? Can domestic chips like Huawei’s make up for restricted access to Nvidia GPUs? And is China’s AI ecosystem actually government-directed, or is the reality more fragmented and commercially driven?

Ultimately, this episode is a more nuanced look at China’s AI ecosystem, one that goes beyond simplistic narratives about subsidies, copying, or geopolitics and instead examines the technical, cultural, and economic forces shaping the future of open models.

Check out his two recent articles here:
* Notes from inside China’s AI labs [https://www.interconnects.ai/p/notes-from-inside-chinas-ai-labs]
* How open model ecosystems compound [https://www.interconnects.ai/p/how-open-model-ecosystems-compound]

To find the previous episodes of Differentiated Understanding, see here. [https://aiproem.substack.com/podcast]

Every episode, I bring in a guest with a unique point of view on a critical matter, phenomenon, or business trend, someone who can help us see things differently. Season two will host a series of guests from early-stage investing, as well as builders, researchers, founders, and product managers.
For more information on the podcast series, see here. [https://aiproem.substack.com/p/launch-of-differentiated-understanding]

Chapters
00:00 Insights from the China Trip
11:51 Cultural Differences in AI Research
18:15 The Role of DeepSeek in China’s AI Ecosystem
25:26 Overview of Major Chinese AI Labs
30:56 The Future of Open Source in AI
37:50 Market Dynamics and Consolidation in AI
42:28 Distillation and Model Convergence Controversies
51:58 The Gap in AI Performance: US vs China
61:09 Monetization Strategies in AI: A Comparative Analysis
62:32 Government Influence and Misconceptions in AI

Transcript (AI-generated for reference only)

Grace Shao (00:00) Nathan, thank you so much for joining us today. Yeah, really, really excited to finally hear your thoughts on your big China trip, on what’s happening between the Chinese AI labs and the U.S. AI labs, what you think the potential compute constraints might mean for these labs and their performance in the future, and obviously the open-source ecosystem. So before we get into all of that, could you... Nathan Lambert (00:02) Yeah, thanks for having me. Grace Shao (00:23) Briefly tell us about how you ended up actually working on post-training and open language models. Just a bit about yourself. Nathan Lambert (00:29) Yeah. So I actually started my PhD at Berkeley in 2017, not working on AI things. I was an electrical engineer by training in undergrad, which is funny looking back, because that’s the same year that the Transformer paper came out. And I was like, I think I should do this AI thing, and tried to get the famous advisors to mentor me. And they’re like, we can’t take you. So I had my PhD as this wandering path to become an AI researcher. And then I ended up at Hugging Face after that, which was, realistically, the only industry research job that I had, but also a very hot startup and very fun to learn kind of at the intersection of these tools that people use a lot for AI and research, which is what I was doing.
And then when ChatGPT hit, the kind of RLHF thing blew up as the hot word on the technical side of things. My PhD had ended up being in reinforcement learning, which is just the first half of reinforcement learning from human feedback. So it was kind of a natural pivot to be like, well, I might just do that. And Hugging Face was a good place for doing that, because the whole company is kind of all for that, which is like: figure out how to support the community on the hot thing and build platforms there. So they were very happy about that. And I helped build a team at Hugging Face. And then I was kind of burnt out on the remote-work time-zone thing and found out that the Allen Institute was doing such similar stuff. And I was like, wow, I have people that could be in-person friends and do similar things. I was like, quality of life — I need to do this. And a few years later, I ended up building a bunch of models. And I think being at a nonprofit opened me to this ecosystem vacuum of information, where there aren’t many people who can talk about what they’re doing. So then, with some luck and committing to write every week, I just feel like my influence filled the vacuum of nobody saying reasonable things. And it is this nice synergy between what I write about and what I work on in my day job, and it just kind of got bigger and bigger in a very fun way. I think that, generally, at the highest level, I’m motivated by wanting AI to go well on this trajectory. And I worry about a lot of near-term things, whether it’s social unrest in the U.S. and just kind of the massive hatred for AI — I think is a very big near-term problem — and then, medium term, concentration of power, because I think AI will be super powerful in ways that people don’t expect. So generally, open models are a nice way to curb both of them by being a bit more transparent to people, and it naturally is a hedge against concentration of power. 
There have been different reasons throughout that, but that’s kind of a recurring theme in my life in the last few years. Grace Shao (02:50) Definitely. I love your work because I think you help non-technical people like myself really understand what’s behind what’s happening in these labs a lot better. And then I actually just spoke to your former colleague, Tiejin Wang, and he was with APAC Hugging Face just last week. He was saying the same thing. Open source, in many ways, is kind of the best way to go forward as we know that this technology will not stop evolving, but it’s the best way to kind of put up guardrails and checks and balances for the monopolies. Okay, I don’t want to take up too much time on that side of things today because our focus really is about your China trip. Before we get into the weeds of all that, I want to hear about the trip itself. Most people who are writing about Chinese AI are getting their information secondhand. You really went there, you spent time with the researchers, you met with people who are building the models. Tell us about what you meant when you said you came back with great humility, right? Your eyes are a bit more open, whether it’s the good or the bad. Tell us about your trip. Nathan Lambert (03:50) I feel like I kind of went in — I mean, I had this horrible English phrase in my writing, which was like, “I knew I knew nothing about China,” which kind of tried to indicate that I knew going into the trip that I knew nothing. And it was still the fact in my current writing. This is a horribly written sentence that I had in there. And I only talk about it because somebody called me out on it. It’s like, what is this? And it’s like, leaving, which is knowing that it’s such a big country, there are just such vast amounts of talent working on these problems, and how unpredictable it is as a human to model people with very different worldviews and upbringings and training systems. 
Realistically, the way that people are trained in China is very different. And I just think that even being there, you can’t fully grasp: what are the pockets of three to six researchers doing that is actually a bit different than in the West, even if they’re working on the same goal? I think you could get down to that level of granularity and a sociological study and actually see differences in what they’re working on, and that’ll always change the output. I didn’t get to that level of granularity, but it’s just to start having real experiences and understanding how people explain how they work on these problems. And for me, realistically, a lot of it is coalition building, which is just like: I want there to not be vitriol at the level of the technical companies doing things in international bodies. So just meeting all the labs on both sides is really nice, because you need to do that for them to talk to you about more sensitive issues in the future. I got some criticism on the piece, which is like, this is how you shouldn’t visit China. And it’s like, well, what are you going to do if you’re going on an official visit to a bunch of companies? How do you expect to get in the door without being nice? You have to start somewhere, and I think it’s important to be respectful. Grace Shao (05:31) I think the piece was, frankly — I don’t think the criticism was fair, to be honest, because I think you were really transparent with the fact that you’re not a China person, right? It’s not like you’re going there and exoticizing everything. And if anything, a lot of people, even with China backgrounds, like to use certain dragons and tigers to describe things. I feel like you actually were really humble going and being like, I’m just a technical dude meeting with these labs, talking about their technical research, right? And then because you were physically there, you had observations of the culture and the people. So yeah, I actually thought your piece was quite good. 
And yeah, sorry. Nathan Lambert (06:05) I agree. I was willing to let that sail past, but I think it’s important for people who listen to realize how actively these companies are trying to court Western audiences, which is why we could get in the door. I mean, we had some prominent people on this trip, but that’s why we got all of them in the days that we wanted them, except for DeepSeek. So essentially some, like Catherine Rintel, who works with me at Interconnects, and some other creators... Grace Shao (06:23) How did you get everyone? Yeah, how did you get everyone? Nathan Lambert (06:29) He used to live in China and has connections in China. So he kind of orchestrated the mix of his connections and leveraging my connections to labs. We had some bigger names on the trip as well. Just stringing all of these together to get all the various labs in place is a few months of networking to make sure the trip lines up with people with established networks and contacts with the various labs. But these people want to look good to Western audiences, so they’re only going to say yes to the right researchers. And the researchers know that there are two to four comms/ops people in the room, hanging out, making sure that it goes well. Especially the bigger the company, the more comms people. You go to Alibaba and there are three to five various people, from the head of comms to some special offices. You’re not going to get these people in the office, or at all, without accepting the cost of these types of handlers. It’s the same thing in the U.S. You’re not going to just plop a senior executive into a chair. So it’s also good because now I have the WeChats of a bunch of researchers from China that I could just text about things. It’s like, hey, congrats on the new model release. It’s like Lay Lee works at Xiaomi, Xiaomi MiMo. It’s like, talk to this guy for an hour at a mall — I don’t remember the name of the tea store — but it’s like... Grace Shao (07:27) No, of course. 
No, of course. Nathan Lambert (07:49) Now we have these relationships, which is very useful, and that helps information spread across the ecosystem to these trusted parties, which doesn’t really exist. There are not that many, I think. And the opposite direction of the trip is very hard because Chinese researchers can’t really enter the U.S.; the visa purgatory is too complicated. A lot of us on the trip were either Canadian or entered on a transit-without-visa entry, which makes it very easy for American technical talent to go to China right now, which is why I think there are so many trips. I think there’ll be more of them. We’ve got a lot of inbound from VCs and open-source labs in the U.S. that want to establish collaborations with these various labs because they’re the best open-weight models, and they want to build a stack for companies in the U.S. building open-weight models. So I think there are going to be more prominent, but not gigantic, U.S. startups going to try to build these relationships, which I think is a really interesting technological development because we’ve never seen this type of professional work trip in China from U.S. tech companies. Most tech companies have a “bring a device to China, it auto-bricks itself, and you have to hand it into IT.” So to actually proactively send people in a professional capacity is a really big change. There are a lot of angles you could take this, and I think it’s cool to see how it unfolds. This isn’t even really about the trip. This is the follow-on that we’re hearing from people that are like, hey, how’d you do this? We want to do this trip. Grace Shao (09:06) Yeah, definitely. Actually, from my end, I hear about VCs or investors always being quite active going to China because previously American funds were very, very active during the internet era. People were kind of always trying to find a way to either get into these good deals or potentially keep their pulse on it. 
But I think it’s really, really positive for the whole AI ecosystem to have this kind of fair, transparent exchange in some capacity. But to your point, there’s no way that star researchers can come out and talk to you off the record without any compliance, because that doesn’t happen in the U.S. either. That’s just companies protecting themselves. I just think your trip was quite meaningful, and I want to bring it back to your observations. You talked a lot about the cultural aspects of it. You talked about how you felt like in China there was less of this star-researcher celebrity status around people. People were more humble, or there was more humility. It was very focused on execution. You argue that Chinese labs are particularly well suited to the current LM-building game because they’re very focused on meticulous stack-level work. And there’s less ego sometimes to work on the dirty work, or the non-sexy work. So kind of unpack that for us. Why do you think that is? You kind of touched on it — you said they were brought up differently, they were taught differently — but what’s so different? Nathan Lambert (10:27) So essentially, an interesting part that synergizes on this trip is that we stopped by some academic institutions. I think it was like AIR and Tsinghua and stuff. And you hear all of these academic leaders talk about how they’re pushing hard to try to change it. So yes, they know China is producing more papers than anyone else, but they still think that it’s not as transformative of research. And they think that they’re trying to cultivate the academic domestic ecosystem to change just the type of work it works on, and the distribution, and take more risk. And then you would talk to some industry leaders off the record behind closed doors, and you would hear things like, it’s never going to change because the education system is so structured. 
There are so many layers of the funnel that reward things like memorization and stuff that they’re just like, this research culture is not going to emerge. And then the follow-on with the AI labs is that these labs are doing fast-following. They kind of have a proof of concept, and they know what it needs to look like. Therefore, in that domain, you’re not trying to invent the new paradigm. You’re not trying to make the model that is o1 or o3, or the first model to work in Claude Code. You’re like, I see it, and I’m going to try to do that and make it the best thing. And I’m going to try to make it cheaper and just maximize that goal. A lot of companies don’t need to invent the new paradigm. OpenAI has done this so many times. That’s their bread and butter: never doubt OpenAI’s ability to release a blog post and a plot that changes how people think about AI. I still think it’s going to happen a few times in this massive boom over the next four years. OpenAI just kind of has that sense of what is the thing that you can push on a bit earlier and just transform things. But I don’t expect — and other people wouldn’t expect — the Chinese companies to do that as much, because it’s just such a culture of, I guess, building. I don’t know how to describe the positive version of this. Maybe it’s slightly more practical-minded, in terms of: it’s your job to build this thing. A lot of the researchers, maybe because they knew their managers — some of them had managers in the room — see their role in the company as being to make the models excellent. And especially for students, I work with students and that’s what they say. I work at the Allen Institute and we have students that will co-lead our language models. It’s not that surprising, because if you do an industry research job in the U.S., a lot of mentors will tell you that you’re kind of free of the burden of bureaucracy and politics. 
So the naivety of students, and the simplifying, is actually so good at just getting a lot of technical work done. There’s also the life-stage side. If you’re younger, you don’t have as much family, and you normally haven’t built up as many habits and other things you do with your life. Language models are so complex, and the amount of context that you need to absorb to understand what the bottleneck is — there’s so much information, and you have to be able to pick what the bottleneck is and break it. If you just don’t have the mental space to absorb all the context, you kind of end up doing things that are cute but don’t make breakthroughs on the model. So that’s kind of a difference that I’ve seen in people who were both very successful academically before language models. Some of them are able to pivot to this practical mind, which is: what is the state of the system? How do I improve it? And then some try to make kind of these abstract frames of what’s happening and approach it like an academic, and it normally doesn’t improve the model as much. So I just kind of see, if the academic system is a bit more practical-minded, a bit more structured, and the work you’re doing is structured in the language model — make this kernel implementation faster, make this idea work — then maybe it can be... I think it’s an oversimplification. I push on that a bit in the piece just to really contrast what you could think a U.S. lab would look like. And I have a few anecdotes. I’ve heard a U.S. lab paying off a researcher to be quiet about their thing not being in the model. All of these one-off things are more storytelling devices than anything, because most one-off things don’t matter at all. But also Llama 4 imploded, and that was because it was described as a Game-of-Thrones political-style environment, with all the VPs vying for influence and showing that their thing made the benchmarks go up. It kind of fell. Many, many people will tell you that. 
And we’ve had the Qwen turnover, but it doesn’t seem like it was quite the same type of thing as Llama 4 or xAI. xAI barely exists now. There have been some dramatic things in the U.S. with how these companies have kind of come and gone out of the fold. Grace Shao (14:55) Yeah, I kind of agree with you, but also I would push back on that. I think there’s obviously a more rigid and competitive academic system, which by default in East Asia results in a culture of students following the bureaucracy and authority a bit more. So I agree with you in the sense that they’re very pragmatic. They focus on the task that is given to them. However, I wonder if things will change with how AI will disrupt education. That’s number one. But also, a lot of the young researchers that you’re working with today seem quite different. At least a lot of the entrepreneurs I meet today are born in the ‘80s and ‘90s, some even younger and born in the 2000s. And I think there’s a kind of aura or confidence coming from them. If anything, you want to say they’re a bit more individualistic-minded. You went to Shanghai, right? They are dressed very, very uniquely. They have these outrageous outfits on the streets. People are seeking individual ways to showcase their personality. So I wonder if that will shift. But for sure, for the academic institutions like the Tsinghua and the Beida of the world, they are still very old-school. But I would say that is the same maybe in some academic institutions in the West still. Okay, I think on this topic we can go off on a tangent on academics, but let’s go back to China’s ecosystem. When DeepSeek V4 came out, we talked about it offline, the two of us, quickly about a piece I wrote saying how DeepSeek is starting to look a bit more like a base layer for China. And if anything, some of the labs kind of admitted to that. They’re like, we have very limited resources. And to your point earlier... 
Nathan Lambert (16:11) Yeah, you could take that in so many tangents. Grace Shao (16:34) Limited people — these labs are tiny. They’re run by 100 to 200 people max. Limited capital, obviously limited compute. They have constraints all around. And in that sense, in a way, the ecosystem’s looking less like a zero-sum game and more like different players optimizing their own strengths. So correct me if I’m wrong, but DeepSeek is providing a base layer where a lot of labs will quickly follow and basically adopt a lot of their engineering breakthroughs. And then Zhipu, Z.ai, will focus on the coding; MiniMax focusing on the multimodality, et cetera. There are a lot of these different players. ByteDance, obviously, very, very focused on their video models. And Qwen, like you mentioned, had the whole open-source saga break apart with Lin Junyang leaving. But in general, they’re still kind of the leader in hyperscalers on that front. So everyone’s doing their own thing almost, instead of really... Nathan Lambert (17:27) I agree with the people specializing, which I think is normal business evolution. You figure out a bit where you’re good at. And there’s so much opportunity that they are like, okay, I’ll follow this because they see that they’re good at it. I just am more skeptical of DeepSeek as a base because I have no idea what DeepSeek is doing. And some of the labs when we were there, because DeepSeek V4 had just come out, were like, yeah, we look at the things they’re doing, but they seem more intricate than needed. And if you read the paper, there’s just so much going on in this model. As a researcher, I’m like, some of it seems a little fake or a little dependent on their setup and not necessarily going to work in every model. Grace Shao (18:04) What does that mean? Break it down for me. 
Nathan Lambert (18:18) Essentially, I will say that building an LLM is dependent on where you have your GPUs, your pre-training dataset, your intended deployment setup, and stuff like this. So you make decisions based on your constraints, and you build the model. DeepSeek has these constraints and they end up with their model, but Moonshot and Zhipu have different constraints, maybe more flexibility, and they ended up building a different model. They will test the DeepSeek innovations. So they’ll say things like, X innovation doesn’t improve our model. These two organizations are on different development paths that have core similarities, like these large mixture-of-experts models and the general methods are similar, but a lot of the parts end up being a bit different. That’s why I’m like, I don’t know exactly. If DeepSeek was a base, you would see the Chinese labs just do post-training. We just take the base model that’s out there and we adapt it to our domain of specialty. And we have users that do that, which is something that I think about a lot. I’m thinking about starting a post-training lab and how to format post-training research better. So I think about this a lot. I think about what a shared base actually would be. They go through — some of these labs put an extreme cost on creating their base model. And if they didn’t need to do that, they wouldn’t. One of the labs told us how long their pre-training run was, and my jaw dropped. I was like, that’s way too long. Any U.S. advisor would be like, you’re taking way too much risk on this pre-training run. If they didn’t land that pre-training run from one of these past big MoEs at a Chinese lab, I don’t know if the company’s dead, but that’s a huge amount of time. Most U.S. companies now know that you don’t want your big pre-training run to be more than a few months because it’s just so much risk and time to put all your eggs in that basket. 
That’s a sign that, in that case, they don’t have as big of a peak-size cluster. Essentially, pre-training time can come down a lot when you have a bigger overall cluster; you can just get more throughput on it. But if your biggest cluster is smaller, it’s harder to get a certain amount of throughput, so you use that one for longer. That’s a compute constraint. To loop it back, I think the specialization is real, but I’m more like, I have no idea what DeepSeek is doing. I know they’re raising money now. I don’t know what the plan is there. They seem the most without a specialty in the Chinese ecosystem. Grace Shao (19:59) Dependency on. Yeah. Mm-hmm. No one knows, though. No one knows. They’re secretive. But that’s my point, right? I feel like they’ve been kind of nationalized, whether willingly or not, because they’re taking the Chinese government’s money. They’ve kind of gone secretive. And it’s not like there’s a secret that they prefer Chinese-educated researchers. They’re keeping a very domestic stack, from talent to capital to the whole stack. So to me, it seems like they’re being Huawei’d, in some ways, because they did well and they got their name globally, and then by default they’re becoming the next Huawei, willingly or not. Nathan Lambert (21:01) I don’t think nationalization makes you a base for the other companies, at least not at this stage. There could be something, but it’s hard to force. Grace Shao (21:06) But then you have some incentive, right? But then it is some incentive. You’re like, well, if you can propel one of the teams and propel the whole industry as a whole, it could be in your KPI or some kind of unspoken expectation. Nathan Lambert (21:17) The coordination problem is so hard. Essentially, both in the U.S. and China, even the open labs, what they do is they fork open-source code and match it to their internals, and every company does this. 
Therefore, all the improvements that could potentially be going to the open code and forming this base that is far more efficient — they’re not completing the feedback loop. I think China could be closer to it. If people really lean into DeepSeek as a standard architecture and DeepSeek shared their training code and all the specifics and how to do this, from a Chinese economic perspective, that would be a huge win because you’re just saving compute. But I think it’s too decentralized and too competitive to have that happen. It wouldn’t happen in the U.S. either. Grace Shao (22:04) It’s so cutthroat. Yeah. Nathan Lambert (22:08) Even though I think for open models to be closer to the frontier, it would be better. I talk about open models in the U.S. needing a consortium. But there’s definitely enough money to make a consortium in the U.S.; then you fail because the model won’t be good because you’re feeding too many asks into the model. That’s the only way to create a shared base. Grace Shao (22:25) Interesting. So it’s not really just commercial. Yeah. It’s not the commercial reason. Okay. So if you had to give a high-level commentary on each of the major labs, what would it be? If you look at ByteDance, Alibaba, Tencent Hunyuan, if they’re relevant, DeepSeek, Moonshot, Zhipu, MiniMax, Meituan, Xiaomi now being part of the ecosystem too. Nathan Lambert (22:46) You might have to prompt it or say more, but I could just kind of ramble through them, which is kind of fun. Alibaba: cloud-focused, understands that open models can enable more usage of platform. So I would say Alibaba is very, very cloud-focused. ByteDance: mostly characterized by everybody else being intimidated by them, and very user-focused, including multimodal. Kimi: vibes of the office were great. It would be one of the best startup vibes that you would visit among U.S. or China. 
Zhipu: very AGI-pilled, surprisingly cautiously excited about being entity-listed, even though they have no idea why they are, because they’re like, it stamps them as a big deal. And then there’s some... Grace Shao (23:27) I think they previously worked with SOEs. That’s the main reason. Or they still do, but that was one of their main sources of income. And unfortunately, because a lot of these labs spun out of Tsinghua, and Tsinghua is, for people’s context, in Beijing. It’s really close to the government, obviously. But the thing is, when it’s close to the government, it could mean there are three layers of agency underneath the actual government apparatus. But then people like to link it to the fact that it’s taking government money, so therefore they are suspicious. It’s very unfortunate, I think. A lot of companies get thrown into that category. Even companies like Lenovo and a few other Chinese companies have previously been called out by U.S. senators saying, they’re taking Chinese government money, but really it’s that their scientists or their research labs spun out of a certain government-affiliated or government-funded academic institution. That’s what it is. Anyway, yes, go on. Nathan Lambert (24:23) Yeah. Some more would be: Xiaomi — surprisingly great research vibes for a new team at a random company. They seem to be crushing it. Grace Shao (24:31) What do you think of Luo Fuli? The star researcher. Nathan Lambert (24:31) I didn’t get to meet her. I think she’s as close as they have to a star researcher right now. There’s the tier of star CEO, which there are obviously others — Dario and Sam, the analogies are there — but the star researchers, like the Sholtos of the world in the U.S., obviously you can come up with many more. She’s the closest you have to this. I need to watch more interviews. We’ll see. But she wasn’t in our meeting. But they just seem to be doing the right thing. They’re making general models. 
They don’t really have specialization yet. Florian, the person who helps me write about open models on Interconnects, and I took a detour to go see Meituan because we’re like, why is Meituan building these models? And they’re very practical about it. It was a less glamorous visit at a normal tech office. It wasn’t an official visit for them. They were like, yeah, we’re a major online platform. We obviously are going to use LLMs everywhere, so we need to build our own LLM and specialize it to our products, which, surprise, is very practical-minded. I’m guessing there are many more companies in China like this. Grace Shao (25:39) That’s what Tencent’s saying too. It’s because they want to serve their existing consumers and optimize their LLMs for their own distribution and their own basic interface or activity loop. Nathan Lambert (25:52) Yeah. After I left, some people in the group went to Xiaohongshu, like RedNote, and they’re there. They’ve released some language models that are multimodal. They’re like multimodal data-processing things. So a lot of them are not that surprising. The startups just have different cultures. I have met some MiniMax people before, so I left the trip early before MiniMax on this one. But MiniMax was quirky. They have a ton of women in their company, which was very fun. And they have products. They’re maybe slightly more product-focused, but I feel like the quirkiness of the company kind of matches maybe Western confusion over what their products are doing and what they’re trying to do. But it kind of matches their language models that are a bit more efficient. Grace Shao (26:35) Well, they came out with a lot of very consumer-focused applications, right? They had Hailuo and Talkie, all these character companion-bot products before. Nathan Lambert (26:45) Yeah.
And then the last one I went to was Ant Ling, which is also very corporate, but in a less intense way, because I think they see it as serving their own products, whereas Alibaba Cloud is like, this is the gold mine we have to win. It’s a much bigger deal for them than Ant Group. But a lot of these things, when you list them — I don’t know, eight to 10 companies — they’re all pretty reasonable with respect to the age of the company and what the company does best. There’s not as much confusion. Grace Shao (27:14) Yeah. And Ant is low-key best at medical chatbots right now, which I guess makes sense because everyone has access to Alipay. And then for seniors, apart from WeChat, it might be the only application they’re using on a regular basis. So it became the default medical consultation app, which is really random, but it’s their niche now. Yeah, I think you’re pretty spot-on. It’s pretty cool that you got those takeaways, even just meeting with them for a couple hours. Nathan Lambert (27:41) I have been reading about them for so long, so a lot of these priors are easy to confirm when they kind of fit with things you have seen. The Chinese showroom culture is so interesting, and also one of the most surprising things to have at software companies. It’s so funny. They’re definitely appealing to Western audiences. Z.ai had poorly translated merch. What was it? Something so — it would be borderline inappropriate translation in the U.S. It was like “ship big, go hard,” or something. Just some really weird translations. And they have live API statistics in their showroom. So Z.ai was like, we’re serving 5.5 trillion tokens a day. All the U.S. companies are so closely watched for when they announce token statistics. I know at least one of these numbers is wrong. It’s something like Fireworks does either 30 or 300 trillion tokens a day — or I meant Together for that one — and then one of Fireworks or Together, and the other one, are like 100 trillion tokens a day. 
Don’t take these as sourced; go look them up. There were some public announcements recently, but those were the first updates that anyone has on major infra companies in the U.S. Inference is a huge market. You don’t hear anything from Fireworks because they’re just struggling to keep up with demand and they’re making bank, because inference is a much better thing to sell than bare metal. Essentially, inference is selling the software implementation to serve tokens more efficiently, and you can just get more margin when you improve the stack for a fixed model. So a model comes out and you host it, and then you can make your stack more and more efficient on that model. You just get more margin and hopefully growing usage. That’s way different than GPUs, where the best case is that you lock in a huge commitment for a long term. Just being able to walk into an office and learn about their API is interesting because they also had geographic distribution, which was like: China was, I don’t know, two-thirds; U.S.A., 20%; and then the rest was Singapore, Korea, Japan on the Z.ai API. So that’s cool. This is s**t that I always want to know about the companies, and I have no idea. One of the things I always want to know is: how are open models being used outside of the U.S. and China, and has this decades-long process of technological diffusion started to kick in in a way that any company can measure? I don’t think anyone has good data on it yet, but I think it’s obvious that at some point, open models that are cheap to run are going to have some interesting playbook across the globe for the long tail of countries. Maybe I’ll just walk into the front door of a Chinese open-weight company and get my answer. Grace Shao (30:30) But actually, I think the culture of these labs — a lot of them, because they’re run by really young, passionate people — you would feel like they’re a lot less commercialized or less corporate, or at least less sleek. 
They’re not sophisticated with, you can say, the capital-market side of things, but you can also say that they’re just really naive and open-minded and passionate about the product they’re working on, with less of a corporate guardrail built around them. Nathan Lambert (30:56) Yeah. Grace Shao (30:57) Okay, I want to talk about... Nathan Lambert (30:57) Yeah, go ahead. It’s like one of the people at Z.ai who’s known on X — I don’t know, 9,000 followers — it’s like Lu. She came up and was like, hi, I’m a student, I’m 20. I’m Lu from X. And I was like, that’s hilarious. There was a lot of s**t like that. It was like, oh, okay. I don’t want to call her a kid, but it’s like... Grace Shao (31:06) Yeah, yeah, yeah. And I think the one that runs Moonshot’s developer ecosystem or something is literally a girl fresh out of school, right? And she just posts hilarious memes all day long. There’s no filter on her social media. It’s funny. Okay, we go on these tangents, Nathan. We need to come back on track. Open source, open weight. Why? Why do you think Chinese labs are adopting it or embracing it, however you want to put it, especially after visiting them? Is it because they simply have to, because of what we talked about — they are leaning on each other because of all the constraints they have? Or do you think the philosophical drive is actually bigger in that ecosystem? Or is this a bigger strategic thinking for diffusion in the long run? Nathan Lambert (31:54) I actually don’t feel like it’s that special ideologically. I think it’s easy to say the ideological line when you are doing it. Now you can look at Zuckerberg: he said the ideological line when he was doing it, and then he stopped. I think it’s mostly just that, for one, distributing within the U.S. ecosystem, especially to enterprises, is the highest-value market, and they can’t sign many enterprise deals. And the closest best thing is things like Cursor adopting Kimi’s model. 
Even if Kimi doesn’t get paid for that, they’re happy. That’s the biggest sign of credibility for them, and they can figure it out in selling tokens or whatever in the future. Practically speaking, one, the only way to influence the U.S. market is by releasing these models. And two, it seems like they don’t feel like they’re losing as much if they release and share things. If the model was closed, they just think they would get less influence, they would be seen less, fewer people would use the model, their actual paid offerings would be adopted less. It just seems almost overwhelmingly obvious, because there are all these benefits and not as obvious of a drawback. There will always be better models, and just keep going. But I think every scientist loves... Grace Shao (33:08) Then why are so many U.S. labs against it, or not willing to? Nathan Lambert (33:12) Because they can make as much money without it. Anthropic and OpenAI make more money by not releasing them. They can just make so much money, so why bother thinking about an open model that doesn’t make money? There are different scales of influence. Same with Google. Google’s making so much money. I think Meta will make a lot of money by having good AI models in their products, if they get their act together. Even Google could release more models. They have so many surfaces other than Gemini that need AI to be commoditized and used, like the cloud and all of this. Meta could release the models. It’s just not worth the effort for some of them. They’re like, we need to do this high revenue target; it’s too much of a pain to go through legal and make it ready to release. Why bother? I don’t know, maybe it’s a little bit of a cynical take, but I think Microsoft and Meta could release their best models openly because they benefit if it’s a commodity layer. But I don’t expect them to, because it’s just kind of like the benefits of focus are so high, and they just kind of see it as something they don’t have to do. 
Grace Shao (33:56) And it’ll be good for them. But then eventually, we will see some consolidation in the market as well, because you can’t really have 10 labs in each dominant country all continue to exist. Nathan Lambert (34:26) I do expect consolidation. I think this is potentially a subtle cultural point, which is that the U.S. labs are more likely to buy into “we’re special, we need to go fast, keep it closed,” and the Chinese labs are not. There could be something there. That’s also who the decisions funnel up to. I don’t know. I talked to the Alibaba people that make these decisions. I can’t say all the things that they say about them. Some of these were two-on-one and off the record, so I can’t say all these things. But at all the other labs, there is a person that makes the call, I’m guessing. I think those are senior leadership that we’re not talking to. So it’s kind of hard to know exactly what they really think. I definitely expect consolidation. My thing is that I expected it in China faster because the capital markets aren’t as strong as in the U.S., but I don’t have a model for that. I think you can model it, which is: what do you think the revenue growth would be? What do they need to raise to keep training bigger models? What is the compute cost? Then you look at the potential raises and think about which country would not be able to make that raise first. But also, it’s this wild thing with OpenAI raising $120 billion. Are you kidding me? What is that? Grace Shao (35:47) Yeah, the valuations in the U.S. are not really understandable by anyone else right now. I think in China — so on your point on that, I’ve been writing about this and I think it would make sense for Tencent just to buy out one of the labs. They have the money, they need the capabilities, and frankly, they’ve really been struggling to compete with their LLMs, with all the labs we talked about just now. So my... Nathan Lambert (36:05) Their licenses are so bad. 
They release all these models that have horrible licenses. They’re not that good, and the licenses are just horrible. Grace Shao (36:13) So I feel like it financially makes sense for a company like that to optimize and just buy out a lab. Then the labs can also lean on their distribution, because at the end of the day, how are they going to win consumer mindshare or distribution in China right now when it’s really just dominated by Alibaba, ByteDance, and Tencent? That’s my spiel. But when I spoke to some of the researchers... Nathan Lambert (36:33) I think big companies have a lot of inertia, and the senior leadership has the call, and they can have inertia. I still think Apple just ends up buying some lab for $25 to $50 billion. It’s not the worst thing. Just golden-handcuff the researchers. Some will still quit. Grace Shao (36:43) Yeah. But I think right now they don’t want to. The labs still have a dream. Some of the researchers still have a dream. So when I spoke to a lot of them, they’re like, no, we don’t want to do that. We want to commit to our own frontier research. If I wanted to join one of the big tech companies, I could have. So why would I want to sell? That’s what the researchers think. But to your point, we don’t know what actually the one person or two people at the very top think, especially if they continue to have hurdles with compute access and capital access, which brings me to the question. Nathan Lambert (37:14) It also depends on your view of inference. You can ask your next question. I don’t need to cut you off. It depends on your view of inference. If these agents are just so much inference, I do think it’s going to be an oligopoly-style market, not a monopoly-style market. And what’s the difference financially between two and four or five big companies with great models? Is that actually not sustainable if there’s so much demand? 
There are a lot of cases where we have two or three, like the cloud, but what’s stopping that from being four? Grace Shao (37:40) I think they will be the infrastructure providers. Yeah, yeah. And they would kind of lean into each of their existing ecosystems or distribution, whatever you want to call it, and serve certain specific models for specific uses. So enterprises can choose what matches their needs the best as well. I do want to bring the conversation to a more contentious topic, which is on distillation and model convergence. You raise the question of whether Chinese models are structurally different. Often we are hearing claims saying a lot of these labs are about three to six months or six to nine months behind U.S. labs. There’s obviously a lot of noise or allegations and accusations from certain U.S. labs saying Chinese labs are distilling them. How do you actually see that accusation or that kind of dynamic? Nathan Lambert (38:36) The biggest unknown that I don’t have an answer to, which actually has a lot of sway, is how much of the Chinese companies are actively trying to hack APIs versus just showing up as a customer and paying. If you’re trying to hack the APIs, normally you get reasoning traces out so that you can create a reasoning foundation that would be similar to the model that you’re trying to do this from. That’s very different than the API standard form, which is just the output of the model, which is a less direct process for learning from. I don’t know the magnitudes. If it’s more just like, I walk up to an Anthropic API and I use it as intended, but I’m making a competitive model, I’m not very sympathetic to Anthropic. They could ban it if they want to. And I think the impacts are kind of a standard practice. You can do it with many different models and so on. The evidence Anthropic provided is not large enough scale where I’m like, this is industry IP theft at mass scale going on 24/7/365. 
So there’s definitely some gray area to what is actually happening in distillation. That’s why, on the policy side, I try to push people to not call all of it the same thing. Essentially, using any API endpoint to make synthetic data to train your model is some form of distillation, but it’s very different if you’re trying to break this model so that it gives us a different behavior that is hyper-useful for training and not get caught. Those are pretty different actions, and they’re all looped into this common phrase of “distillation” right now. That’s my biggest problem, which is that academic researchers and small companies use distillation extensively as the core of their business and the core of research methods. So if the U.S. government nukes that as a thing that could be done in the AI ecosystem, it’s mostly bad for small players, bad for U.S.-China tensions, and bad for academics. That’s my primary concern. And then trying to get the labs to actually say more. There’s a distillation side and then performance is the other side, which on benchmarks, it does seem like the Chinese labs tend to be six to nine months behind. When it comes to general use, I’ve always found the closed models to be better in ways that are hard to measure. So I go very back and forth on whether the closed models are better. I think we will especially see Anthropic and OpenAI pull ahead on knowledge-work tasks like legal, healthcare, financial services, because I just don’t see the Chinese labs paying for that data. All that data is going to be people that charge hundreds of dollars an hour to annotate and create these environments. So it’s a whole new capital build-out that goes on there right now. It’s going to be billions of dollars if you’re going to buy a billion dollars of data and a billion dollars of compute and a billion dollars of talent to train your model. Grace Shao (41:30) They don’t have the money. Nathan Lambert (41:30) I don’t think they have that. 
Mercor has some of these evals, and I think there is a bigger gap there. So it’s very interesting. Florian, the guy that helps me, and I disagree on it. It’s this fine line between, yes, the evals — coding and lots of these things, and even random evals that surely the Chinese labs aren’t training on — the open models really are genuinely crazy impressive scores. So I think there’s also a tester’s bias, where I don’t use the open models as much. Maybe it’s hard to ground in my head what I was doing with AI six to nine months ago. I wasn’t even using Claude Code as extensively. I guess the question is, at the end of this year, can I use an open model in something like Claude Code and feel like it works at all? That’s the test on the performance gap, starting in June, June to August, and whether or not that hits. I don’t think the open models have hit that yet. I think it would be way more of a narrative if all the companies spending billions of dollars on Claude are like, oh, we can spend 1% and just use DeepSeek. These CIOs and all the big companies — some companies spend more on tokens for their employees than on headcount. These are normally startups. But they would happily reduce that token cost to 1% expenditure if it really was that similar, because then you could just use 10x the tokens. I don’t expect that to happen. And I expect things like the latest Claude and GPT-5.5. I expect more of these things through the year, and we’ll see if I end up being right. Both are right at the middle of us, as a world, getting more clarity on them. They’re like 18-month-long stories unfolding, and I feel like we’re just in the middle of performance gap and distillation and learning more. Grace Shao (43:25) Yeah, it’s interesting. You mentioned — it helped me recall a conversation I had with other people as well. The point on distillation is that I just had a conversation with your former colleague at Hugging Face, who leads APAC, called Tiejin Wang. 
He was just saying, look, the distillation accusations don’t really make sense because we’re all distilling off of each other as we speak. I’m learning from you; you learn from me. We’re distilling. It’s so vague of a terminology to just use that to accuse all these various behaviors. So to your point, I think people in the technical world who understand what’s happening actually want more clarity on what is the gray area, what is actually black and white, and what is not appropriate or unethical. That needs, I think, the industry to come together to really put guardrails and rules around. Now, number two on the compute side and the data side. Something anecdotally will be interesting to you is that when I spoke to one of the lab researchers in Beijing, I think in February around Chinese New Year, they were saying, look, they want to get better data, but they can’t because usually a lot of American labs would pay tens of millions, if not even more, like a hundred million dollars, for a set of very obscure or niche datasets, but they would have an exclusivity contract. What the Chinese labs will do is that they will literally wait out the exclusivity contract and then, say two or three months later, pay for it at one-tenth or one-twentieth of the price for that same dataset. So then once they start post-training on that dataset, that’s where the three to six months or six to nine months come in as well. Yeah. On that note, I want to... Nathan Lambert (45:00) Yeah. I think the data industry in the U.S. has two things. One, the lab asks the data vendor, we need this specific type of data. And the data vendor is a network that connects the people to the lab. The other thing is the data vendors know evals that are important, so they try to create good data for hill-climbing on specific evals. That data could be sold to multiple people, but is less expensive because they make it once and expect to eat margin or take margin on it. 
There could be a pipeline where once OpenAI is at the cutting edge, creates this new thing, they create deep research, then the data industry is like, let’s make things that are a little bit cheaper to sell. So there is time lag in these various things. But I heard the same thing on the ground, where they have a negative view of the data industry. It’s like, quality is bad, we don’t really have access, we do some in-house. That’s a very big difference from today, which is that you have the data companies in the U.S., which is insane. Grace Shao (45:53) Yeah, the American data companies are so mature. It’s its own sophisticated ecosystem. Before we get into data, I actually want to ask you this question. I think recently a lot of the narrative is now saying, look, Anthropic and OpenAI have kind of proven that pre-training scaling laws continue to hold, especially with the recent models. There’s an obvious compute constraint on the China side that we talked about. And then it will likely be even more amplified with the absence of Blackwells in the coming months. So as we move forward in this race, per se, if you have to put it in China versus U.S. in that sense, will we see a wider gap between the performance and benchmarks between the Chinese labs and U.S. labs? As in, will we see the gap going to 12 months, 24 months, as Chinese labs are very, very constrained on compute for pre-training breakthroughs? Nathan Lambert (46:40) I think it’s more of pre-training as a thing that you could actually finish. How big can you pre-train a model that you can finish and serve? The Chinese labs could train models that look like GPT-4.5, which is this giant model, but you can’t serve it. They end up training a model that is 2.5 trillion parameters and they release it, and no one can use it. They could barely serve it on their API because they don’t have Blackwell NVL72 racks or something — these racks that are definitely what are serving these large MoE models. 
They just don’t have the quantity of these. So there’s a difference between models that you can build and models that are actually useful. I think some of the Chinese labs are definitely like, we don’t need to release the gigantic models because nobody is going to use them in open weight. The biggest models end up getting served via API. So there might be some segmentation in that market. But I do think the inference and amount of economic resources that you have to serve your customers is becoming a thing that dictates what models are built. That’s why I think the gap will continue to rise. All signs point to GPT-5.5 being a bigger model, and I don’t expect that to stop. And then the economics of it is just the basics of: you need a certain volume to have the margin to support the research, because you can’t keep raising these ridiculous rounds forever. I think OpenAI, Anthropic, and Google are the only people with that AI usage volume to keep marching down the scaling laws to another 10x of training compute, which is mind-boggling amounts of investment in a model. That’s why, when the economic markets slow for fundraising, the model gap between these big three will just show a lot more. That’s the distilled way to say my prediction of when things will look different. It’s like these labs can’t fundraise, they go public, they can’t generate revenue more on their paid services, and then it’s just: look at how much training compute can be allocated or can’t be allocated. Grace Shao (48:41) Yeah. Basically, we’ll see a bigger gap, I think, in the coming months. Then what can make up for that? Domestic chips, or, like you said, better data. And why is it that sometimes people assume China has a very strong data ecosystem or data products, but actually the data vendor ecosystem is very weak in China? Nathan Lambert (48:41) So generally, I think I agree with what you said. 
I don’t know on the data side, but the way domestic chips could help is if Huawei chips are fine for inference and if they have sufficient volume to support the inference economics, which then trickles back into revenue. My read is that they just don’t have the volume of chips, especially spread out across the number of companies that they have. Essentially, the total FLOPs of everything Huawei produces, going to all these different places, is just not big enough. It could be that companies like ByteDance and Alibaba, with offshore data centers, can keep up a lot longer because they have access to Nvidia compute and have for a long time through this kind of offshoring. Maybe that stabilizes the ecosystem, and we’ll see what the younger AI startups like Kimi and Z.ai end up doing. No one wants to do this, but if they pool resources, they last an extra year. You get another order of magnitude if they all pool together, but I don’t see them doing it. Grace Shao (50:00) But that’s the thing we were just talking about, right? MiniMax and Zhipu, how can they possibly compete with the hyperscalers at this point if you need offshore data centers? And the fact that Zhipu is on the Entity List doesn’t help, right? It’s not going to be easy for them to access these data centers either. Nathan Lambert (50:12) Yeah, I think they can’t. I think they won’t. Human nature will make it so they won’t collaborate. They’ll just do something smaller. They’ll just have successful businesses that are different. Grace Shao (50:22) They just have smaller ambitions, want a smaller piece of the pie. Yeah. Okay, so you wrote something like, nothing’s a secret, but everyone wants Nvidia chips. They want it, they don’t know how to get it, they’re fighting over it. Nathan Lambert (50:34) Yeah. They’re the only thing that works for training. All the models are trained on Nvidia. I don’t believe the DeepSeek propaganda that it’s trained on Huawei. 
The only models that are trained on Huawei are tiny. Inference on Huawei works. Every lab is like, inference on Huawei works. The labs that don’t have meaningful inference are like, we are told to get Huawei, so we buy them, but we don’t use them. Earlier research labs are like, we don’t have any inference and we don’t have a need for Huawei. Any company that has meaningful use of their models has figured out how to run them on Huawei for inference, which, to Jensen’s credit, is like — it’s happening when he said it was going to happen, but it’s not that surprising. Grace Shao (51:11) Yeah, the Dwarkesh interview. I don’t actually understand why he got so much hate for it because even without your political stance, what he said actually made sense logically by saying, if you don’t sell them the crappier versions of what we have, they will have an equally quite crappy version to serve themselves, or they would just want... Nathan Lambert (51:28) I think they would buy both. Buying both is actually true. The amount of Nvidia chips that you would have to sell to China for them to stop buying Huawei — because Huawei is almost surely way cheaper because Nvidia margins are insane — when would they actually stop buying both? Grace Shao (51:43) But then you have to go on CANN. You have to reroute everything back on CANN. The developer ecosystem is not there. That’s Jensen’s point, right? Or the habits are not there. So I think that’s what, when I talked to a research lab... Nathan Lambert (51:50) Yeah. But I’m saying they would also use Huawei. I think they are so supply-limited, they would use both. Anthropic uses everything. A lot of companies in the U.S. will use multi-platform. Meta is a huge buyer of AMD. Demand is so high that any chip that is potentially viable on the models within a few generations is very valuable. And the fact that you can run some reasonably large model on any Huawei chip is a big line crossed for Huawei. 
I don’t know if they can produce the volume of chips and scale that quickly, especially as they try to move to lower nodes. That’s the standard semi debate. But the question is: can Huawei scale production? That’s the only question. And if Huawei can manage to scale production, Jensen will just look really right. If Huawei can’t scale production, Jensen will look a little bit like a lunatic, but it will be outside of his hands. Grace Shao (52:43) And we don’t really know what happened during this trip. It seemed like nothing really substantial happened after this big Trump delegation. It was more like a high-profile tourism trip versus an actual deal trip. Okay, I want to ask you something you wrote about that’s a bit niche, not something you usually write about. It’s on the SaaS side of things. You said that there’s a common argument that China struggled to monetize AI because they’re unwilling to pay for enterprise software. We looked at how China tries to monetize on consumer AI, but clearly that’s not really been proven yet. In your piece, you push back on the claim and say that there’s a distinction between SaaS spend and cloud or inference spend. Tell us about what you think about that ecosystem and how Chinese AI labs are trying to make money maybe a bit differently from American AI labs. Nathan Lambert (53:32) I don’t know if it’s necessarily different, but I ask a lot of researchers about this. They say that everybody is trying the new AI tools when they come out. If they don’t like them, they stop using them. If they like them, they keep using them on the consumer side. So something like Claude Code would be an example: tons of people tried it. I’m guessing lots of them churn in China, just like in the U.S., but consumers are very quick to adopt and try new things, but won’t stick if it’s not actually serving them. And then the enterprise is like: there’s definitely cloud that exists. Digital services are gigantic. 
They essentially think that there’s more runway for making money on AI models that falls into that. And they all use coding agents; they all use Claude. It’s a hilarious thing. They’re all very Claude-pilled. There’s almost no mention of Codex, where in the Western media, Claude versus Codex is this whole thing. They all use Claude. And that is obviously a paid service. So I think there are cracks in the argument, and I expect AI models to be seen as a bit of cloud, but potentially it is the thing that changes some of the expectations, where it’s just so transformative because they’re so competitive, and it could be seen as a bit of a phase shift. Grace Shao (54:41) Yeah, and I think it’s a generational shift, a phase shift. Also, actually, recently Doubao raised their prices on Seedance usage and whatnot, and it’s a shift into trying to capture the prosumer market. You can say the average uncle and auntie on the streets still don’t want to pay for a consumer app, but I think there’s more prosumer market share that could be captured in China, maybe not fully enterprise either. I want to ask you about government roles and geopolitics. I know there is a common narrative that usually people assume Chinese AI labs are heavily subsidized. Actually, when I was in San Fran in March, I was at a dinner with a couple of investors, mostly public investors, and one guy asked me, “Hey, are all labs just basically subsidized by the government?” I was like, definitely not. The majority of them are not. If not, they frankly don’t want to take money from the government. It was really hard for him to understand that, because I think the misconception is all Chinese labs or Chinese tech are just funded by the government. Kind of to our point earlier, where any affiliation to any government agency, just by default, is assumed to be therefore backed. First of all, the government, I don’t even know if they have that much money to give out. 
Number two, I don’t think that’s how competition works, right? So what’s your thought on all of this? Nathan Lambert (55:55) It seemed more like a provincial government trying to help the companies do stuff, which is like get offices, get talent. I don’t know what the provincial government can do. In Beijing, there’s Beijing Academy for AI or whatever, which is a real research institute that’s just funded by a certain neighborhood in Beijing. It was like, okay, the U.S. could do that. But much less of the Ant Group-style thing, which is government takes major ownership stake in an investment round and goes on. Maybe Kimi’s latest round, there were mentions of government-backed VCs, and I don’t know how that kind of intermediary works. So I still think it’s very indirect. And because the government system is so competitive across the different layers, each of those layers are competing to help the companies, but they don’t have piles of cash sitting around to buy GPUs. Grace Shao (56:44) No, they don’t. And they frankly don’t know what they’re doing half the ti
