Differentiated Understanding
In this episode, I sit down with John Wang, the co-founder of Assembled, to explore how AI is revolutionizing customer support. Having transitioned from a Stripe engineer to an AI startup founder, John shares his unique insights into the evolution of support tools. We delve into how these tools have shifted from being mere cost centers to becoming strategic assets that enhance customer experiences. John and I discuss the impact of AI on support volumes and staffing, highlighting how integration is reshaping the landscape. He emphasizes the importance of talent density and assembling high-caliber teams to drive success in the tech industry. Through his experiences, John provides practical insights into AI’s current capabilities and limitations in support operations.

We also explore the strategic considerations for future AI support ecosystems. John shares his thoughts on the role of support in driving revenue and customer satisfaction, and how AI can orchestrate with human support agents to create a seamless experience. His perspective on building high-performing support organizations offers valuable lessons for anyone looking to innovate in this space.

Every episode, I bring in a guest with a unique point of view on a critical matter, phenomenon, or business trend, someone who can help us see things differently. Season two will host a series of guests from early-stage investing, as well as builders, founders, and product managers.

For more information on the podcast series, see here. [https://aiproem.substack.com/p/launch-of-differentiated-understanding]

To find the previous episodes of Differentiated Understanding, see here.
[https://aiproem.substack.com/podcast]

Chapters
00:00 The Journey from Stripe to Assembled
02:25 Understanding the Importance of Customer Support
05:29 Lessons Learned from Stripe
10:25 AI in Customer Support: Current State and Future
16:04 The Economic Impact of Support Operations
18:25 The Role of AI in Transforming Support Jobs
24:30 The Future of Support Organizations
26:58 Guardrails Against Fraud in AI Support
32:42 Navigating the AI Ecosystem
38:00 The Value of Long-Term Commitment in Careers

AI-generated transcript

Grace Shao (00:00)
Hey, John, thank you so much for joining us. I just recorded your bio already. It’s extremely impressive. You’ve had quite a few different roles, and now you’re the co-founder of Assembled, right? To start, can you just tell us about your story? Like what inspired you to leave Stripe and go into what you guys are doing right now, which is software for people who run customer service support operations? You know, now you guys are pivoting into AI as well, or at least leaning into AI. Tell us about all of this.

John Wang (00:28)
Yeah, great question. When my co-founders and I started, we were all at Stripe. We worked on a bunch of different things at Stripe. And one of the last things that my two other co-founders worked on was a support tool, an internal support tool. And I remember pretty clearly that they were making a bunch of headway. It was really, really cool. And they had gone to this really, really high-up person in product. And this person was basically like, why are you guys wasting your time on this? You’ve been at Stripe for so long, you know all these things, and you’re doing support. Like, I’ve got this really cool Bitcoin project that I would love for you to work on instead. And I remember my co-founder coming to me and being like, hey, pretty bummed this is what happened.
And then I was like, wait, you just saved Stripe, you know, quite a few million dollars, increased customer satisfaction by 40%. And still they don’t understand the value of this. And that’s when we were like, hey, there’s something here where there’s a market opportunity. So that’s what got us really, really excited about support. We were doing it at Stripe. We knew it was an undervalued place. We didn’t see any very good tools out there to do support well. And so we decided to go build something really, really great in the support space. To transform and elevate support is our mission. Yeah.

Grace Shao (01:50)
Do you think it was just that Stripe was too rich, and they just didn’t care about saving a couple million dollars? Or do you think it was actually a blind spot for people?

John Wang (01:59)
I think Stripe was definitely very rich at the time. I think it was also a blind spot. It was a combination, right? Because most people, you think of support, you think of it as just a cost center. And I think recently that started to change in the sense that like, hey, this is actually a really important part of your business. But for a lot of companies, like if you look at fintech, if you look at a lot of health tech companies, their entire product is their relationship with their customers. And so support’s actually really, really important for that. And I think a lot of people underappreciated that for quite a while. And now I think people are starting to understand again, hey, if you piss off your customers every time they come and talk to you, that’s not going to be a very good thing. You better be a monopoly. Otherwise, you know, they might not be coming back.

Grace Shao (02:46)
Yeah, definitely. I think I want to kind of lean into that later in our conversation as well. It’s like people are trying to replace support and customer service with AI first.
But if anything, it’s not the best experience when you’re frustrated with a product and you keep on getting a robot, right? But I want to kind of talk more about your experience at Stripe. You were there quite early. What do you think it taught you, you know, as a very early employee at such a successful startup, if it’s even still considered a startup? And what were the lessons you learned there, even soft skills, that you took away to your current role as a founder?

John Wang (03:21)
Yeah, it’s a great question. You know, it’s really funny actually. I just met up with someone where, so when I was starting out of college, I had applied for all these jobs. I was able to get a lot of them, except for this one company that I really, really wanted to go to. It was called Meteor Development Group. They built open source software. In college, I had built open source software with Ruby on Rails. I was really big in that community. I was like, wow, it’d be awesome to go and make this something I do day to day. And I didn’t get the job. I was really bummed about it. And then I was like, I’ll just fall back on my second choice here, which is Stripe. And Stripe was the obvious second choice because the people were just really good. And now, like 10 years later, I think about that and I’m like, the business model is really important, because Meteor was not a good business. Like, open source frameworks is not a good business. But Stripe, really boring. Honestly, it’s just payments. You process payments, you go talk to Visa. We literally had a server in the server room that would send a specific file with specific tabs and spaces in order to get it out to Visa. Really boring. Really, really core infrastructure too.
And so the big overarching thing that I learned was, one, business model is unbelievably important, because if you can just make a good product when the market is there and when there’s a really big need, then this can scale unbelievably fast. Two was the people. I remember talking actually to a few people. Greg Brockman, who’s now a co-founder of OpenAI, was maybe the second or third person I talked to. And I remember just talking to him and being like, wow, this person is so, so smart. This is awesome. And I would go to the lunchroom and be talking to people at Stripe, and people were talking about all sorts of things. And I think talent density was a really, really big part of what made Stripe successful. And it wasn’t any one thing over time. It was, one, Stripe was in a great market. And then two, it iterated really, really fast on a lot of little things over and over and over again. So I thought that was a really good place to learn a lot about what makes a company great.

Grace Shao (05:49)
Yeah, I think it’s interesting you’re talking about talent density, and a lot of the AI labs I speak to actually also talk about that. But I’m curious, what does it mean when you have really strong talent? Is it that they are technologically superior, like they can code better? Or does it actually mean that they can think outside the box, they’re more creative, they can pivot faster? Like what does it really mean to have really high-caliber talent on your team?

John Wang (06:11)
I think it depends on what company or what you’re trying to solve, right? Like talent density for Los Alamos, like building the atomic bomb, is very different than talent density for Bell Labs, which is very different than talent density at early Stripe, which is very different also than talent density at OpenAI Research.
Like I think for Stripe in particular, the type of talent density that was there was really high curiosity.

Grace Shao (06:31)
Right.

John Wang (06:38)
Really high product thinking, really technical people, and people that could dive deep on certain problems and weren’t afraid to go talk to a bunch of customers. You saw so many conversations about like, how do we make this particular API parameter better for everyone? And like hours and hours and hours of making sure it was a really, really good product. And people who weren’t afraid to, you know, take a week of work and just dump it because it wasn’t quite there. So it was this combination of, they worked really hard, they’re really smart, and they care a lot about the end result and have a high quality bar. That was Stripe’s version of talent density. But I think, you know, if you look at the labs, if you look at different research institutions, maybe it’s just, I don’t know, the raw ability. Yeah.

Grace Shao (07:26)
Research capabilities or whatnot, right? No, that makes a lot of sense. Yeah, I wanted to ask you, earlier on in our conversation you said, you know, look, a lot of people overlook support. It’s not that glamorous. People kind of think it’s like a back office thing. But was that your view back then? How did you kind of, I guess, lean into this? And did your perspective on support change over the years? Now you say it’s very important, right? Did you understand the category correctly, do you think?

John Wang (07:52)
You know, I think that when we looked at the category, we went at it from kind of the lens of, Stripe was this company that worked in this unsexy space and did really, really great things. And we thought very similar things about support. It took us a long time to really grok support. And we talked to hundreds and hundreds of different people across different parts of the support stack.
And I think early on, honestly, it was good and bad in certain ways. It was like, we thought we could build a piece of software really quickly that solved everything. Or like, you have that problem, we can build that in two weeks. Not a big deal. And we could solve the specific problems that they had in two weeks. And I remember talking to a few people who were like, the system that you’re trying to build, which is called workforce management for support, that’ll take you seven years. And we’re like, no way. We can do this. We can do this so fast. It’s going to be done soon. And now, like seven or eight years later, we’re still working on it. We’re still uncovering more and more things. And that was probably the right call. But also there’s some importance to naivety, which is like, if we had known that, we wouldn’t have started.

Grace Shao (09:06)
That’s why a lot of people say, yeah, as founders, right?

John Wang (09:09)
Yeah, so I think it was the right thing to do, which is just like start building stuff.

Grace Shao (09:14)
It’s amazing. Why don’t we pivot into actually understanding your product a bit better? So for someone who has never worked in support ops, what is the simplest way to explain to them what Assembled does? Because even between us, we had calls, we had back and forth emails. I was like, John, I don’t understand what you guys do. I’m trying to read through this material. I’ve listened to a few of the interviews. I don’t know what’s happening. Can you just dumb it down for me and explain to me what exactly you guys do?

John Wang (09:37)
Yeah, for sure. Let’s say you have like 10 people on your support team and you only do email, then you probably just staff them nine to five, right? Like there’s no big deal there.
Once you start having a few more people on your support team, let’s say you have a hundred people now, and you might want to chat with your customers because AI chatbots are a really big thing, then you actually need to start thinking about when these chats come in and how many people you have to handle those chats, right? Because if you were talking to a chatbot, you’re getting instant responses back and forth, back and forth. And then you’re like, I have to wait 48 hours for the human response after I get handed off. That’s a really bad experience. So the problem is, you’ve got a bunch of people who are calling in to support, writing in, chatting in, and they’re coming in at all different times of the day, and they’re calling in for different types of problems, right? On the back end, you have a bunch of people, and those people might be able to do different types of things. Like I might be a really good person to handle, you know, where’s-my-money kinds of issues, but I might not be as good at fraud issues, right? Like if you’re having problems with fraud on your account. So there’s a lot of ways in which you can assign people to the incoming tickets. And what our platform does is it tries to match those two things up. If you think about supply and demand, supply is the people that you have and demand is your customers asking questions. And if you don’t match those up well, you’re gonna either spend way more money than you need to, because you’re just going to staff everything way above what you need, or you’re going to have a terrible customer experience, because it’s going to take you a really, really long time to get back to people. So it’s really an efficiency play. How do we make it really, really efficient for you to answer questions?
In the last few years, we’ve also added AI agents, which is, you know, how do you respond not just with people, but also with AI, to go and answer a chat or answer a phone call directly.

Grace Shao (11:50)
That’s amazing. I really didn’t know there was so much science going on behind that. I just thought, you’re on a chatbot, and usually you have your frustrating, like, get me someone, get me someone. I’m one of those people who has no patience and presses zero all the time. I’m like, get me a human. But it makes sense. Actually, once you can match the talent with the issue, it can be a lot more efficient in solving the issue, and the customer experience will be much better as well. On the AI agent side, what’s the kind of, I guess, consensus right now? Like, are they really actually good at solving issues? Are customers complaining about them? Like, how sophisticated are they at this point? Because in my day to day, you know, obviously calling the banks or DHL for pickup or package returns, whatnot, none of those agents are really a pleasant experience, frankly.

John Wang (12:35)
Yeah, I think this depends pretty drastically on what tools you give these agents access to. I would say that the standard experience right now is fine. It will answer knowledge questions for you. And these can solve anywhere from 30 to 60% of incoming issues, depending on how many knowledge questions you get. The place where it really is important is when you actually give it access to, say, your backend database, and it can make a refund or look up an order or identify what’s going on with this error, right? And that is actually the hard part that prevents most of these bank and airline agents, et cetera, from being very good, because that access to data is a thing that they need to actually run and be able to perform actions.
And then also the evals for that are really, really hard and not something that you just launch without really thinking about it. So I’d say it’s in a progressing state, not at a place where it’s like, this is absolutely solved. But there are also some of our customers who have 90, 95% of all issues completely automated, because they’ve spent the time to give access to all of these systems and spent the time to validate that the agents are performing.

Grace Shao (14:00)
Very interesting. I want to pivot into the B2B kind of angle. Who are you guys actually selling to? Like who are the people inside companies that are managing this? Is it the head of support operations? And when they are buying and assessing a product like yours, is it really winning on price? Is it, you know, over other, maybe larger, software? Is it winning on speed and service? Like help us understand essentially how you guys are succeeding in winning over customers.

John Wang (14:26)
Yeah, we generally sell into the head of support. Sometimes that person rolls up into the COO, or there’s a head of operations or something like that. But generally there’s some group that is working on support-related things, and that’s who we sell into usually. I think generally the way that we actually go and sell this is, one, we know all about workforce management, which is a really, really nitty-gritty detail about how you make your systems really good. And it can save you millions and millions of dollars. And this is one of the things that’s really funny: using our AI agents versus using our workforce management, we actually see somewhat similar gains across those two. Because to use the AI agents, you’re usually doing it so that you can reduce headcount, right? And in order to reduce headcount, you need to know how much you can reduce headcount without hurting your customer experience.
And for that, you generally need something like workforce management. So what we do is we go in, and usually we have workforce management helping you understand how your system is set up. And then the AI agents that we can also bring in are a relatively easy sell, because our AI agents are really, really connected to how you staff and when you pull in people. The thing that you were mentioning, which is like, hey, my bank still doesn’t have a very good experience, that’s true of a lot of places. And getting access to information is really hard. So escalating to a human actually happens pretty frequently, sometimes 20, 30, 40% of the time. So getting to the right human, or figuring out when to escalate to the right human, is a really, really important skill to have. If you spend a million dollars a year with me, I should escalate you much more quickly than if you are a free user and you haven’t spent any money with me ever. Similarly, depending on the topic, depending on what kinds of things you’ve already previously talked to me about, I should be able to get to different types of agents, and I should be able to have different levels of thresholds that send me to a human. And I think because we handle the workforce management side, our ability to manage the handle time and to make sure that you’re getting to the right person is much better than a lot of our competitors.

Grace Shao (16:45)
So there’s an unspoken tier system then, I guess, with customer service as well that we don’t realize. How should we think about the economic importance of support operations? We always think of it as, like we said, back office support, but do you have any proof that basically better customer service equals better revenue?

John Wang (16:52)
There is, and sometimes it’s spoken. But yeah. You know, that’s a good question. I should probably have some specific proof here.
I guess the best anecdotes I can find are usually the medium- to long-term anecdotes, where companies that do not invest in their customer support tend to, you know, regress to the mean, right? Like if you are really trying to bare-bones your way through customer support, your customers will understand that, and it’s not going to affect your revenue right now, but it will likely affect your revenue in 6, 12, 18 months, the next time that purchase happens. We have seen actually some of our customers, so in our AI agents, we have a configurable setting that’s like, do you want to be containment focused or do you want to be escalation focused? And how good of a customer experience do you want? And we’ve generally seen actually that there’s a strong correlation. Obviously we haven’t run a natural experiment or a true A/B test with this because it’s pretty impossible. But you see a general correlation between the customers that spend more money on support, the customers that spend more on trying to have a high-quality experience, and the revenue growth of those companies. Actually, most of the customers that spend a lot of money on support care so much about it that they have, you know, executive briefings every week about what’s going on. And they’re the people who have the largest support teams, and they’re the people who make the most changes with their team. Obviously this is a very biased perspective from our set of customers, but I think that there’s still something to that, where if you spend money and if you want to make your support really, really good, that does tend to pay off with customers, because they do tend to notice, and it makes it easier from a product perspective to paper over all of the things that aren’t so great.

Grace Shao (19:06)
Yeah, no, totally makes sense.
I think even as consumers ourselves, we would likely be turned off by certain brands or experiences if the customer service is really bad, right? Unless, like you said, they’re a monopoly and there’s nowhere you can go. All right, let’s talk about AI. You kind of touched on that earlier, but the naive view is that AI automates support. You know, a lot of the conversations right now are about, my God, jobs are going to be taken, especially the first batch, which is probably in roles like operations and customer support roles. The second batch, people are saying, is maybe in more repetitive execution roles like junior consulting roles, a lot of junior training-up roles, right? How do you see that? Because at least from where I sit in Hong Kong, a lot of stories are coming out saying markets like India, the Philippines, you know, across Southeast Asia, where they traditionally served as those telephone call centers or operational centers, they are getting cut. Is that going to be a trend going forward? You know, how should we understand this?

John Wang (20:02)
Yeah. I think there’s a few things. Like with all things, there’s a lot of nuance to this. I think the trend you’re seeing in what we call tier one support, the first line of support, who are traditionally outsourced humans, that is a place where we’re seeing a lot of change. And I don’t think that trend is going to slow down. That said, there’s a very interesting other trend that we’re seeing, which is that total spend on humans and headcount isn’t necessarily going down by that much. And it’s kind of like the Jevons paradox, where we see a lot of our customers, and a lot of other AI users, who have amazing resolution rates. They’re answering so many questions, but that’s actually causing, or maybe there’s some correlation here, the number of tickets and the number of chats they’re getting to be way, way, way higher than before.
And I think there’s a few parts to this. One is you see way more ability for your AI agents to answer questions. And so obviously people are going to ask more questions, because like, hey, it used to be really hard for me because I had to literally type out an email to a human, wait a few days, and get an answer. And now I can just get an instant, really good answer. Right. So I’m going to try asking more questions. The second thing is, as these companies do better and better, you actually just have this natural induced demand of increasing usage, numbers of people who are asking for support. So as a higher amount of support is automated, the general amount of support coming in is also very high. And so that actually offsets a very, very large portion of the headcount. The headcount is changing though. It’s not going to be the typical tier one support where it’s just like, answer an easy question. That is mostly going to go away to AI, I think. The types of headcount coming in are, you know, internal agents, people who are really good, people who can provide white-glove support and actually go talk to people and provide a human experience, because our companies still want and really crave giving that experience to people. And that’s just not what the BPO standard really is. So I think it’s changing in the type of what you would see.

Grace Shao (22:31)
Yeah, I was actually going to ask about that, as in, when AI agents start resolving more tickets, whether we’re just going to see a reduction of headcount. And I think you answered it in one way, which is like, yes, in the initial stages, but later on, there will be new jobs created, right? Essentially, people who will be managing more critical issues or even managing the agents. I want to understand.
So for your company right now, essentially, are you like a middleman between the human agents and the AI agents, becoming the orchestration layer? Like you’re providing the service, the training, and the orchestration. How do we understand that?

John Wang (23:05)
Yeah, that’s a great question. So we think of ourselves as, how do we get you to the right way to answer your question, right? In our view, there’s kind of three main types of people that can answer a support question. One, it’s AI. Two, it’s a tier one BPO’d outsourced agent. And three, it’s an internal agent who’s super well-trained, really caring about and really trained on customer support. And what we are trying to do is make sure that you get placed in the right area, depending on what kind of issue you have, who you’re talking to, and what the queue is that is backing up for the set of people who need calls. So for us, what we’re trying to do is really provide you that ability to choose across a bunch of different options. So we don’t provide any BPO agents, we don’t provide any internal agents. All we do is provide the software that routes you. And we also provide the software that can do the AI agents, or you can actually plug into a different piece of software if you want to have your own AI agents too. We’re trying to make sure that we are kind of third party and that we are making it really easy for you to optimize your support regardless of what specific providers you use.

Grace Shao (24:30)
So in your view, what does a well-run support organization look like in, let’s say, three years, as AI adoption becomes mainstream or more mass market?

John Wang (24:38)
I think you’ll probably want to have all of the different types of support using AI. So voice AI, chat AI, email AI. I think you’ll want to have a lot of nuance between the different types of customers that you have.
You can’t generally provide the best level of support for literally everyone. Though this depends also on your customer base, right? Like a consumer customer base versus a super enterprise customer base with 100 very large customers is completely different. But let’s say for a standard company that might have ACVs that are in the, I don’t know, the 100 to a couple tens of thousands range, then you’re probably going to have a combination of AI agents and human support. And you might have different tiers of human support, right? Some human support that’s really good at answering support questions, and other tiers of human support, which is like, you’re just managing the agent. I think the other thing that’ll happen a lot is you’re gonna start to see more, like, support agents acting in a simulation, where right now the typical flow is a support agent gets a ticket and they answer it and it goes back. I think as the agents get more and more training data, get access to more information, really they’re only gonna come to humans for escalations. And it’s similar to how Waymo works. If you’ve ever taken a Waymo, it’s a great experience, you’re like driving, driving, driving, and sometimes you get kicked out and a human operator in the Philippines is like, hey, I need to move you around this truck, right? And similarly for support, that’s probably what’s going to happen. A human operator is going to come in and be like, hey, I can give you a refund right here. And then what’s going to happen is the AI agents are going to train on that, right? They’re going to learn and get better. And you’re going to be able to use that whenever you have an interruption to understand, why did I have this interruption? How do I make my model better for the future? And then you’ve got your closed loop.
So I think in the future, you’re going to see much more of that happening than people who are just coming in and their job is to solve as many tickets as possible. I think the change is gonna be like, okay, people are gonna start to need to provide the best possible response in that particular instance so that the models can train on that and be as good as you are.

Grace Shao (26:58)
Actually, just on that, do you think then we’ll see more and more fraudulent activity or people trying to exploit that? Like if, say, you know the model’s trained so that when I say this one buzzword or one keyword, it triggers a refund. What if I just go on the phone all the time, just to be like, keyword, keyword, you know? And then how do we prevent something like that? Or do you guys kind of get involved in building these guardrails as well?

John Wang (27:21)
Yeah, no, that is an age-old question. I think, like, I wouldn’t say there is going to be necessarily more or less of that, but I think it’s kind of like the cat and mouse game of, everyone has always been doing that. And the methods always change every few months. I think the methods will change every few months here too. Our AI agents have a lot of guardrails put in place to automatically detect that. And we also have post-hoc guardrails, which are scanning through our logs and trying to identify situations where that might have happened. And we’re also training on those examples, right? So I think, yes, people will definitely start to exploit this and be like, hey, how do I get a refund faster? But there’s a ton of guardrails that you can put in place. For example, each account can have one or two refunds without anyone looking before it actually gets flagged and needs to go to a human. Or
You can also set good policies. For example, if it is within a policy window of 30 to 40 days after purchase, automatic refund; otherwise, flag it and do something with it. So there’s a ton of stuff that you can do to actually reduce the possibility of that. And I do think that it will end up being a cat and mouse game over and over again, as people get more sophisticated.

Grace Shao (28:39) Right, right. And they’ll start using AI to trick AI. That’s what’s scary, right? So as we talked about, Differentiated Understanding, the podcast, does not have to interview anyone related to China or Asia, but we do have kind of an Asia angle to a lot of how we view the world. So my question for you really is, because you’re out in San Fran and your company actually has no sales in China or anything, I actually had a curious question. How does SF

John Wang (28:43) Totally, yes.

Grace Shao (29:05) as a whole, the startup ecosystem, view the current rise of a lot of Chinese AI? And have you guys yourselves, or your peers, tried to use Chinese open source models over the years? Is there any view on the open source models, given that you previously said you were very involved in open source, and I think it’s part of your philosophical belief as well, right? So just the high-level vibes.

John Wang (29:28) Yeah, our vibes might be different than at the model labs, like the frontier labs, honestly. We love the Chinese open source models because it adds more competition. And I think the open source models are actually very, very good. I think my friends at OpenAI and Anthropic don’t like it quite as much, because it’s competition. But for us, we have no allegiance really to any of the frontier labs or any of the models that are out there. We want to provide the best possible experience to our customers at the best possible price.
And that has meant, over the years, making changes in our models, making updates, and figuring out what that frontier of cost versus performance is. The Chinese models tend to perform really, really well on that, especially

Grace Shao (30:12) Mm-hmm.

John Wang (30:19) the latest series of models. We’ve actually spent a lot of time in the last six months pulling out a lot of our tokens. We have tens of billions of tokens per day, and a lot of it now goes to models like Qwen or Kimi. And that has actually started to really, really increase over time, mostly because you can fine-tune them, you can do RL on them, you can have better latency on them, you can run them on your own hardware. There’s just so much more stuff that you can do with them. And also, the cost, performance, and latency trade-off is really, really good. Now, the strategy we take is actually one where we try to understand the use case and the problem and what type of model is necessary for that. So for the main model that’s actually answering questions, we’re usually using a frontier model. But the majority of our tokens actually come from secondary processes: detecting if I need to escalate, detecting if there’s fraud here, detecting if there’s adversarial intent, making updates to large swathes of data in batch, all this other stuff where you really don’t need frontier-level intelligence, and where, if you have a well-tuned prompt and an open source model, or an open source model plus a fine-tuned model, you can get at or better than frontier performance. We’ve really seen that, and we’ve actually been able to save millions on our token costs in just the last two or three months by being very smart about how we use our models. And we’ve also seen a 15 to 20% increase in quality.
Just because, when you go and you have evals, you can make things much, much better more quickly with these open source models.

Grace Shao (32:14) Yeah, I think that’s the general sense I get from a lot of startups, right? You guys are obviously more cost conscious: what is the best price to get to what you need? And there’s a tier system in how you use the models; you might not use the most frontier models for everything. I think that makes a lot of sense, business sense especially. Is there anything you would like to share with us that we haven’t touched on in terms of the overall AI ecosystem? Any thoughts on where we’re going with this AI agentic push right now? Are we really going to see that, you know, giant moment? Just some high-level thoughts.

John Wang (32:49) Yeah, that’s a good question. Recently, I’ve been thinking a lot about Opus 4.7, which got launched a few days ago. And it’s actually kind of similar to what we were just talking about in terms of this price-for-performance ratio. It seems like, based on my usage, based on our evals, based on other people’s usage on the coding side, that it’s a better model, but it is also more expensive than before. So it’s literally a trade-off in terms of dollars and intelligence. And it’s really interesting because, a year ago, every model would just be like, this is strictly better, and it’s probably cheaper, and you’re going to get more context, and everything’s better. You could basically just bet that you were going to get better models across the board. And now you’re just moving from this part of the frontier curve to the other part of the frontier curve without actually shifting the entire curve. And that’s happening with a few more model releases.
You still see general increases in the frontier, but it’s less stark with every single model release. And so I think it’s just an interesting area to look at, because when you get into that world, gross margins become really important. Gross margins for ourselves as a startup, but also gross margins for Anthropic and OpenAI. One of the funny things that I’ve seen, just talking to people who are working at Anthropic and OpenAI, and also people who are trying to invest in those companies, is that gross margins are actually incredibly important. OpenAI right now is becoming a much more... hand investment than before. It used to be, like six months ago, it’s like, you have shares of OpenAI? How do I get in? Now it’s completely different: you have shares of Anthropic? How do I get in? And I think part of that’s because OpenAI wants to spend $100 billion on infrastructure, and Anthropic is a lot more measured in the way that they’re spending money. And I think gross margins actually do matter a lot right now. And that’s where I think Chinese open source models are making a big difference, because at the end of the day, you still have to make money. If you’re losing money on a per-token basis, that’s really bad, because if you go to infinity, you lose infinity money. And if you make money per token, great. Ramp usage up as high as you can.

Grace Shao (35:03) Yeah. It’s just so crazy how the sentiment shifts, like every three months, I feel. And to your point, whenever I speak to investors, it’s like, oh my god, I got my hands on some Anthropic shares, and then three months earlier,
Oh my god, I got my hands on OpenAI. And now it’s like, oh no, no one would invest in OpenAI right now, I don’t want to do that. People just completely go black and white on these things. It’s pretty crazy how the pendulum swings. I do have a question, actually, on the infrastructure side. Doesn’t it actually make sense for OpenAI to eventually own their infrastructure? Because otherwise they have to continuously pay the hyperscalers for all the infrastructure. So in the grand scheme, wouldn’t it make sense? Although obviously how much you’re spending is absolutely crazy.

John Wang (35:55) I think it actually does. And I think that’s part of the problem, which is, if you think about what their compute costs are, I think doing all of these things that they were doing makes perfect sense. And it especially makes perfect sense if you have investors who are willing to bankroll this. But it’s almost like, what’s that paradox? It’s like the St. Petersburg paradox, something like that, where your expected value is infinity and you keep doubling your money, basically, but at some point you need to not double your money because you don’t have enough money.

Grace Shao (36:32) That’s such a... I’m still confused when you’re saying that. Go back. You keep on doubling your money?

John Wang (36:36) Sorry. Let me look this up. The St. Petersburg paradox is, okay, it’s a coin flipping game, and you start at two dollars, and with every tails you double the pot, and you can basically decide to take your money at any time, right? And so you’re doubling exponentially as you go up. And if you compute the expected value, you should basically just keep going forever, because your expected value is infinite, right? Because the doubling of the pot is better than what your losses are.
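[Editor’s note: For reference, the standard statement of the St. Petersburg game differs slightly from the version described here. The pot starts at $2 and doubles on each tails, and the game ends at the first heads, when the player collects the pot; the player does not choose when to stop. Under that assumption, the expected payout can be worked out as follows, where k is the flip on which the first heads appears and n is a hypothetical cap on the pot of 2^n dollars, introduced only for illustration.]

```latex
% Uncapped game: every term of the sum contributes exactly 1
\mathbb{E}[\text{payout}]
  = \sum_{k=1}^{\infty} \Pr(\text{first heads on flip } k)\cdot 2^{k}
  = \sum_{k=1}^{\infty} \frac{1}{2^{k}}\cdot 2^{k}
  = \sum_{k=1}^{\infty} 1
  = \infty

% Capped game (assumption: the pot can never exceed 2^n dollars)
\mathbb{E}[\min(\text{payout},\, 2^{n})]
  = \sum_{k=1}^{n} \frac{1}{2^{k}}\cdot 2^{k}
  + \sum_{k=n+1}^{\infty} \frac{1}{2^{k}}\cdot 2^{n}
  = n + 1
```

[With a cap of roughly $1 billion (n of about 30), the capped game is worth only about $31. This is the usual way of expressing the point being made in the conversation: an infinite expected value on paper does not survive a finite bankroll.]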
You’ve just got to run. And I think OpenAI is in this St. Petersburg paradox where it’s like, well, in theory, double everything, keep going. But in practice, you don’t have enough money

Grace Shao (37:17) Yeah, I see what you mean.

John Wang (37:25) and resources to be able to do that. I think that’s actually what’s happening: there’s not enough money in the world, not enough investors with liquid cash who are willing to invest in a business as big as OpenAI while the gains and the returns are still, yeah, not proven. So I think it’s both rational, but also, practically, very hard to make what they’re doing work.

Grace Shao (37:41) Haven’t been proven. Yeah, totally. Okay, I want to ask you one last question, which I ask every single guest. What is one differentiated view you have? It could be on your own sector, industry, life.

John Wang (38:00) Man, I have one that is very controversial. I don’t know if I should talk through that one.

Grace Shao (38:07) You’re going to get doxxed and hated after this. Okay, tell me that one after, I want to hear it.

John Wang (38:17) Yeah, yeah. Let’s see. I would say, I don’t know if this is differentiated now in the market or not, but the thing that I’ve been thinking about recently is that you should stay somewhere long enough to see your mistakes through. And I think it’s slightly differentiated right now, because in Silicon Valley, at least, you’ve got people who are jumping between big labs, who are jumping between different startups, where it’s like, hey, I can make the next $5 million by going to this next thing. There’s just a huge amount of opportunity, and there’s a ton of opportunity cost to staying somewhere for a long time. And at the same time, I think staying somewhere for a long time is actually one of the best things that you can do for your own learning.
And it gives you a better shot at making the long-term massive gains that you could have. Like, $5 million is amazing for someone. But if you want to build your own startup, if you really want to change everything, and you jump around between companies every year or two, you’re probably not actually going to learn a lot. And you’re probably not in a position to make really hard decisions and then have to see those hard decisions through, and then be able to learn and see the feedback from those hard decisions. Especially if you’re at OpenAI and you’re like, no, investors don’t want this anymore, and you jump ship to Anthropic. I don’t think you’re going to get the learning that you really need to get.

Grace Shao (39:46) Yeah, yeah. I actually agree with that. I think it also took me some time and experience to realize that, because when we’re all young, you’re really excited, right? It’s like, this looks cool, that looks cool, this person hates me, I hate that person. You take everything very personally, and we’ve all heard these stories from peers, even ourselves. But what is the threshold, though? Because the other side of the argument is that you see people who’ve been in a job for like a decade, and clearly they’re frankly not moving up in a very corporate-structure way, or even intellectually growing, or even excited about their job anymore. You know, the joke is, okay, this sounds un-PC, but the 45-year-old VP that’s been a VP for the last 15 years at banks. We have a lot of these. So what happens? When is it best for them to actually maybe jump? Or some say that was a lifestyle decision, where they want to take it easy so they have some more time for kids. Fine.
But taking that into consideration, wouldn’t it sometimes be better to jump, to try something new, to take risks?

John Wang (40:50) I think if you are in a place where you’re unhappy with... so I will caveat this with: you should only stay if you’re excited about what you’re doing, and you’re learning continuously, and you’re surrounded by great people. If those three things aren’t true, yeah, it’s really hard to fly.

Grace Shao (41:05) Which is so hard to find. You were very lucky at Stripe, right? Like you said, you were just surrounded by very high-caliber, high-agency people, but not everyone can get all those things at the same time. But no, that’s great. Thank you so much, John. I had a lovely time chatting with you. I still want to follow up later on what the unspoken differentiated view was. All right, thank you.

John Wang (41:17) Yeah. Let’s do it. Let’s do it. Thanks, guys.

Get full access to AI Proem at aiproem.substack.com/subscribe [https://aiproem.substack.com/subscribe?utm_medium=podcast&utm_campaign=CTA_4]