Justified Posteriors

Podcast by Seth Benzell and Andrey Fradkin

Explorations into the economics of AI and innovation. Seth Benzell and Andrey Fradkin discuss academic papers and essays at the intersection of economics and technology. empiricrafting.substack.com

Ioana Marinescu on Insuring Workers for AI, Monopsony, and Philosophy

This week we’re joined by Ioana Marinescu [https://marinescu.eu/], labor economist at the University of Pennsylvania’s School of Social Policy & Practice [https://sp2.upenn.edu/person/ioana-e-marinescu/], former Principal Economist at the U.S. Department of Justice Antitrust Division, and a member of Anthropic’s Economic Advisory Board [https://www.anthropic.com/economic-index]. Ioana is one of the people who put labor-market monopsony on the antitrust map, and she’s now thinking hard about what the social safety net should look like if AI hits the labor market the way the optimists (and the doomers) say it might.

We start with her Digitalist Papers essay [https://www.digitalistpapers.com/vol2/marinescu], which proposes a flexible, two-tier toolkit: AI Adjustment Insurance (extended unemployment benefits + retraining + wage insurance, modeled on Trade Adjustment Assistance) for the churn scenario, and a scalable Digital Dividend (a broad-based cash transfer funded by a small tax on the digital sector) for the world where the jobs don’t come back. Along the way: whether to make policy now or wait, what counts as the “status quo,” moral hazard in mass unemployment, the TAA wage-insurance result that repaid its own subsidy, and Andrey’s “we can’t afford UBI” pushback.

Then we get into her new model with Konrad Kording, “(Artificial) Intelligence Saturation and the Future of Work” [https://papers.ssrn.com/sol3/papers.cfm?abstract_id=6051694]: why splitting the economy into an intelligence sector and a physical sector implies that output and wages saturate even as AI scales to infinity, the robots-vs-LLMs debate, and whether to just relabel “physical” as the non-automatable sector.

We close with her DOJ years: defining monopsony, the transmigrante used-car collusion-and-murder case, the Penguin Random House–Simon & Schuster merger (yes, Stephen King testified), antitrust and AI, and a lightning round on ikigai, Camus, and Rawls vs. Mill.

Links & References

Ioana’s work

* marinescu.eu [https://marinescu.eu/]: Ioana’s website · Penn SP2 faculty page [https://sp2.upenn.edu/person/ioana-e-marinescu/]
* Ioana Marinescu, “Resilient by Design: Dual Safety Nets for Workers in the AI Economy” [https://www.digitalistpapers.com/vol2/marinescu], in The Digitalist Papers, Vol. 2: The Economics of Transformative AI (volume [https://www.digitalistpapers.com/volume2])
* Konrad Kording & Ioana Marinescu, “(Artificial) Intelligence Saturation and the Future of Work” [https://papers.ssrn.com/sol3/papers.cfm?abstract_id=6051694], working paper (Brookings write-up & interactive tool [https://www.brookings.edu/articles/artificial-intelligence-saturation-and-the-future-of-work/]). The model finds wage growth can reverse once roughly a third of intelligence tasks are automated.
* Ioana Marinescu, comments on Betsey Stevenson’s chapter [https://www.nber.org/system/files/chapters/c15320/c15320.pdf], in NBER, The Economics of Artificial Intelligence: An Agenda (the ikigai discussion)

Concepts, papers & people discussed

* Trade Adjustment Assistance (TAA) [https://en.wikipedia.org/wiki/Trade_Adjustment_Assistance]: the template for Ioana’s adjustment insurance; the wage-insurance component that got people back to work faster and was net fiscally positive
* Betsey Stevenson, “Artificial Intelligence, Income, Employment, and Meaning” [https://www.nber.org/system/files/chapters/c15319/c15319.pdf]: the post-AGI meaning / ikigai argument Ioana was commenting on
* “GPTs are GPTs”: Eloundou, Manning, Mishkin & Rock, GPTs are GPTs: An Early Look at the Labor Market Impact Potential of LLMs [https://arxiv.org/abs/2303.10130], the occupational LLM-exposure measure (“Eloundou et al. / Daniel Rock”) correlated with COVID-era telework
* Pascual Restrepo: job-market work on skill mismatch and structural unemployment during automation waves
* Daron Acemoglu & Pascual Restrepo, “Robots and Jobs: Evidence from US Labor Markets” [https://www.journals.uchicago.edu/doi/10.1086/705716].
* Albert Camus, The Myth of Sisyphus [https://en.wikipedia.org/wiki/The_Myth_of_Sisyphus]; ikigai (Japanese: “reason for being”)
* Baumol’s cost disease [https://en.wikipedia.org/wiki/Baumol_effect]
* John Rawls and John Stuart Mill (Utilitarianism)

Antitrust & the DOJ

* The DOJ Antitrust Division, monopsony in the labor market, and the 2023 Merger Guidelines [https://www.justice.gov/atr/2023-merger-guidelines]
* Judge blocks the Penguin Random House–Simon & Schuster merger [https://www.npr.org/2022/11/01/1133375227/federal-judge-blocks-penguin-random-house-from-buying-simon-schuster] (2022) on a labor theory of harm to authors; Stephen King testified for the government
* The transmigrante [https://www.justice.gov/opa/pr/eight-individuals-plead-guilty-wide-ranging-scheme-monopolize-transmigrante-forwarding] used-car export case: collusion (and worse) in the US-to-Latin America used-car trade
* Anthropic’s Economic Index [https://www.anthropic.com/economic-index] and Economic Advisory Board
* Leopold Aschenbrenner’s Situational Awareness [https://situational-awareness.ai/]: the “we’ll have to nationalize it” argument referenced on consolidation

Previously on Justified Posteriors

* Our episode on the Anthropic Economic Index.

Our sponsor

* This episode is brought to you by Revelio Labs [https://www.reveliolabs.com/], the leading provider of labor-economics data, available to academics on WRDS [https://wrds-www.wharton.upenn.edu/].

Chapters

* (00:00) Intro & sponsor
* (00:47) The Digitalist Papers proposal: a flexible safety net for the AI labor shock, and why make policy now
* (03:48) Why unemployment insurance isn’t enough, and the Trade Adjustment Assistance template
* (05:51) What counts as the “status quo”? Banning AI vs. letting it run
* (07:42) How much to insure: moral hazard, mass unemployment, and the three parts of AI Adjustment Insurance
* (11:15) Skill mismatch (Restrepo), and how do you certify a layoff was “due to AI”?
* (14:45) Did TAA buy social buy-in for free trade? Underfunding, and the wage insurance that repaid its own subsidy
* (16:38) “Would Hillary be president?” General-equilibrium pushback and the ski-instructor problem
* (19:28) Will the new jobs still be there in two years? The lump-of-labor fallacy
* (22:09) Policy B: the Digital Dividend (unconditional, broad-based cash from a small digital-sector tax)
* (23:52) How to fund it: a sales tax, a sovereign-style fund, and deliberately slowing diffusion a little
* (26:00) “We can’t afford UBI”: productivity growth, 0.5% vs. the deficit, and setting money aside ex ante
* (30:47) Taxing digital goods: VPNs, evasion, and land-value taxes
* (34:23) The motte-and-bailey worry, and the other reasons to like UBI
* (36:05) The new model, (Artificial) Intelligence Saturation: intelligence vs. physical sectors, and the telework × AI-exposure correlation
* (40:14) Gross complements: why output and wages saturate even with infinite intelligence
* (42:23) Won’t enough intelligence just automate the physical world? Robots vs. LLMs
* (45:52) “15% by 2030”: humanoid robots, cost, and bespoke vs. general-purpose machines
* (47:58) Baumol, the “humanness sector,” and relabeling physical as the non-automatable sector
* (48:52) The capital-share / profit-share puzzle: if they’re complements, why has the intelligence share risen?
* (50:25) The DOJ years: monopsony, and what the Antitrust Division actually does (mid-roll sponsor at 51:29)
* (54:52) “Assassinating rival CEOs”: the transmigrante collusion-and-murder case
* (58:12) Favorite cases: Stephen King, the publisher merger, and the chicken-farmer monopsony settlement
* (1:01:30) Antitrust and AI: foundation models, consolidation, and the natural-monopoly question
* (1:06:05) Slowing AI by allowing market power; Leopold, nationalization, and diminishing returns vs. the singularity
* (1:09:27) Substitutability, the AK economy, and short-run vs. long-run wages
* (1:10:59) Lightning round: ikigai, Camus, and the myth of Sisyphus
* (1:12:44) Can we build market-like mechanisms for ikigai? Loneliness and coordination costs
* (1:14:13) The Anthropic Economic Advisory Board and the Economic Index
* (1:15:21) What’s next: monopsony and industrial policy
* (1:17:59) Favorite philosopher: Rawls vs. John Stuart Mill
* (1:19:45) Sign-off

Justified Posteriors is the podcast that updates its beliefs about the economics of AI and technology, hosted by Andrey Fradkin [https://www.andreyfradkin.com/] and Seth Benzell [https://www.sethbenzell.com/]. If we changed your priors, subscribe [https://empiricrafting.substack.com/], share it with a friend, and keep your posteriors justified.

Intro & Sponsor [00:00 – 00:47]

[00:00:06] Seth: Welcome to Justified Posteriors, the podcast that updates beliefs about the economics of AI and technology. I’m Seth Benzell, excited to learn about what AI is other than what my bubbe says after I spill hot water on her, coming to you from Chapman University in sunny Southern California.

Andrey: And I’m Andrey Fradkin, coming to you from San Francisco, California. We’re very thankful to our sponsors at Revelio Labs, purveyors of fine data products. And we’re very excited to have Ioana Marinescu join us today. Welcome to the show, Ioana.

Ioana: Thank you. I’m so glad to be here.

Make Policy Now: A Flexible Safety Net [00:47 – 05:51]

[00:00:47] Andrey: To get started: you have this very provocative, interesting piece in the Digitalist Papers about various social policy solutions for transformative AI scenarios. Could you tell us about the piece?

Ioana: Absolutely. As part of doing this Digitalist piece, I was thinking, as somebody who has worked a lot on the social safety net: what do we do if AI leads to a lot of job loss, like many people are saying it would?
We’ll talk later about the various scenarios, but assuming that’s at least a possibility we have to acknowledge, what would you want to have from a policy perspective? And so I was really thinking hard about devising a flexible policy toolkit that will be able to address issues in the labor market no matter how big the shock is. That was the overarching theme of the policy design I’m proposing, just to start a discussion. I’ve tried to propose some helpful options, but it’s really with the idea of, let’s talk about doing something like this, what are the pros and cons.

[00:02:10] Andrey: So what are the options on the menu? Let’s say AI comes along, a lot of people lose their jobs. The first thing we should get started with: do you think we should be making policy today, or should we wait until something happens and then make policy?

Ioana: I think it’s very important to make policy today, but in a flexible way, meaning the policy cannot depend on some very specific detail of exactly how AI is going to impact the labor market, because we don’t know exactly what’s going to happen. It’s important to put the policy in place today because the political process is very long, so it may not be able to come online quickly enough when we really need it. That’s one reason. The other (and I work a lot on social insurance) is that workers want to and should feel insured. “Whatever happens, we the government have got you covered.” If we don’t have that, and we’re just waiting for bad stuff to happen, that defeats the purpose of having a social safety net. That’s a core reason I think it’s good to have something in place sooner rather than later, even before all the effects of AI on the labor market have materialized.

Seth: Something that will automatically kick in.

Ioana: Exactly.

Andrey: We do have some programs like that, like unemployment insurance. Why is unemployment insurance not enough in its current form?

Ioana: Unemployment insurance is incredibly valuable, but if we have a big shock like AI, it’s going to affect a lot of people who will not necessarily lose their job forever, but simply have to change jobs, and that’s very costly. The whole purpose of the social safety net is to help people through those transitions. The thing is, we have AI, and the way it’s being deployed is a policy choice. We could say we’re going to try to stop AI, but we’re not doing that (and I’m not saying we should or shouldn’t, just that it’s a policy choice). We’re saying we’re not going to stop AI, we’re going to let it be. But then some people are going to get hurt, at least in the short run, and we need to do something so those people have something to fall back on. Just like with trade: we decided to have free trade, we knew some people were going to get hurt and lose their wages, and we put in place policies like trade adjustment assistance, which inspired some of my proposals, to make sure the policy we’d chosen wouldn’t leave people on the side of the road. That policy hasn’t completely worked, because it was underfunded, but the big point is: the technology is exciting and has a lot of benefits, we’ve decided to deploy it quickly, and some people are going to bear a cost. We just want to make sure we help those people.

What Counts as the “Status Quo”? [05:51 – 07:42]

[00:05:51] Seth: I’m really excited to hear your specific ideas, but I’m curious about this framing of what counts as the status quo and what counts as the policy shock. In the case of trade, you might say the status quo is protectionism and the policy shock is allowing trade, so it makes sense to frame the social insurance relative to that: you shouldn’t be worse off relative to introducing trade. But with AI it’s not obvious. It seems like the policy shock would be to ban AI; AI would happen without the shock.
So why not say, “If you banned AI, there should be social insurance to help the people who would have been better off if we’d allowed AI to go full throttle”? How do you think about what the status quo is here?

Ioana: I don’t know that the status quo is necessarily the correct reference point; that’s something you could debate. My point is rather that a lot of these technologies have a lot of ramifications, and there’s a decision we make about how we want to control it. Collectively, we’ve made the decision that we’re not going to try to control it too much in its effect on the labor market. And therefore we need to deal with the consequences of helping people who might get hurt, at least in the short run, even though, hopefully, in the long run it will be great for everybody, including them. So we can reassure people that it’s going to be fine. That’s part of the goal of the policy.

AI Adjustment Insurance: Moral Hazard and Three Components [07:42 – 14:45]

[00:07:42] Andrey: How do you know how much to insure? Full insurance would be very expensive, but it would also create a lot of moral hazard; we do want people making choices in anticipation of AI. If everyone just puts their head in the sand and pretends it’s not happening...

Seth: “The automation insurance is too good. I want to get automated, give me that automation insurance.”

Ioana: So the insurance is going to be incomplete. And I’ll talk in a moment about what I’m proposing, which is modeled on trade adjustment assistance. But it’s also important to know that some of the moral hazard issues in unemployment insurance, which is something I’ve studied a lot, are much less economically important during times of high unemployment. If what happens is huge amounts of unemployment (not necessarily forever, but a lot of people needing to change jobs), then typically there are too many people looking for jobs relative to the number of vacancies.
In that case, the fact that some people might put in less effort to find a job, sending fewer applications, is actually fine, because collectively they’re sending a lot of applications. They’re shooting themselves in the foot by competing too aggressively for the limited number of jobs.

[00:09:25] Ioana: So even if more generous unemployment insurance seems like it’s disincentivizing people from looking hard for a job, the end effect doesn’t necessarily reduce the number of jobs found, because there are too many unemployed people relative to jobs. My work has shown that in prior situations like COVID. So in a situation like that, we should be far less worried about the disincentive effects of more generous unemployment benefits. And maybe now I’ll come to the policy, which I call AI Adjustment Insurance. It includes more generous unemployment benefits, meaning they last longer (again modeled on trade adjustment assistance); additional training; and a third component, wage insurance. Wage insurance means that if you find a job with a lower wage than your prior job, the policy covers part of that gap. That actually encourages you to take a new job even if the wage is a little lower, so it’s directly pro-reallocation. The training and wage insurance push reallocation, which counteracts the concern that longer unemployment benefits might discourage job search.

Seth: Maybe one of your sources is Pascual Restrepo’s job-market paper, this idea that during an automation-driven unemployment wave you get a skill mismatch. Lots of people are applying, but they have the wrong skills, and that increases friction in the market. From an efficiency standpoint, it might not be the worst thing if some of the people with out-of-date skills aren’t looking for jobs.

Ioana: Exactly right. That would increase the friction, and this policy, if well-implemented, has the ability to manage that friction better.

[00:11:52] Seth: There are many reasons a person might lose their job.
They could be losing it because they’re doing a bad job, or because of macroeconomic things that have nothing to do with AI, or it could literally be AI. Recently Coinbase claimed to fire a bunch of people because of AI, and a cynic might say their stock price was going down and crypto was struggling. Do you care to identify that? Is that an important part of the policy?

Ioana: Of course you need to identify it, and there are going to be inclusion and exclusion errors; it’s not foolproof. Some people who should be eligible won’t be deemed eligible, and vice versa. But I believe we can put in place a process with reasonable accuracy. That was the case with trade adjustment assistance: the company had to certify that the job was lost due to trade. You can question it, but there was a process. In the case of AI, similarly (and I haven’t fully thought this through; if someone wants to do it, let’s do it). People in these offices know a lot about how to do this. An example: if you’re in an occupation that’s highly exposed, and there’s recently been investment in AI at the firm, such as buying new software that uses AI to do business services, that might plausibly amount to a layoff due to AI. It won’t be 100% accurate. You could do it at a micro level, trying to figure out whether this particular job got automated, or you could do a macro counterfactual simulation: in a different universe there would have been 30,000 more taxi-driver jobs, so you attribute some percentage of that to your loss. But you can’t do that if you need to decide right now whether this person gets the service. That’s interesting from a research perspective, but operationally we’d have to determine eligibility, maybe just by being in an exposed occupation, though that might be too crude.

Did Trade Adjustment Assistance Work? [14:45 – 19:28]

[00:14:45] Seth: Before we move to your other policy idea: this trade adjustment policy that was supposed to get big societal buy-in for free trade was a glowing success, right? Everybody loves free trade...

Ioana: The policy didn’t do well because it was underfunded. The amount of funding is a strict cap decided by Congress, so they just couldn’t spend more. The number of people who received it at all is very small relative to the number exposed to trade. However, for those who did receive it (and remember, I’d argue it wasn’t enough), it worked well. There’s a really cool paper looking at the wage-insurance dimension. So: I lost my job due to trade, I take a new job that pays less, and the wage insurance covers part of that gap for two years. What happened, which is fascinating, is that this component helped people return to work quicker, and it was fiscally net beneficial from the government’s point of view, because they returned to jobs that were no lower-paid than they’d otherwise have taken, started paying payroll taxes again, and essentially repaid their own subsidy over time. So at least based on that experience, it’s a highly effective way to support people through the transition. While the scale of TAA was too small, for those who got it, they benefited a good bit, and it was effective even from a fiscal perspective.

Seth: If TAA was more generous, would Hillary Clinton be president?

Ioana: Who knows?

Andrey: But I’m going to push back on the argument. Earlier you were making the case that with unemployment insurance, moral hazard isn’t an issue because in general equilibrium there aren’t enough jobs. But if we expanded the size of TAA, the general-equilibrium effects could also swamp the benefits: maybe only a small share of those people could effectively have found jobs, and if you gave the benefits to many more of them, they wouldn’t be able to take advantage.

Seth: You mean because the fiscal cost would become significant?

Andrey: No, they wouldn’t be able to find the jobs.

Seth: Right. If a small number of people get wage insurance and there are some jobs they can take, they take them. But if you give wage insurance to everyone, many wouldn’t be able to find jobs, or they’d cannibalize jobs from people who would have gotten them anyway. Say I worked at a factory, and now I decide to become a ski instructor, and you give me wage insurance for that. There have to be equilibrium effects.

Ioana: This is close to my heart, because I’m a big skier. For sure this increases competition for jobs wherever people decide to go. But from a micro perspective, wage insurance helps because people are now willing to expand to jobs that pay a little less. And mind you, it’s only for two years, so you need some commitment that the job is reasonable, and by that time you can increase your wage through returns to experience. It’s similar to research on job-search assistance: you put people on benefits and help them (or even require them) to apply to more jobs. What the research shows is that if you do that for a lot of people within a given labor market, it stops being effective, because they’re trampling on each other’s toes.

Will the Jobs Come Back? The Lump-of-Labor Fallacy [19:28 – 22:09]

[00:19:28] Andrey: Is there an underlying assumption in your proposal that the occupations people move into won’t go away within those two years? This is one of the big challenges: we have enormous uncertainty about exactly which labor markets are going to be affected negatively, and maybe some positively. What do we do about that?

Ioana: I don’t think there’s any guarantee that in two years the places they go will be safe. But if we still have the policy and AI continues to provoke churn, they still have it to rely on and can find another job.
Also (and this is less true for economists), a lot of non-economists imagine there’s a fixed number of jobs, so if we lose a lot, there are only so few left and everyone’s fighting over them. That’s not how it works. Everything is connected, and especially if the technology improves production, there are positive spillover effects that make other jobs more productive. So there will also be a lot of job creation. At the same time, we don’t know exactly what those jobs will be, and it could take several rounds of adjustment. I don’t want to rule out that it could be very bad, but I don’t think a massive net loss in the number of jobs is the most likely scenario. It could be very bad in the sense that a lot of people have to change jobs, which is difficult. But as someone who’s prudent and wants to be flexible, that’s what my second policy is designed to address. What if a lot of jobs are lost forever? Then we need something to fall back on.

The Digital Dividend [22:09 – 26:00]

[00:22:09] Andrey: Do you want to tell us about that one?

Ioana: The second policy addresses a situation where there’s durable mass unemployment: not just people needing to find a different job while other sectors grow, but a lot of jobs disappearing forever, not replaced by new ones, so some people are structurally, permanently unemployed. What do we do with them? This is especially important in the US, where most of the social safety net depends on you either having a job or looking for one. Even food stamps: as a so-called able-bodied adult, you can’t get food stamps unless you have a job or are looking. If there are just no jobs for a large number of people, there’s not much to fall back on. So that’s what policy B, the Digital Dividend, is meant to address. The idea is a cash transfer that’s unconditional and broad-based. In the specific proposal I give it to everyone, but you could make it broad-based so a lot of people benefit. You might fund it with a tax on the digital sector.
I don’t want to tax AI specifically, but all sectors that can immediately benefit from it: a broad-based tax, so it’s harder to avoid.

Seth: A profit tax? A consumption tax? An income tax?

Ioana: I was thinking a sales tax, just to make it easier, but this is something we can talk about.

Andrey: No sales tax on GPUs, or...?

Ioana: Just a sales tax on all digital companies. We can talk about other options; this is a beginning. The point is it would be very small, and the broader you make it, the smaller it can be while still creating revenue. You’d invest it in a fund, and the returns come back to people as cash. Why this structure? The tax side can slightly slow the diffusion of the technology, and there are theory papers showing that if there are labor-market frictions and credit constraints, reallocation is painful for workers, so it can be optimal to slow diffusion a little. We’re not banning anything. And the revenue lets you pay people the cash benefit if we end up in the no-jobs situation. This policy can be expanded; that’s the flexibility. You can start very small, almost zero tax, but if we have mass unemployment you scale it up and grow the base to the whole economy.

Can We Afford UBI? [26:00 – 34:23]

[00:26:00] Andrey: All of us have had conversations with technologists who jump straight to UBI as the solution to all issues with AI, and this is one version of it. What I always tell them is that we can’t afford it, and we deeply can’t afford it. There’s some future where productivity gains are so large that the numbers pencil out, but I’ve yet to see a tractable, plausible version. If we did it today, the amount per person would be trivial. And for UBI to truly work (as something that lets people not work), it needs to be a massive transfer. It can’t be even a thousand dollars a month.

Ioana: I hear you. But where the technologists are consistent with themselves is that they often think AI is going to revolutionize productivity.
If that’s true, then it will be possible to have a reasonably high UBI. And that’s the whole point of my proposal: it’s conditional, we scale it up as needed. I could even envision it not being completely universal, but it should be broad-based, so people have an income to fall back on if technology is deleting a lot of jobs.

Andrey: Let’s say a plausible scenario that a lot of economists believe: AI increases per-capita GDP growth by about 0.5 percentage points per year relative to baseline. We’re five years down the road, and in the limit it’ll be big enough to support everyone, but we’re not there yet, and a lot of people are out of jobs by that point. We might not get the super-productive world until well after we have economic displacement massive enough that wage subsidies won’t work.

Seth: And we need that 0.5 percentage points just to deal with the current deficits; we’re already counting on it. But here’s what’s natural to me: you start today and make plausible projections (what’s a world with another percentage point of GDP growth a year worth, what’s a world with two?) and you set aside a fraction of that in advance for a UBI or digital dividend. You make the policy now, rather than after the crazy thing happens. It has that automatic logic I really like.

Ioana: That’s exactly the point. I’m a little concerned that after a lot of people lose their jobs and the situation looks grim, it might be more difficult to say, “Now let’s have a big reshuffling of money to help these people,” especially when some people have made a lot of money. Whereas if we can commit more ex ante to putting money aside, it’s a bit of a veil-of-ignorance situation: we don’t know for sure who the winners and losers will be. So it can be socially easier to agree to put a parachute in place now, before you know whether you’re a winner or a loser.
It’s a political-economy argument, but I think it’s important, because I really worry we get there without enough to support people, and the winners say, “Ah, too bad.”

[00:30:47] Seth: My questions are more on the tax side than the spending side. We’ve seen many efforts to tax digital goods, and they’ve had a lot of problems. Where was a Netflix video watched? Where was an ad viewed? People use VPNs; these companies have no physical locations and move easily. How convinced are you that we could actually raise significant revenue from a digital tax when my VPN is in the Cayman Islands, so I’m not paying a sales tax in America?

Ioana: That definitely needs to be worked out. With tax policy you always have to think about incidence and evasion (not necessarily illegal evasion, just ways around it). I haven’t done a detailed implementation calculation, but it’s probably feasible to find a version that works. You won’t eliminate evasion (that’s always true with taxes), but you want to think ahead of the incentives. That’s why we’re economists. It’s not like you slap on a tax and the money comes in; there are behavioral adjustments you need to foresee with a coherent design. In principle, it should be possible to raise a good chunk of money if we wanted to.

Seth: Related to that: it’s not crazy to think all the rents go to energy producers or even landowners of energy resources. And if we’re taxing them, we might disincentivize energy production, which raises the cost of living. Land-value tax solves all problems forever, of course; I do like land-value taxation.

Ioana: In the long run you want to tax the inelastic input, and land is the ultimate inelastic input; that’s something to think about in the long run. But the reason a digital tax can make sense in the short run is this idea of a small, moderate slowdown (not massive) that has the benefit of accumulating some capital to help people with later.
As I said in my piece, I’d definitely expand the tax base at some later point, once we think the transition has happened.

The Motte-and-Bailey Worry [34:23 – 36:05]

[00:34:23] Seth: Are you worried about a motte-and-bailey situation? Your proposal is very modest, but I could see politicians using that logic to implement a massive tax-and-transfer scheme today under the pretense that it’s about the AI future, and I really worry we can’t afford it.

Ioana: That gets into the social welfare function (you could call it politics, or simply what we want), and as economists, it’s not really our job; as citizens we can have views. Different politicians legitimately have different ideas about what’s important. Within my piece, I was proposing the digital dividend as a solution to AI unemployment, but there are other reasons to like UBI. I’ve written about UBI before and discussed some of them. So maybe you also like UBI for those reasons; that’s a tenable view, and it’s also fine to disagree. There could be a push to go big right now, and if that’s ultimately what people want and they convince the rest of us, that’s just how the democratic process works.

Intelligence Saturation: The Model [36:05 – 42:23]

[00:36:05] Seth: Maybe this is a good time to transition into the new macroeconomic model you came out with, which is informing your beliefs about how radical the changes might be: “(Artificial) Intelligence Saturation and the Future of Work.” I love a title with parentheses in it. Just to lay it out for listeners: it’s a very neoclassical way of thinking about automation, but with new twists: a nested constant-elasticity-of-substitution model with an intelligence sector and a physical sector. Why is it important to think about an intelligence sector and a physical sector as complementary, rather than one thing?

Ioana: It’s important to distinguish them because of an empirical fact (it’s not yet in the paper, but I’ll add it in the next version).
If you look across occupations at which were most teleworked during COVID versus their exposure to AI — the Eloundou et al. measure of LLM exposure — there’s a very strong correlation. The more an occupation was teleworked during COVID, the higher its exposure to AI, and conversely. By “physical” I mean an in-person job, where you need a physical human body. That doesn’t necessarily mean working with your hands — teaching in person is physical in my definition but not manual. It’s just an in-person job. Seth: I was curious about those examples, because you gave the example of an in-person lawyer giving oral arguments — but didn’t we do those online during COVID? Ioana: Some of it we did online, but for the economics, the important thing is how substitutable these things are. Online teaching is its own thing — it has a place and a function, but it’s not the same as in-person teaching. They’re differentiated products; you can’t easily replace one with the other. So based on the fact that AI exposure and the ability to do remote work are highly correlated, that justifies the distinction between physical and intelligence. There are also fundamental limitations of the physical world that are much more stringent than the limitations you meet scaling the virtual world. It’s much more difficult to expand human bodies and physical capital — that’s very slow — whereas you can scale up AI incredibly fast. It’s still not costless — data centers and so on — but you can do a lot, really quickly. This distinction is critical to understanding how AI could affect the economy. [00:40:14] Seth: So you’ve got an intelligence sector growing really fast and a physical sector that maybe doesn’t grow as fast. Let’s roll with the assumption that the two are gross complements — you need both to have a lot of output; you can’t just have the peanut butter or the jelly. What are the conclusions of the model? 
Ioana: The core conclusion is that if physical and intelligence are complements, then AI can grow the intelligence side incredibly — to infinite intelligence — but as long as the physical sector stays fixed (or grows much slower), the impact of growing AI saturates on both output and wages. By saturating, I mean output goes to a finite limit, a ceiling. Even with infinite intelligence and infinite AI, output is strictly bounded. Wages increase too, because they go together with output, but they hit a ceiling. That’s what we call intelligence saturation. This is super important because a lot of technologists see the progress of AI and imagine the whole economy could expand at a similar rate. This makes the strong point that here’s a scenario I think is quite plausible: you could expand like crazy in AI and still only hit a ceiling in output. Robots vs. LLMs [42:23 – 48:52] [00:42:23] Andrey: I understand the thought experiment, but saying intelligence goes to infinity while the physical is a constraint is a little strange — it’s pretty clear that with enough intelligence we’ll figure out how to make robots work, even self-replicating, learning systems. Seth: There are two different parameters in the model, right? You’ve got the output of the intelligence sector, and then the automatability of the physical sector. So Andrey’s intuition is: if we had a gazillion intelligence, don’t we fully automate the physical sector? Ioana: I want to distinguish two things. One is whether humans can be replaced with robots. Robots are improving, but relatively slowly compared to AI. So comparatively, it’s much more advantageous to replace people on the AI-replaceable side than in applications where you need to be there in person. That’s not to say there’s no progress in robotics. Andrey: I’m just representing the technologist viewpoint — that this is true for now. 
Ioana: The argument I’m going to make is based on history — the history of technology specifically, and you can think history won’t be the same. We’ve had physical robots for the longest time. With the Industrial Revolution we created many more and improved them a lot, and even before LLMs they improved tremendously with machine learning and semi-autonomous systems. These are very real improvements, and they have replaced some jobs in manufacturing — Daron Acemoglu has papers on that. But it wasn’t like, “wow,” thanks to all that intelligence in the system, it still takes a lot to get there. I’m totally willing to think this new technology can make them even better — but I’m skeptical it gets a lot, lot, lot better. It’s the saturation argument: there’s a fundamental limit. The cost of robots is fundamental — you have to use materials to make them and maintain them. We’ve tried for centuries to improve robots, and they have improved, but I don’t know how much more you can improve them with this new technology. Andrey: A humanoid robot that’s smarter than a human seems like a pretty big improvement that’s plausible. Seth: 15% chance, according to a recent survey of economists and AI researchers. By 2030. Ioana: The question is partially the cost — not that it’s technically impossible, but what’s the cost of the whole thing relative to a human. At least in the medium run, I think it’s not very plausible you’ll bring that cost down a great deal. Also, a lot of these robots aren’t very versatile, unlike AI. That’s the cool thing with AI — it’s super general-purpose; it can use all the tools we had before. But most industrial robots are bespoke, meant for a particular application — that’s how you make them cheap and effective. I feel somewhat confident that in the short-to-medium run it will be very difficult to make it cost-effective to have robots replace people in most jobs. Harder to tell the further out you go. 
However, I believe everything you could do on a computer could get automated in the medium term — possibly even the research I’m doing. That’s a different matter, because it’s all based on computers. Andrey: Even if what I’m saying is true, the Baumol-style logic would still hold, right? If we still want human teachers even when robots are available... Seth: Maybe don’t call it the physical sector — call it the humanness sector. Ioana: You could relabel it. I think “physical” is relevant at least in the medium run, but to make it more future-proof, you could relabel the physical sector as the non-automatable sector, whatever that turns out to be. As long as there exists a non-automatable sector, that’s where people will work, and the mechanics of the model are identical. The Capital-Share Puzzle [48:52 – 50:35] [00:48:52] Seth: I love playing around with these neoclassical models — automatable part, non-automatable part. I’ve been doing it for a decade, and one challenge that pushes historically in a different direction: if intelligence and physical stuff are gross complements, you’d expect that as you get more intelligent stuff, its share of national income would go down. But over the last 50 years we’ve seen a huge explosion in education in the US, and yet the educated share of income has been going up. So how do I think about it actually looking more like gross substitutes? Ioana: But we’ve also had the share of capital going up. Part of the last 30 years or so is the ICT revolution, which is somewhat similar to the prior version — really you could say it’s the same thing, just different stages of an AI revolution that’s at an early stage. During that, the share of capital has been going up, and research suggests— Seth: Is it the share of capital, or the profit share? That’s an important question we’ll come to in a minute. 
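Editorial aside: the saturation argument in the model discussion above can be illustrated numerically. The sketch below is not the Marinescu and Kording model itself, which is described in the episode as a nested CES. It assumes a single two-input CES production function, and the parameter names and values (`share`, `rho`) are hypothetical, chosen only for illustration. With `rho < 0` the two inputs are gross complements, so output approaches a finite ceiling as the intelligence input grows while the physical input stays fixed. Under competitive factor pricing (also an assumption here), the income share of the intelligence input falls toward zero, which is the prediction Seth contrasts with the data in the capital-share exchange.

```python
# Illustrative two-input CES economy (an assumption, not the paper's nested model).
# Y = (share * I**rho + (1 - share) * P**rho) ** (1 / rho)
# rho < 0 means the elasticity of substitution 1 / (1 - rho) is below 1,
# i.e. intelligence (I) and physical (P) inputs are gross complements.


def ces_output(intelligence, physical, share=0.5, rho=-1.0):
    """Output of the assumed CES production function."""
    inner = share * intelligence**rho + (1 - share) * physical**rho
    return inner ** (1 / rho)


def ces_ceiling(physical, share=0.5, rho=-1.0):
    """Limit of output as intelligence grows without bound (valid for rho < 0)."""
    return (1 - share) ** (1 / rho) * physical


def intelligence_income_share(intelligence, physical, share=0.5, rho=-1.0):
    """Income share of the intelligence input if factors earn their marginal products."""
    intelligence_term = share * intelligence**rho
    physical_term = (1 - share) * physical**rho
    return intelligence_term / (intelligence_term + physical_term)


if __name__ == "__main__":
    physical = 1.0  # physical sector held fixed
    for intelligence in (1.0, 10.0, 100.0, 1e6):
        y = ces_output(intelligence, physical)
        s = intelligence_income_share(intelligence, physical)
        print(f"I={intelligence:>9.0f}  output={y:.6f}  intelligence share={s:.6f}")
    print("ceiling:", ces_ceiling(physical))
```

With these assumed values the ceiling is 2.0 for one unit of the physical input. Raising the physical input, or relabeling it as the non-automatable sector, moves the ceiling but does not remove it.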
Inside the DOJ Antitrust Division [50:35 – 58:12] [00:50:35] Andrey: You did this stint at the DOJ — you were on leave from being a professor. Could you tell us what led you to work there, and a bit about your work? Ioana: At the time I was doing work on monopsony power in the labor market — the difference between wages and the marginal productivity of labor. The Biden administration commissioned a report on labor-market monopsony, and the people doing that called me up; that’s how I learned about the job. I thought, “That sounds really interesting — to do antitrust enforcement as an economist.” I said yes, and I was lucky enough to get the job. [00:51:29] Seth: [Sponsor break] For those of you playing along at home, now is your chance to think about how this conversation has changed your priors. This chance to contemplate your posteriors is sponsored by Revelio Labs — a leading provider of labor-economics data and data services for companies, academics, and independent researchers. Revelio combines comprehensive micro-level data on employee profiles, job postings, and sentiment with standardizations, mappings, and enrichments, all to make the data useful without making your modeling decisions for you. It can be aggregated to company, market, or industry, and used to study everything from career trajectories to occupational transformation to the impact of AI on labor demand. Revelio data is available on WRDS — so if you’re an academic with a good library, go see if you have access already. And if not, reach out to their excellent economics team. [00:52:43] Ioana: It was an incredible experience. What I did there is — I was the principal economist— Andrey: Can you pause and define monopsony for our listeners? Ioana: Monopsony power is the idea that employers are able to pay workers less than their marginal productivity. 
Under perfect competition, the wage exactly equals the marginal productivity of labor — whatever value the worker brings, the company pays them for it. With monopsony power, workers get paid less than what they bring to the company. One of the key reasons is the lack of competition among employers. Intuitively, in the extreme — a literal monopsony, only one employer — that employer doesn’t need to pay much to keep you. Whereas if there are many employers, they bid up the price of labor by competing for you, and in the competitive extreme you get paid your marginal product, because if someone underpays you, a neighboring employer recruits you away. Andrey: Great — so you can continue. Ioana: So when I was at the DOJ, I was the principal economist. In the antitrust division, the job is to enforce the antitrust laws, and they do two broad things. One is examining mergers and potentially blocking them if they lead to anti-competitive effects. The other is so-called conduct, which can include criminal conduct like literal collusion— Seth: Assassinating rival CEOs. Ioana: A situation kind of like that — which is unbelievable. You think it only happens in the movies, but it happens in real life. Economists don’t get too involved in those cases, because it’s more of a whodunit. Andrey: No fun. Let us in — we want to be detectives. Ioana: If you want to look this up, the keyword is transmigrante — a trade in used cars from the US being traded toward Latin America. There was unbelievable murder among the companies involved, around collusion. If you were a traitor to the scheme... yes. Anyway, that’s the monopsony power. Andrey: That’s one way to get monopsony power. Ioana: This is where the FBI goes — not really the province of economists. The other category is a bunch of behaviors by firms that hinder competition, the big one being monopolization — trying to remain or become a monopoly by kneecapping rivals. 
In my role I oversaw the expert analysis group, a team of about 50 PhDs, mostly economists and data scientists. Whenever we had a case against a company, there’d be data gathering and economic analysis to support arguments about why a behavior hinders competition and might, for example, increase prices — or, in a labor-market case, how employers hinder their employees’ ability to find another job, and so can pay them less. For someone who worked on monopsony, being able to think about how mergers should be blocked if they lead to greater monopsony power was incredibly rewarding. Intuitively, if two employers merge, that reduces competition for workers and can lower wages or degrade non-wage dimensions of jobs. That’s now officially in the merger guidelines — which was unbelievable. How often do you do research and then get to implement the thing? Favorite Cases, and Antitrust Meets AI [58:12 – 1:09:27] [00:58:12] Andrey: Is there a monopsony case you worked on that you’re particularly excited about? Ioana: There were a number. They’re described in papers we write every year reporting on finished cases, in the Review of Industrial Organization. One — I only caught the tail end — was a publisher merger. Big publishers were trying to merge, and the argument was that if they merged, authors trying to sell their books would get lower payments. The judge found it very convincing. We even had Stephen King testify about how the merger would reduce what he could get paid. Andrey: He couldn’t afford all the cocaine he needed. Ioana: Ultimately the merger was blocked, and they decided not to appeal. That was the first merger in the US blocked exclusively on a labor theory of harm — that it would lead to lower payment for authors. The other is an interesting case around small farmers who raise chickens. They work for a processor as subcontractors — small farmers, not workers, but worker-like. They raise chicken and sell it to a big integrator. 
There was a contractual term that if you left to work for a different company, you’d have to pay a big chunk of cash to leave. We argued this significantly restricted competition for these farmers’ services and lowered their pay, because it’s hard to leave if you have to pay to do so. We won in the sense that there was a settlement — the company said, “Fine, we’re not doing it anymore.” I did a lot of work on agriculture, because farming is often an area with few opportunities to sell labor or goods, so monopsony is prevalent there. This tells listeners that monopsony — whether there’s competition among buyers — isn’t just about workers. Workers are a big application, but it can also be a more B2B situation, where many small businesses or independent contractors sell to big buyers with market power. [01:01:30] Andrey: Shifting slightly — something people are beginning to think about is antitrust and AI. Have you thought about that? Do you have opinions? Ioana: It’s really important to stay vigilant in AI and antitrust. We’ve had prior tech giants the government has gone after — Microsoft, Google — and in these industries there can be an opportunity to monopolize. We’re not there right now, but that’s why we have watchdogs like the Antitrust Division. There’s been a lot of partnership and financing deals, which might ultimately lead to consolidation, and that should be watched in the ordinary course of antitrust enforcement. Why does this matter? We want to maintain low prices and high-quality services for consumers and businesses — and a big part of AI is used by businesses. If you want this technology to lead to greater productivity through adoption, you want to keep it cheap and good. What usually happens with consolidation is the product gets worse and prices get higher than they’d be in a more competitive industry. 
It’s natural for companies to try to monopolize — that’s why we have the Antitrust Division and the FTC, and also so that companies considering certain steps recognize some might not be lawful and stay away from them, preserving competition. Seth: That’s generally the argument, but sometimes we have natural monopolies. Some argue these big foundation-model builders — OpenAI, Anthropic — pouring giant amounts into training runs might be natural monopolies. Maybe we just want one company doing the one giant training run, and the right way to regulate isn’t competition policy but tax policy or some other government control. What do you think? Ioana: It could be, but it’s not clear yet, because there are still many foundation models, including outside the US — the data is out there. You need data and power to train, but you can do it multiple times if you have the resources. There’s also a difference: in the US there’s a big focus on the biggest, fastest foundation models, but in places like China it’s much more focused on applications, and there you see a lot of competition. Some company might want to have it all — “Why don’t I have all the applications?” — and that’s why we have the antitrust authority to stop that. In some situations a utilities-type regulation can make sense, but I think we’re not there yet. For now I’d take the position that we should promote competition, and we’ll see where the dust settles. If you prematurely favor monopoly, that can actually hinder the development of the technology. So I’ll err on the side of competition. Andrey: One interesting thing here: if you’re the person who thinks we should slow things down, then you should be rooting for more market power. Leopold famously argued the technology is so powerful we’re going to have to nationalize it — which again goes to an argument for more market power rather than less. I’m not saying I support this, just throwing it out as a slightly unusual difference from most industries. 
Seth: Energy might be similar — or nuclear power would be a different analogy. Ioana: The thing is, you can still use a model from elsewhere — whereas with energy there are huge costs of transmission lines, so it’s more limited geographically. And let me make a point related to my saturation paper that’s highly relevant. Assume there are decreasing returns — you add more intelligence, it’s helpful, but less and less helpful at the margin. If that’s the case, then the whole competition between countries is less— Seth: Then America loses, because China is better at physical stuff. Ioana: The point is rather that being first is not that important if you have diminishing returns. It’s important, but less so, because if you’re second, you’re just a little bit worse and can still do a lot of things almost as well. Whereas if it’s “the singularity, boom,” and then you’re far ahead of everyone, being there first really matters. So whether you think you’ll reach a point of explosion versus diminishing returns completely changes how you think about competition between countries — and even between different models. Andrey: There’s a subtle point: you could have diminishing returns, but the nature of military conflict is a contest — you just need the max. So it’s very different from the economy. Ioana: I feel less of an expert on military. I was talking more about economic might — if there are diminishing returns, it’s nice to be first, but it’s only— Seth: Let’s talk about economic might, because your argument is even stronger than the one you’re making. In a universe where American innovations in AI spill over into Chinese innovation — which they do, with distillation and publicly written papers — if China has the advantage in the physical, we’d want a world less constrained by the physical. We’d want slower progress. Ioana: That gets to things we discuss in our paper — how much substitutability there is between physical and intelligence from the workers’ point of view. 
The paper is about what happens to workers and equilibrium wages during automation versus after. During automation — assume you’re automating all intelligence tasks — low substitutability is a form of insurance for workers; it avoids some of the worst wage outcomes, especially at high levels of automation. But after, once we all work in the physical sector, more substitutability promotes higher wages and growth in the very long run. So the game is different depending on whether you’re in the short run, during automation, or after. Seth: In the very long run, you want an AK economy. Ioana: Exactly. In the long run it’s good, but in the short run it could be better not to have it, in terms of avoiding a wage decline. Lightning Round [1:10:59 – 1:19:45] [01:10:59] Seth: Lightning round. What’s the meaning of life? In your discussion of a Betsey Stevenson paper at a recent NBER session, you said that after automation takes all our jobs, something called ikigai will be more important. What is that? Ioana: It’s the idea of having a sense of the inherent meaning of your everyday activities. This is something Betsey Stevenson proposed, and I was commenting on it and thinking about examples from philosophy. Ikigai is a Japanese concept, but we have other examples in Western philosophy — the French writer Camus, and the myth of Sisyphus. You imagine Sisyphus pushing a boulder up the hill; it rolls down, and he pushes it up again. It seems pointless. However, the myth says you have to imagine Sisyphus happy: he finds satisfaction in the repetition and transcends the fact that it’s repetitive, making sense of his life by becoming absorbed in it and seeing his freedom in embracing it. It’s a very inspiring way of thinking, because in the current regime we’re so obsessed — especially in economics — with making more stuff, versus paying attention to what we already have. 
Seth: But then you immediately economist-brain it, because you have this amazing quote: “If there are no market-like mechanisms to encourage people to pursue their ikigai, just as wages incentivize people to work, a world without transformative AI”— sorry, a world with transformative AI but without work could undermine wellbeing. So how do we incentivize ikigai? Ioana: That’s something I want to work on — so if anybody’s listening and wants to embark on this quest, I’m all for it. Maybe you’ve read about the epidemic of loneliness. Why aren’t people getting out and doing social activities? As economists, we think in terms of cost — there must be friction costs, coordination costs. The question is how you engineer a world where those costs are lower and people actually get out, meet friends, do their gardening or their rock-pushing, instead of sitting there contemplating how they’re doing. Andrey: You climb the rocks rather than push them. Seth: We’ve got a rock climber in the room. The social planner will assign you the rock, and you will experience ikigai pushing the rock. Ioana: No, no — it’s going to be a market-like mechanism. [01:14:13] Seth: Do you want to say anything about your work on the Anthropic Economic Advisory Board? Ioana: I’m a member of Anthropic’s economic advisory board, advising them on the economic impact of AI. As you probably know, Anthropic releases data products publicly that measure how Claude is being used. Part of my role is to give feedback on what data would be helpful to release, what checks to do, how to show people what the data means and what its representativeness looks like. It’s been exciting to collaborate with one of the biggest AI companies and play this advisory role. Andrey: And we’ve covered the Economic Index on this podcast — we had an entire episode about it. Ioana: Oh, really? Nice. Seth: What are you working on these days? What can we expect next? 
Ioana: Right now I’m working on a project, back to monopsony — monopsony and industrial policy. Industrial policy is fashionable right now. If you subsidize a sector — the government pays to create more jobs there — one effect is that it raises wages in the other sector that competes for those workers. You increase employment here, and wages increase there. We want to demonstrate under what conditions you get a bigger or smaller spillover, and therefore why industrial policy can sometimes be justified through this argument — you’re paying to get more competition for jobs. Seth: Although there would be a negative spillover from the taxes or regulation needed to support it. Ioana: Absolutely. You put some cost in for the industrial policy, and one benefit is increasing wages in the non-subsidized sector. It’s also a way to redistribute between wages and profits — you decrease profits and increase wages in the non-subsidized sector. And profits are pretty hard to tax, so it’s an interesting instrument. We’re developing the theory — how much monopsony power yields what optimal size of subsidized sector — and thinking about applications. The big-picture point is that the public has often lost confidence in the government’s ability to redistribute effectively through tax-and-transfer. So if we can provide more jobs while also increasing wages in the other sector, that could be an interesting policy instrument — one that comes with its own costs, but worth understanding better. Andrey: Final question: who’s your favorite philosopher? Seth: And we love the way you pronounce Camus, so say it with that beautiful accent. Ioana: Who’s my favorite philosopher? That’s surprisingly difficult. I might go with Rawls — John Rawls. Seth: Beloved of liberals. Ioana: He’s done incredible work, because he incorporated considerations from utilitarianism. 
Whether you agree with his take or not, he clarified a lot about the different theoretical frameworks for thinking about social justice. When I was growing up intellectually, it was very helpful. Actually, I learned about utilitarianism first — I read Mill. Oh, I love John Stuart Mill. Maybe he’s my favorite, actually — because the writing is amazing, and he has such a nuanced view of the world. His book on utilitarianism is amazing. They speak very nicely to each other — so maybe I have to say John Stuart Mill, which is fitting for an economist. Andrey: Tyler Cowen thinks he’s the best economist ever. So you’re in good company. [01:19:45] Seth: It’s been an absolute pleasure to have you on the podcast — I had so much fun. Ioana: Thanks so much. It was great. Andrey: This was awesome. Thank you. Seth: All right, everyone out there — please like, share, subscribe, and keep your posteriors justified. Get full access to Justified Posteriors at empiricrafting.substack.com/subscribe [https://empiricrafting.substack.com/subscribe?utm_medium=podcast&utm_campaign=CTA_4]

Yesterday - 1 h 19 min

Kevin Bryan on Bottlenecks, AI in China, and What Economists Should Actually Be Working On

This week we talk with Kevin Bryan [https://www.kevinbryanecon.com/about.html], Associate Professor of Strategy at the University of Toronto’s Rotman School [https://www.rotman.utoronto.ca/the-rotman-experience/our-community/people/bryan-kevin/], author of the legendary economics blog A Fine Theorem [https://afinetheorem.wordpress.com/], co-founder of the ed-tech startup All Day TA [https://alldayta.com], and the man behind one of the most-discussed Twitter/X feeds in econ, @Afinetheorem [https://x.com/afinetheorem]. Kevin recently published a multi-book review of the economics of AI in the Journal of Economic Literature, and that’s where we start. Along the way we get into the gap between AI’s technical capability and its actual diffusion, the stages of how organizations adopt new technology, why the binding constraint on AI value is organizational integration (not prediction vs. judgment), what an AI-for-science research agenda should look like, the coffee test and the fence-post test, what forecasting surveys reveal about how economists and lab researchers actually differ, a dispatch from Kevin’s recent trip to China (spoiler: they are not AGI-pilled), the future of the academic paper, and a lightning round on comparative advantage in the age of AI. A wide-ranging, opinionated, very fun conversation. Grab your Chinese peptides and settle in.

Links & References

Kevin’s work

* Kevin Bryan, “The Economic Impacts of Artificial Intelligence: A Multidisciplinary, Multi-book Review” [https://www.kevinbryanecon.com/BryanAIBookReview.pdf], Journal of Economic Literature, 64(1), 2026.
* A Fine Theorem [https://afinetheorem.wordpress.com/]: Kevin’s research blog
* All Day TA [https://alldayta.com]: turn course content into a custom AI teaching assistant
* Creative Destruction Lab [https://creativedestructionlab.com/]: the accelerator Kevin helps run (first AI accelerator in the world, 2016)

Books & essays discussed

* Leopold Aschenbrenner, Situational Awareness [https://situational-awareness.ai/]: the essay Kevin gives all his students (“read chapter one, believe chapter one”)
* Erik Brynjolfsson & Andrew McAfee, The Second Machine Age [https://wwnorton.com/books/the-second-machine-age]
* Ajay Agrawal, Joshua Gans & Avi Goldfarb, Prediction Machines [https://www.predictionmachines.ai/] and the follow-up Power and Prediction [https://www.predictionmachines.ai/power-and-prediction]
* Joel Mokyr, The Gifts of Athena [https://press.princeton.edu/books/paperback/9780691120133/the-gifts-of-athena] and A Culture of Growth [https://press.princeton.edu/books/hardcover/9780691168883/a-culture-of-growth]: Kevin’s PhD advisor, “the Michael Jordan of progress world”

People & projects mentioned

* The Unjournal [https://www.unjournal.org/] and Works in Progress [https://worksinprogress.co/]: models for the “new journal”
* Chad Jones, Stanford GSB [https://web.stanford.edu/~chadj/]: growth theorist read seriously by people in industry
* Phil Trammell, GPI / Oxford [https://philiptrammell.com/]: “Phil World,” the rapid-growth scenario
* The coffee test (attributed to Steve Wozniak) and Kevin’s own fence-post test as benchmarks for embodied AGI

Previously on Justified Posteriors

* Avi Goldfarb: Prediction Machines, O-Ring Tasks, and How AI is Reshaping Economics [https://empiricrafting.substack.com/p/avi-goldfarb-on-prediction-machines]
* Alex Imas: Demand Collapse, Bargaining with Machines, and Behavioral AI Economics [https://empiricrafting.substack.com/p/alex-imas-demand-collapse-bargaining]

Our sponsor

* This episode is brought to you by Revelio Labs [https://www.reveliolabs.com/], the leading provider of labor-economics data, available to academics on WRDS [https://wrds-www.wharton.upenn.edu/].

Chapters

* (00:00) Intro & sponsor
* (00:39) The JEL book review: what the economics-of-AI canon got right, and what the older books still beat the new ones on
* (03:19) Prediction vs. judgment, and the real bottleneck: organizational integration
* (05:52) Too pessimistic on the tech, too optimistic on diffusion: Waymo, Pearl Street, and the COVID vaccine
* (12:34) The four stages of how organizations actually adopt a new technology
* (15:42) Status-quo bias, banning Anthropic, and treating frontier AI like nuclear material
* (20:16) Why Situational Awareness beat the economists, and the book Kevin actually wants: AI for science
* (26:53) Forecasting AI: the surveys, and where economists and lab researchers do (and don’t) diverge
* (28:20) Benchmarks, the coffee test, and the fence-post test
* (35:53) Rapid-growth scenarios, labor-force participation, and “Phil World”
* (41:40) Scaling regularities: what economists should defer to technologists on, and what they shouldn’t
* (43:34) Why forecasts matter for policy and capital allocation
* (45:50) Dispatch from China: not AGI-pilled, “involution,” broken capital markets, EVs and self-driving
* (1:01:40) War, nationalization, the end of open source, and why everyone in China uses Claude
* (1:06:06) A Fine Theorem, the economics of blogging, and the rising value of taste
* (1:17:48) The economist as plumber: comparative advantage, RCTs, and what grad students should do
* (1:24:07) What the academic paper looks like in two years
* (1:28:22) San Francisco, ambition, and the permission structure for growth
* (1:32:56) Lightning round: favorite economists, All Day TA, and advice for econ grad students

Open & Intro [00:00 - 00:39]

[00:00:12] Seth: Welcome to the Justified Posteriors Podcast, the podcast that updates beliefs about the economics of AI and technology.
I’m Seth Benzell, finally able to meet one of my theoretical heroes, coming to you from Chapman University in sunny Southern California. Andrey: And I’m Andrey Fradkin, coming to you from San Francisco. Excited to have Kevin Bryan as our guest today. Kevin, welcome. Kevin: Thanks for having me. Very excited. Andrey: Kevin is a leading thinker in the field of progress, and in AI economics. He also has his own startup, All Day TA, and is prolific on Twitter — at times. Kevin: At times. The JEL Book Review: What the AI-Econ Canon Got Right [00:39 - 03:19] Andrey: Kevin, you wrote an article reviewing several prominent books on AI. Why did you do this, and what did you learn from the exercise? [00:01:13] Kevin: It’s pretty interesting. Economics of AI is not that new of a field — some of the canonical books on how economics thinks about AI go back to before large language models existed. Books like The Second Machine Age by Brynjolfsson and McAfee, and Prediction Machines by Agrawal, Gans, and Goldfarb. These are pre-LLM — written before the attention paper. So it’s interesting to look at what of the core ideas in the economics of AI have changed given the technological improvements. On the technology side, I don’t think there have been massive surprises for people who were paying attention. At least since the scaling law paper, if you’d drawn the line on the graph, you’d have more or less predicted everything that happened. I remember reading Kurzweil — The Age of Intelligent Machines, The Age of Spiritual Machines — back in college, and those are just drawing different lines on the graph, in that case based on compute, and we’re getting very close to what actually happened. Likewise on the economic side: given that the technological trajectory hasn’t changed much, I don’t think the underlying economics has changed as much as people might think. 
Where things might be bottlenecked, how technology improvements map into growth, the effects on labor markets — the fundamental microeconomics of AI’s predictions hold up pretty well. I found it interesting how few of the 2023, 2024, 2025 books had really advanced my understanding of the economics of AI compared to the older ones. Prediction vs. Judgment, and the Real Bottleneck [03:19 - 05:52] [00:03:19] Seth: Lots to unpack. We just had Avi Goldfarb on the podcast and pressed him on his Prediction Machines approach, where he distinguishes the AI that’s good at predicting from the human that’s good at judging. If any of these books would have changed after gen AI, it’d be that one. Don’t you think that book maybe gets something wrong? Kevin: I think they’d agree — they wrote a follow-up in Power and Prediction. But the disagreement isn’t about the prediction-versus-judgment distinction. Even in the original book — and I remember talking to them about this in 2016, 2017 — judgment is a sliding scale. Take the umbrella example: I know my utility function on an umbrella, I know how much I dislike rain. I give the AI data, it looks at my face, sees light rain, heavy rain, and it can predict my utility function — in which case judgment is taken over by AI. Everyone understands that. That said, on the scale of how easy it is to figure out the underlying utility function from data versus the predictions that go into it, I don’t think that’s changed. None of the major language models technologically can — or even attempt to — modify how they operate for me versus you. They store a little memory and RAG their way into remembering what you’re like, but there’s no attempt to fine-tune the model. We’d like to use continual learning, but we can’t yet. So the judgment aspect is still pretty binding even today. 
Where I think there’s a difference, and where Ajay, Avi, and Josh would say they were wrong, is that the fundamental problem for AI’s creation of value isn’t prediction versus judgment. It’s the organizational integration problem. There’s overlap between the two, but we’d take the organizational and architectural bottlenecks more seriously now, partly because we’re applying AI to more complex tasks where those bottlenecks start to bite. Too Pessimistic on Tech, Too Optimistic on Diffusion [05:52 - 12:34] [00:05:52] Seth: You point this out with The Second Machine Age: Andy and Eric’s world-historical automated car ride. Andrey: It’s weird to think that in some ways they’re a little too pessimistic about the technology, but a little too optimistic about social diffusion. The driverless cars going down the highway in California are a perfect example. Kevin: Such a good example. We all talk to different audiences. When I talk to policy people, I tell them: “Whatever you think the capabilities of AI will be in the future, more than that.” This isn’t a sales pitch. Every single person inside the lab agrees. You have people high up in government who think about AI as the AI of today plus epsilon. And you want to ask: what did you see in the past 10 years that makes you think this is a good way to plan for the future? [00:07:01] On the other hand, out in California they wildly underrate diffusion friction. I give the Waymo example: if diffusion is so easy, how come we rode in a Waymo 10 years ago? I’m in Toronto, Geoff Hinton’s city, and there’s not a single one. Clearly there’s some friction. I remember a couple of years ago, Andrey was with us at one of the labs with a few other economists. They brought in a bunch of computer scientists and asked, “What’s the effect on GDP productivity in the short run?” And we said, “Through 2030, maybe 1% per year.” Actually we said less. And to be fair, from 2024 to 2026, we’ve been right so far.
They said, “But why?” And we said, “We agree with you technologically.” At the time we saw technology that hadn’t come out yet that everyone now thinks is amazing. But on the production side you’ve got bottlenecks — you’re combining complements in some CES or Cobb-Douglas function, and you don’t need many bottlenecks for growth to stall quickly. Then there are social diffusion factors, regulatory factors, organizational architecture factors, like in Kim Clark and Rebecca Henderson’s work. Add all these up across every technology ever, and I just don’t see fast takeoff. Honestly, I think it’s bad. I think we’ll be able to make personalized medicine very cheaply much more quickly than regulators will allow you to sell it. That’s a problem. Andrey: The standard retort is one of two things. One: it’ll be so self-evidently good that people will find a way to take it — like a cure for cancer. There are already sub-treatments where rich people take them without FDA approval and claim they work. Two: we have autonomous zones where we let AI do whatever it wants, and they out-produce the rest of society. Kevin: Great arguments both. Out here in San Francisco you’ve probably got a bunch of Chinese peptides in your fridge. [00:10:20] But here’s the thing — the Chinese peptides are self-evidently really good at the single most costly part of the medical system. And yet can you legally buy them? Anywhere? You can’t even legally buy them in China. That self-evidence did not change the regulatory process. We just went through COVID. We had the vaccine in January 2020. Everything from then until it diffused was government. What more important thing to diffuse quickly could there be? Waymos — you sit in one for one minute and it’s obvious it’s driving more safely. Andrey: People start crying. It’s so beautiful. Kevin: And yet. We saw this historically too. If you want self-evidently useful: Pearl Street, 1882, Edison flips the switch — let there be light. 
And yet look at the diffusion rate for electric street lighting, especially after Chicago burns down. The Four Stages of Organizational Adoption [12:34 - 15:42] [00:11:49] Andrey: Let me retort. Diffusion of technology has accelerated over time. The smartphone diffused extraordinarily quickly. Aspects of AI have diffused even quicker — by standard adoption measures, almost everyone has used AI at least once. Seth: And how much do regulations matter for the diffusion rate? If you ask my students, regulation against using AI hasn’t slowed their adoption at all. [00:12:34] Kevin: We have an answer to this — it’s actually easy. This is why I like starting with a little organizational economics. Every technology I’m aware of, ever: first we adopt it when individuals can do existing tasks with the new thing more efficiently. That’s easy — that’s your student cheating on their exam, the coder using it for coding, me brainstorming with GPT to prep for a meeting. The next step is a group or an organization doing an existing task using the new technology. That’s tougher. To give you a sense — how many organizations have changed their IT procurement policies given the change in how we make software? Find one. The third stage: a new task that’s now efficient given the new technology, inside my organization. I haven’t run into a single large incumbent organization that has reached this level for anything important. The fourth is the hard one: a new task that’s only efficient because of the new technology, and that requires something on the outside — partners in the supply chain, regulators, someone to change. That’s Waymo. That’s containerized shipping. That’s UPC codes. The fundamental barrier there isn’t information or firm growth rates. It’s that the institutions are built around existing skills, promotion policies, and so on. 
If AI made the optimal university 50% capital and 50% labor, bringing people in and out instead of having a tenure system, what year do you think we get that? You can maybe out-compete the incumbents, but I don’t think Harvard’s reputation goes away that quickly. And if we’re talking about governments, you can’t even out-compete them. Status-Quo Bias, Banning Anthropic, AI as Nuclear Material [15:42 - 20:16] [00:15:03] Seth: Sometimes you can out-compete governments; it depends how crazy a takeoff we’re talking about. I talk to Phil Trammell about scenarios where the world is too decadent and we don’t save enough, so we don’t get growth from AI. His comeback is always: then one country will accumulate and overwhelm all the others eventually. Andrey: There’s also a timeline over which universities get out-competed, though maybe not Harvard. Harvard’s a luxury good, and luxury goods have different economics. But if the ROI to college falls drastically, I don’t see college education remaining anything other than a luxury or niche thing. Kevin: We see this in the X-inefficiency papers, or the steel mini-mill papers: quicker organizational change in response to existential threats. It’s almost worse when the organization has rents to share, because then who wants to be the manager who’s the jackass firing people? My favorite example: Blockbuster could have bought Netflix for tens of millions. If they had, you’d have never heard of Netflix. I don’t think the transition to streaming happens if retail-location experts are running it, and the relational contracts with the studios have to change wildly. Someone would have out-competed them eventually, but it would have taken longer. Something like self-driving cars: let’s make a bet. Of the top 500 cities in the world in 2035, how many differentially regulate self-driving cars on safety in a substantial way compared to human drivers? Andrey: All of them. Kevin: All of them, of course.
Outright ban self-driving cars — I wouldn’t be surprised if that’s double digits. Andrey: Boston almost did it, as far as I can tell. Very close. Although it’s one of those things — kind of like Uber, which entered as a banned entity and got so much consumer goodwill that politicians had to allow it. I’m not sure that happens with self-driving by 2035, but it’s not obvious when it tips. [00:18:21] Kevin: It’s also not obvious we don’t get differential regulation — say, regulation that makes self-driving cars subsidize the insurance rates of traditional cars. The world is status-quo biased. Institutions exist because they won the Darwinian struggle to survive, so they’re well-fit for the environment they operate in, which makes them inherently conservative. That’s not crazy — but at a time of big disruption like AI, you have to take it seriously. How many high-up people in Silicon Valley thought the US government would ban Anthropic? I agree it’d be insane to do. Nonetheless, I’m not surprised. If you think any government is going to allow open sales of AI at the frontier in two years, you’re deluded — they’re going to treat it like nuclear material. If you don’t believe that, your vision of the world is way too technological and not nearly organizational enough. Why Situational Awareness Beat the Economists; AI for Science [20:16 - 26:00] [00:20:16] Seth: Let’s wrap up the JEL article. In some ways you review Situational Awareness in a positive light compared to what the economists wrote. But it’s a narrative essay, not an economics book. What’s the economics book you want to read, and why are economists stuck? Kevin: Getting Situational Awareness into the book review took a little persuading — for one, it’s not a book. But if an economist asked me for the one book chapter that best explains what’s happened in the last few years, I’d say chapter one of Situational Awareness. I give it to all my students. 
What I want to read, both as a book and as research, is the endogenous impact of AI on science, including via robotics and via self-improvement. That’s the whole game. Seth: You point out The Second Machine Age misses this. It says “imagine a billion researchers,” and they imagine Africa getting the internet, but they don’t actually model it. Kevin: Exactly. Think how many papers go: “Here’s 2025 AI. I run an experiment where I tell you to do something that takes 10 minutes of work with AI, and I measure the treatment effect.” Who cares? Nobody’s reading that paper in five years. What people will care about is: are we getting self-automated science? Is blue-collar work being affected? If AI can do most of the research on the next AI... I always ask high-up people in the labs: what year do you think a Chinchilla-law-level result (in terms of its importance to developing the next model) comes from AI? The answers are between 2027 and 2029. I’ve never gotten 2030. That’s not all research done by AI, but it’s a substantial speed-up beyond just writing the code faster. Andrey: We already see that in math: it’s proving things humans weren’t proving. Kevin: If something like Navier-Stokes is proven, I’d put that in the set of Chinchilla-law-level results done autonomously. And if it can do that, presumably it can do research on sensors, actuators, batteries, and then the robots improve more quickly, and we get automated labs. That’s the takeoff question. Everything I’ve said doesn’t actually imply a takeoff. It depends what the bottlenecks are. Measuring those bottlenecks, meaning the production function for specific areas of science and robotics, is incredibly high value for knowing where to allocate resources. Andrey: And if we correctly predict them, maybe they won’t exist. It’s a feedback loop. Seth: Ooh, I love this. Kevin: Think of a production function: it tells you how much capital and labor to use to maximize production. I want to know what tools.
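A rough numerical sketch of the bottleneck point may help here. It assumes a two-input CES production function in which the inputs are complements; the input names, the equal shares, and the substitution parameter `rho = -1` are illustrative assumptions, not figures from the conversation. The idea is that scaling one input (say, AI-automatable research) by 10x, 100x, or 1000x raises output only up to a ceiling set by the input that does not scale.

```python
def ces_output(inputs, shares, rho):
    """CES aggregate Y = (sum_i a_i * x_i**rho) ** (1 / rho).

    rho < 0 means the inputs are complements (elasticity of
    substitution 1 / (1 - rho) is below 1), so a scarce input
    acts as a bottleneck.
    """
    total = sum(a * x ** rho for a, x in zip(shares, inputs))
    return total ** (1.0 / rho)


# Illustrative assumptions: two inputs with equal shares, rho = -1.
shares = [0.5, 0.5]
rho = -1.0

# Baseline: both inputs at 1.
baseline = ces_output([1.0, 1.0], shares, rho)

# Scale only the first input; hold the bottleneck input fixed at 1.
multipliers = [1, 10, 100, 1000]
scaled = [ces_output([float(k), 1.0], shares, rho) for k in multipliers]

# Limit of output as the first input grows without bound (valid for rho < 0).
bottleneck_level = 1.0
ceiling = shares[1] ** (1.0 / rho) * bottleneck_level

for k, y in zip(multipliers, scaled):
    print(f"first input x{k}: output = {y:.3f} (ceiling {ceiling:.3f})")
```

Under these assumed parameters, output approaches but never exceeds the ceiling, no matter how far the first input is scaled. Which real inputs play the bottleneck role, and how strong the complementarity is, is the empirical question raised above.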
Seth: Kevin, I just did it — it turns out it’s energy. I looked into the future, it’s energy. Are you saying the best book about AI economics is just a book about energy? [00:24:48] Kevin: It’s plausible. I help run Creative Destruction Lab — we were the first AI accelerator in the world in 2016, and we also run, I think, the biggest space accelerator. I was just down in Texas with astronauts for the Artemis launch. When you hear Elon talk about AI and space, it’s on the one hand crazy, on the other hand basically unregulated, effectively unlimited energy — and for training, who cares about latency? It’s not totally crazy that one way we get around the energy bottleneck is solar sails and ideas like that. In which case we face other bottlenecks. But this is an empirical question, and one where you’d want energy economists and energy experts, not just labor economists. Forecasting AI: Surveys, Economists vs. Labs [26:00 - 28:20] Andrey: One thing where I feel very stupid: about six months ago people around here kept saying “energy shortage, energy shortage,” and I thought they were probably right but didn’t trade on it. You’re also involved in a project — we discussed it a bit with Avi Goldfarb — figuring out what economists are forecasting about the future of the economy under different scenarios. Tell us about it. [00:27:06] Kevin: There are actually two projects — one I’ve been involved with, one I’m an academic advisor on. They both also ask AI-lab researchers, superforecasters, and the general public. The most interesting thing, as far as I’m concerned: on technical projections, there’s really no gap between the economists and the people inside the labs. And on economic projections the gap is also pretty small. If you go from Acemoglu to Dario Amodei in our sample, Acemoglu is like the 1st percentile and Dario’s like the 99th — and neither is really representative of economists or AI researchers. 
It’s important to put these projections on paper and see how we did. Some surveys are now old enough to check. The projections of everyone — economists and non-economists — on frontier math were low. We were thought to be crazy with some of these projections, and we still underestimated the rate of improvement on certain benchmarks. People say “it’s a benchmark, they trained to it.” The problem: I wrote benchmarks for one of the big labs. You know how hard it is to write a benchmark the AIs can’t solve? I did some in March. I’m running out of questions I can ask them. Seth: They know how many R’s there are in strawberry now. Benchmarks: The Coffee Test & the Fence-Post Test [28:20 - 35:53] [00:28:43] Kevin: I have some tricks, but who knows how long they’ll last. Honestly you need benchmarks that look like the coffee test — or my favorite, the fence-post test. The fence-post test is mine: I can buy a general-purpose embodied AI that I can tell on a Saturday morning, when I want to sleep in, “Go to my backyard and dig that fence post.” Not a specific machine — a general one. Every human could in principle do it. I think we’re quite a ways from AI doing it at cost. The coffee test — I think this comes from Wozniak — is that an embodied AI walks into three random houses it’s never seen, finds the ingredients and the mug, and makes a cup of coffee. Well within the capability of any normal person. Whoever came up with it said the year it’s possible is “never.” We’ve done surveys — the modal answer from researchers now is the early 2030s. I think that’s the kind of benchmark you need, because anything on paper or on a computer — what Shane Legg calls minimal AGI — the goose is cooked. We can’t write tests I’m confident AI won’t pass in that domain. Seth: Andrey just wrote a test the AI was very bad at. Andrey: It’s really bad at predicting how many tokens it’ll use for a given task and whether it can actually do it. It’s poorly calibrated. 
Kevin: That’s a well-known one, included in some benchmarks on AI’s ability to self-reflect. But it’s in the set of things where if I think about it a bit, I don’t know any reason I can’t hill-climb to answering it — ergo it’ll get solved. Andrey: To be clear, our paper’s call wasn’t “we need these for economic activity, so please RL on them.” Kevin: That’s essentially what you need, though. Anything obvious you can hill-climb on is cooked. Anything non-obvious but complementary to you hill-climbing on it is cooked. I need something outside that set. Seth: But it has to be at the intersection of hill-climbability and being economically valuable to hill-climb. Or do you think we’ll saturate everything even if it’s not valuable? [00:31:46] Kevin: I don’t think there’s a difference. Once I use AI in adversarial or competitive settings, making a mistake 0.1% of the time screws you. Edge cases are really bad in adversarial settings, and lots of economic activity has that flavor. Seth: There’s no such thing as an economically unimportant question. Kevin: Right — if you give me the economically unimportant question, I’ll design the economic interaction to screw you on it. Before LLMs we had GANs — you could put a sticker on a stop sign that fools any model but looks identical to a human. There are good statistical reasons we’ll never fully solve that. My favorite one AI has trouble with: they took an outline map of Europe, filled in part of the Bay of Biscay as if it were land, put an arrow on it, and asked “what’s here?” If you know your geography you say, “that’s the Bay of Biscay, oddly colored like land.” It’s just a weird thing for the training data to see. Economically it’s not per se valuable — most maps you see are the real map. But if I were using AI in a financial system, I’d be super concerned about my inability to solve that. Seth: It’s very important to be able to draw pictures of wine filled all the way to the brim. 
Kevin: Especially for these evening podcasts, Seth. [00:34:14] Andrey: Back to the forecast — one question about the composition of people. I’m a participant in your surveys. I wonder if the economists are all our friends and not the skeptics, like Acemoglu. Seth: We’re gonna get him on. Kevin: It’s not just our friends. The selection mechanisms differ between the two, but you have to have published something related to AI at some point — and plenty of people who’ve published on AI are quite skeptical. It’s not snowball sociology; the selection mechanism is completely public. Andrey: But there’s self-selection into participating — I do it because I’m very interested in AI economics; some might not. Kevin: For sure, that’s an issue. Forget econ — I was just at a faculty association meeting talking to the humanities people. It’s amazing: the AI is simultaneously destroying society and can’t do anything. Very hard to hold both at once. Seth: A very prestigious combination. Rapid-Growth Scenarios, Labor Force, and “Phil World” [35:53 - 41:40] [00:35:53] Seth: You said there’s not much difference between economists and non-economists on economic predictions, but my recollection is there are substantive differences — like the fast-progress scenario, a percentage point of GDP growth per year difference. That’s sizable. Kevin: That’s where the biggest difference is — the rapid-growth scenario: widespread inexpensive robots that can do basically everything. Call it “Phil World,” since we talked about Phil Trammell — friend of the podcast. In that world in 2050, saying 1% growth per year is a little crazy. It’s hard to write down a model with bottlenecks that strong. Seth: Or there could be dis-saving — people taking their labor out of the economy. We asked about labor-force participation, and even there the gaps were off-trend by five or six points. Not enormous. Kevin: That seems small for that magnitude of change. 
But the description of “rapid AI” was technological capability, not diffusion. One explanation: it’s possible to do this, and we ban robots. Seth: For my prediction I included increased chance of war as something that reduces growth. Kevin: We had a couple of respondents say zero GDP because we’re all turned into goo. We won’t say which of our friends. For the rapid scenario, the 25–75 bounds are stupidly high. But for the AI all three of us would expect, the error bounds aren’t enormous — people were generally on the same page across groups. Most of the difference was within-group until you get to 2050 and rapid AI. [00:39:10] Seth: Give the listeners some numbers for the median scenario. Kevin: The best comparison is something like CBO or IMF projections — on the order of one percentage point more productivity, one point more growth per year. Which adds up to a lot — let’s say it adds up to the single most important invention in human history. On labor-force participation, about half a percentage point more per year in the drop — substantial out to 2030, not quite as big by 2050. Big effects, but... Seth: It’s not the singularity. One percentage point additional growth a year for 20 years is the difference between two high-income countries — not the difference between the Flintstones and the Jetsons. Kevin: I understand the objection: you read Situational Awareness, and from 2001 to 2026 the AI-pilled people were right and everyone else was wrong, so don’t bet against their projections. Fine — on technical grounds, say they were right. My response: I’ll literally take, as my technical projection for 2030, whatever the modal response from researchers inside the labs is. On what grounds would I disagree? But how that maps into labor-force participation — there I wish some of these people would close their mouths. Scaling Regularities & What Economists Should Defer On [41:40 - 43:34] [00:41:14] Seth: Let me ask about a techno-social prediction. 
We have this regularity, the scaling law — which you said should really be called a scaling regularity, because we’re not sure it’s a law of nature. The relationship between error rate and number of parameters seems technical. But then there’s the sociotechnological leap — that scaling leads to scaling capability. Should economists defer to technical experts on that, or is it a socioeconomic prediction we should have an opinion about? Kevin: That one’s in the middle — related to AI for science. The scaling regularities — let’s say four of them — we just take from the computer scientists. But what’s the production function of medicine? How important are improvements in predicting protein structure to making a new drug? That’s not economics in the sense that we don’t have the field expertise, but it’s also not biology and not computer science. We’re in the middle. Seth: In Aschenbrenner there’s a figure: right now it’s high-schooler level, in a year college, then professor. Is that the first kind of prediction or the second? Kevin: That’s the first. I take that from the computer scientists. I want field experts and economists to estimate the production function, and social scientists to work out the implications on other parts of the economy. Why Forecasts Matter for Policy & Capital [43:34 - 45:50] [00:43:34] Andrey: Let me retort. People are interested in forecasts, but I don’t think economists are very good at forecasting. And it’s not clear how useful the whole exercise is. I could build my own custom macro model to answer these surveys — how much value to society would there be? Or is this more an exercise in social consensus, to bring to policymakers and say “here’s the range of expected outcomes,” without caring about the specific forecasts? Kevin: A bit of both. Take chapter three of Aschenbrenner. 
If I believe that forecast, the government should borrow literally everything it can and plow it into chip production, because if your growth rate is 10% a year, who cares? So it matters a lot for policy. On a micro level: I’m an executive at Google deciding whether to put money into AI math solvers or into bio (Anthropic just put Novartis’s CEO on their board). Which improvements lead to value more quickly? And at the organizational level, if I’m a university, I need to know which bottlenecks are in my control and where I can just free-ride and wait. When you talk about China, I’ll tell you something interesting I learned there: they’re not AGI-pilled. I think that’s going to cause problems, but we’ll get to that. [00:45:56] Andrey: The final thing: yes, Anthropic is going into bio, but you don’t need forecasts for that. Just look at the share of GDP in different sectors. Economists are valuable, smart people, but using AI for medicine is the most obvious thing in the world; I don’t need an economist to tell me that. Kevin: The marginal value comes elsewhere: if I spend $10 million figuring out how to allocate $10 billion of capital, that’s really high value. And on policy: listen to how policymakers talk. Bad predictions about the labor market coming out of some labs are going to cause regulation. States are going to ban data centers. We’re going to tax all the compute before we get the cancer drug. I was working on a theory problem this week: I care about the wage bill, and I want AI to be as productive as possible without harming wages. So you take something like Chamley-Judd, add a wage-bill constraint, add informational constraints for the planner about which capital is AI, let it substitute and complement in various ways, and solve. The result on taxation looks nothing like anything being proposed right now. To know that’s the right way to think about it, you can’t just say “AI will be useful in the future.” No, they’re going to ban it.
Andrey: This political economy of AI is something I’m tracking very seriously now. It’s obvious we’ll have bans and regulations long before AI actually has effects. People already think AI is causing mass unemployment. Kevin: They’re immune to the data. “Block laid off 40% of their workforce.” It’s a bad media environment, too. A reasonable hypothesis: the sector most harmed by digitization and then AI is journalists — so young journalists, especially culture journalists, are incredibly hostile to AI, and the world they influence ends up asking “unemployment’s 4.5%, why is everyone talking about this?” There was an article this week about young people who don’t want kids because it’s too expensive — and the first couple they showed were 25, owned a 2,000-square-foot house, and the husband’s hobby was golfing in Utah. It’s a bad epistemic environment, and it’s bad for AI because it makes people hostile to change — they feel they have to protect what they have, even though the economy roared. China Trip: Not AGI-Pilled, Involution, Capital Markets [45:50 - 1:01:40] [00:50:29] Andrey: Let’s get your take on your China trip. What was the occasion? Seth: Is China AGI-pilled? Why or why not? Kevin: We need that one for the clip at the start of the video. I studied diplomacy — my goal when I was younger was to join the Foreign Service. I worked in China briefly at the embassy in ‘05, around WTO session time, and I’m back there quite a bit. After COVID, the number of foreigners in China dropped, so the information flow is bad. A colleague calls it the G2 when it comes to AI: two countries, plus Google’s London outpost. Nothing else really matters for AI. So not knowing what’s going on in China is really important. This year I brought a group — economists, a guy from Epoch AI, a trade lawyer. I wanted to understand robotics, especially in traditional industries. AI’s effect on most of the market won’t come through San Francisco or Hangzhou. 
We met Zhipu’s COO, journalists who work on AI policy, startup founders, cloud providers, the biggest angel fund. We also went to Dongbei, the northeast — the fastest-falling population region in the entire world, losing about 1% a year, maybe 100 million people. We went to the one city that’s hanging on. [00:53:34] First thing: nobody we talked to was AGI-pilled. When you ask what the AI is for, it’s completely about process engineering of existing industry. That’s it. Why open source? Process engineering. Why build your own non-frontier stack? Process engineering. And they actually use it in industry — some examples looked better than what we see in the West. But no one talks like the San Francisco or London DeepMind folks: “in 2029 my robot flies through the air and shoots the robber and delivers my peptides.” It honestly felt like talking to government people — “AI’s capabilities in 2026 plus epsilon.” Part of what’s going on is a word in Chinese they translate as “involution” — I always tell them it’s not a word in English; it actually comes from Clifford Geertz, the anthropologist. It means extreme competition. It’s very hard to make a profit in certain industries — a hundred entrants immediately when you start making money. So high-fixed-cost, payoff-in-the-future investments are really hard. You only see it from things like DeepSeek, where it’s a hedge fund and the guy spends his own money. Even companies that seem to be doing great — the independent AI producers, not the Alibabas and Tencents — are in massive financial trouble, because it’s too competitive. Seth: Part of that’s the interest rate and capital-market environment, right? American AI companies can lose money for a long time — why can’t they access money for more runway? Kevin: China’s biggest advantages are energy costs about half of ours, and a much stronger hardware ecosystem — your ability to experiment and prototype blows away North America’s. 
It’s probably not even worth running a battery or robotics-hardware company here; you’ll get swamped. Seth: Unless you’ve got a government contract. Kevin: True — we should probably build our own drones. But things that require big fixed costs and have long payoffs need deep capital markets that reallocate capital quickly, and China doesn’t have that. The VC market is worse than a decade ago — foreign VC basically left. Most companies get investment from state-linked banks or rich people out of pocket. DeepSeek is trying to raise $20 billion; if they were in San Francisco they’d start at ten times that. [00:57:06] Andrey: Let me play devil’s advocate. So far most of the rewards go to frontier models — you can’t charge enough for non-frontier tokens. So DeepSeek doesn’t make sense unless it’s a government-funded national champion. Kevin: If DeepSeek weren’t in China, with their leadership and computer scientists, they could have attracted the Chinese equivalent of Alec Radford and Ilya Sutskever and been in the race for the frontier. They can’t, because of the capital markets. This isn’t just AI — all sorts of industries face it. They can move quickly when the design already exists, but for “I’m doing something genuinely new,” they’re behind. Self-driving — they’re behind Tesla. Not Waymo, Tesla — even the frontier Chinese car companies. Andrey: That’s crazy to me. I’d have thought they’d have a separate, more generous government lane. Kevin: Look at what Google had to spend to build Waymo — no one else in North America pulled it off, because you needed to lose tens of billions and there wasn’t enough capital. Andrey: In China labor is cheap, so the economics of an autonomous-vehicle service are worse there. But modern neural networks made AVs a lot easier — Google couldn’t really have done it before 2022. 
Kevin: An executive at one of the new Chinese car companies told me that in China, Elon’s strategy is seen as smarter than Waymo’s — they think Waymo’s approach is out of date: LiDAR is cheap now, don’t map the roads. Maybe they’re right and catch up. On the cars themselves they’ve caught up — if their cars were sold in North America they’d take the market. And it’s not the traditional four — the Ford and GM of China are also screwed; the architectural shift to electric was too hard. It’s the new companies that would crush us. But they still haven’t caught up on self-driving. [01:00:11] Seth: Follow-up on capital deployment — bringing Leopold back. He thinks the big frontier labs end up as nationalized projects. China can deploy a lot of capital toward national projects. Do you see this disadvantage reversing if we get one big national lab per country? Kevin: Good question. Hasn’t happened yet. I think they’d have the same problem — China has hippies now. They have words like tang ping, “lie flat” — I’m not joining the rat race. They have guys like the people at Anthropic wearing sandals and reading the Whole Earth Catalog. Those people, in the US and China, aren’t going to work for some state-backed project. You can maybe state-back the energy rollout, but it wouldn’t attract some types of talent. War, Nationalization, the End of Open Source — and Claude [1:01:40 - 1:06:06] Kevin: The part of Situational Awareness that seems like it must happen — I wrote my PhD dissertation on early nuclear. Back then you literally weren’t allowed to publish your patents — state secret. We’re very close to wars where AI plays a major role. At that point, who’s going to let this stuff be independent? The government doesn’t let you sell missiles — they’ll let you sell to partners they approve, and that’s it. Seth: Does that mean the end of open-source models above a certain size? Some sort of IAEA for AI? Turing police monitoring frontier labs under UN auspices? 
Kevin: When people talk about UN regulation of AI — take a foreign-policy class. Neither China nor the US cares one whit what the UN says. There’s going to be an organization called the G2: the US president and the Chinese premier talking to each other. That’s how it’ll work. Open source is interesting — it’s a little bit dying in China. The most well-known researchers at Alibaba quit. A couple of other well-known model makers are going to go bankrupt — it’s not obvious how you make money making open-source LLMs as an independent. I suspect Llama is the last big one Meta makes. Someone will make them — NVIDIA’s pretty clearly going to try, because it’s such an obvious complement. But you can imagine a world where open source becomes much less common. [01:03:32] One interesting thing: talking to people in AI in China — not political people — every single person thinks Claude is the best model, and they all use Claude. Not domestic models. Even though it’s very hard to do that from China, on both the government and the Anthropic side. There’s no opinion that China is catching up on AI. The view is that not only is Claude ahead, but the one place they a bit believe the AGI pill is that inside OpenAI, Anthropic, and DeepMind they’re using these models to speed up product deployment — and China doesn’t have the same access to frontier models, which makes it tough. I’m doing a thing for NBER on what chip bans would do to endogenous innovation in China — how to even model that isn’t obvious. The cynical answer is it’s whatever the marginal cost of buying chips from Kazakhstan is — one more plane flight. Seth: I was reacting to the Jensen interview. We’re half a beard away from you being at that point. Kevin: I should have worn the leather jacket — that’s the look now. Actually, the real move is the T-shirt from our machine-learning accelerator, before we called it AI, back in 2016. That’s the one you flex with. 
A Fine Theorem, Blogging, and the Value of Taste [1:06:06 - 1:17:48] [01:06:06] (For those playing along at home, now’s your chance to think about how this conversation has changed your priors — sponsored by Revelio Labs.) Seth: Revelio Labs is a leading provider of labor-economics data and data services for companies, academics, and independent researchers. They combine comprehensive micro-level data on employee profiles, job postings, and sentiment with standardizations, mappings, and enrichments — flexibly aggregated to company, market, or industry — to study everything from career trajectories to occupational transformation to the impact of AI on labor demand. Their data is available on WRDS, so if you’re an academic with a good library, check whether you already have access. If not, reach out to their economics team. [01:07:21] Seth: One thing you didn’t mention at the top: the reason you’re so close to my heart is your famous blog from the glory days of econ blogging, A Fine Theorem. When I started my PhD in 2012, getting excited about the big questions in economics and how theory can contribute, I found it so inspiring. So much of how you publish in econ now is: find a cute IV for one of a limited list of subjects, or — God willing — J-PAL backs you and you do an RCT. That may be useful, but it’s not what excited me about economics. Your blog was my north star for how technical theory can and should be communicated. So, snaps for how cool that was. Kevin: Hold on — who do you think the Gen X is in this conversation? Seth: Are you an elder millennial? Did I just mess up? Kevin: I thought I looked young for my age — I’ve got the dimples. Seth: As a generational-conflict theorist, the thing that struck me about the Dwarkesh–Jensen interview was Gen X shape-rotator Jensen and millennial wordcel Dwarkesh. So it wasn’t surprising you had the leather-jacket option. Kevin: I’ll say the Gen X has excellent taste in music. 
I went to the Oasis reunion concert — probably the youngest person there. It was great. [01:09:42] So, A Fine Theorem. It’s related to AI development, believe it or not — one important way new technologies diffuse is the development of complements. That site started as my PhD notes on the papers I was reading; it was just easier to keep them in a WordPress setup. Some people found it through RSS — that’s how you found things on the internet then. Now people find things through gated social media, group chats, podcasts. It was good timing for me. I was never that interested in running a podcast — someone asked me to do one on the economics of science years ago — writing just matches my background better. It got a bit wild. I’d write about maybe a hundred papers a year, plus Clark Medals and Nobel Prizes. I had a reputation as the guy who reads everything across fields and isn’t shy about his opinions. The three craziest emails I’ve ever gotten: I proposed a reform to the NBA and the president of an NBA team emailed me to talk about it. And two different Nobel laureates read my notes after they won and wrote asking me to read through their Nobel speeches. That’s the coolest thing ever. Seth: “Explain to me why my work was important.” Kevin: It makes sense — you know your work, but not always how people see it or how it influences them. I go to conferences and students will say “I’m extending your paper from 15 years ago this way,” and I’ve completely forgotten about that corollary. They know more about it than I do. Tyler Cowen liking it led a lot of people to read it. At one point it got, I don’t know, a million views — crazy for a microeconomic-theory blog. [01:12:51] Seth: I do think you’re quite good at writing it for an educated reader, not just as a paper. There’s a big latent market for this — previous guest Noah Smith works the same lane. We love Noah, but you can’t compare Noah to Kevin in terms of gravitas and depth. 
Kevin: A lot of academics think their job is research and teaching — writing papers for other academics and maybe policy folks. But now I know who’s reading that stuff. I was writing about epistemic game theory, and serious people read it. My work on progress studies — I teach a class on progress with serious research behind it — there’s huge interest. I was at a conference with Chad Jones, the growth theorist, and there are people in industry reading Chad Jones papers seriously. The world is much more interested in serious work that answers serious questions than academics think. If they understood that, they’d be more careful with their work and would choose different topics — instead of “I’m writing this because journal editor X just got promoted.” The Economist as Plumber: Comparative Advantage & RCTs [1:17:48 - 1:24:07] [01:14:24] Seth: Let me ask about the how and who you write for. One theory behind this podcast: as the marginal cost of writing papers goes down, the marginal product of reading them can go up. Do you see AI increasing the relative importance of digesting and synthesizing research? Kevin: The one-sentence version you hear — which I think is true — is that the marginal value of taste has gone up. Seth: But what’s taste? Kevin: There’s stuff that’s fun to consume — I watch YouTube golf like everyone my age, but I know I’m not learning anything; I should be watching topology videos. Taste is understanding why a thing matters. Show me 20 things written about chemistry and I can tell which is better written, but not which one matters. To have taste — in music, literature, economics, anything — you need a really strong epistemic base. AI can point out “this is a good paper,” but not “this is a good paper in line with your individual interests.” Maybe in a world with continual learning, where your AI is your assistant — but we’re not there. 
Seth: But it was beyond what was interesting to you — somehow it was also inspiring to people like me in grad school. Kevin: Right. I’m illiterate about music — play me some Bach and I barely know the difference. But once in a while a really good critic writes “listen to this part and you’ll hear this,” and suddenly I do hear it and understand why it’s interesting. That person couldn’t have just listened to that one piece or read one book — they need to understand the history of music. People have different areas where they can have taste. Mine is probably the intersection of theory, history, and history of thought — and that mixture isn’t very common. [01:17:48] Seth: Let me pull out something you may have a distaste for — a quote from your review of the Banerjee–Duflo–Kremer prize. “The economist as plumber, famously popularized by Duflo, who rigorously diagnoses small problems and proposes solutions, is a fine job for a World Bank staffer, but a crazy use of the intelligence of our otherwise leading scholars.” React to that in the age of AI, where the market is flooded with “we estimated the productivity impact of AI adopted here on this date” papers. What should those people do instead? Kevin: You’ll be surprised — because I believe in comparative advantage. I literally mean it’s good work for a World Bank economist; people should do that. I just don’t think Banerjee and Duflo should have been doing it. Same way Stantcheva’s taxation work was unbelievable, Clark-Medal-winning — and then she wrote a bunch of papers basically running a survey firm. The papers are interesting, but it’s not her comparative advantage; many people have more expertise in that area, and it’s not that complementary with the rest of her work. Andrey: I’ll disagree. Both survey research and experiments required elite permission to do this type of work. There’s no objective, agreed-upon standard in social science for what we should work on. 
Having an MIT or Harvard economist legitimize it in a top-five journal lets a bunch of other people — for whom it is their comparative advantage — work on it. On the margin maybe they work too much on it. In marketing, where I sit, there was a perception that survey research with stated preferences was something we shouldn’t do — and now if a top economist says it’s okay, maybe we can. Kevin: For sure — same way J-PAL was useful, and they won a Nobel for it, so they were rewarded. I agree on the permission structure. The question is what we do now with AI. In a sense it’s not great for me — being the smart-ass kid who’s really good at algebra is worthless now. I worked on a paper recently I’d been stumped on for years — a proof I couldn’t figure out. GPT-5.4 Pro was also stumped, but in its write-up it gave me a polytope-theory result I hadn’t seen, and I used it to prove the thing. I felt like a dad beating his teenage kid in basketball — super happy, but I know it might be the last time. [01:21:56] If you’re a PhD student now and your specialty is being really good at solving models, you’re just not going to have a job — you’re not as good as the AI. But some things are incredible complements to AI: within-firm field experiments done with much higher ambition than now. Those will be very popular and not susceptible to replacement for a while. Andrey: But didn’t you say we shouldn’t be working on this? Kevin: I said we shouldn’t be doing RCTs — but I believe in comparative advantage, and we’ve changed the price of the factors. If we’re going to do this, what’s a bad idea is doing it atheoretically and ahistorically. Two things you need as a PhD student: your work has to be a complement to what AI can do, and your work has to have taste — you need to know what matters and why. A field experiment estimating a treatment effect no one cares about shows a lack of taste. 
Thinking you’ll get a job solving a model any AI can solve shows a lack of understanding of comparative advantage. High-paced managerial types are going to do better in academia than they used to, and some folks who were high-status will find nobody cares. Andrey: I see how the human advantage is running RCTs versus writing macro models. But what’s the right approach for writing that AI book you want us to write? Kevin: I still think you should write the macro model — your contribution just isn’t solving it. And your empirical paper needs to draw on and understand the macro models you’re building on. You should spend more time reading papers, not less, to develop taste. Andrey: Or you shouldn’t read papers — you should talk to the AI about the papers. Or listen to this podcast. Kevin: You should be listening to Justified Posteriors, brought to you by Jane Street. Andrey: We’re manifesting Jane Street. The Future of the Academic Paper [1:24:07 - 1:28:22] [01:24:19] Kevin: Academic papers are an unbelievably entrenched system, but here’s where I’m trying to go — and I edit an AEA journal, so I talk to them about how we handle AI. In a couple of years, a paper is: all the lab notes, code, and data, open and in a format AI can read — that’s already in progress. Then a paper that ranges from the 40-page version to the five-page version to a “talk to the AI” version. If you go to my website, my papers already have a built-in Gemini Flash interface, because I assume people want to talk about the paper while reading it. It’s not just a PDF. So every paper will have a partially AI-generated hundred-page version with all the information for the AI, the 40-page version, the five-page, the three-page, the interactive version — because the cost of writing the paper is so high relative to the cost of those manipulations. The idea of a paper as a fixed set of words is over. If that’s all it is, everyone’s going to talk to GPT about it anyway — we can do better. 
Andrey: Does that mean writing goes down in importance? Someone like Chad Jones is such a crisp writer — that’s a key reason people read him. Kevin: How AGI-pilled am I? The best academic writer in our profession is not Hemingway — let’s not be deluded; the average writer is terrible. People outside academia may not realize how much editing for readability happens in an academic article: the answer is zero. Maybe one or two sentences you’ll be asked to crisp up. It’s not The New Yorker — there’s no editor rewriting your paper for readability. The only reason people try to write well is that on the margin it raises your acceptance probability. Otherwise they write like a lawyer. Andrey: It’s taste. It’s for themselves. [01:27:05] Kevin: I had an idea — maybe we do it for AI. I wanted an innovation journal; there’s no good one, and innovation is very interdisciplinary. But no one will send a paper to a new journal, for tenure reasons. So how do I free-ride on the system? Create a journal that any already-published paper is eligible for. Have a board of 30 great innovation and AI economists; as soon as three say “if this were my field, top field journal, I’d have taken it,” it’s in the journal. We link to the working-paper version and hire a professional to write a 1,500-word, Quanta-Magazine-style article about why the paper matters. Seth: Have you heard of the Unjournal, an EA project? It has some of these ideas. Kevin: Yeah, the Unjournal’s a good one. Andrey: Works in Progress is doing some of this too — taking academic research and making a great article about it. Kevin: That’s why the innovation-econ world and the progress world have a lot in common — we’re all friends. This year I felt like a progress-world celebrity, because one of my PhD advisors was Joel Mokyr, and they love Mokyr in progress world — he’s like Michael Jordan. Andrey: I tried reading A Culture of Growth and it’s unreadable. I’ll just put it out there. 
Kevin: The Gifts of Athena is the one I recommend — though you have to work through 50 pages of prescriptive-versus-propositional knowledge with lambdas and sigmas. He’s still a better writer than the average economist — low bar. San Francisco, Ambition & the Permission Structure [1:28:22 - 1:32:56] [01:28:46] Kevin: My favorite thing about what’s going on in California — other than the incredible ambition — for folks who aren’t here, there’s all sorts of craziness; they make Seth and his EA beard look normal. Seth: It hosts insects and shrimp that are having a lot of utility. Kevin: You can save a little dinner for later up in the mustache. But the level of ambition — your average 21-year-old asking “what should I do with my life” aims this high. That’s not normal in most places, where the very smart people are type-A, “follow this rule and this process.” In academia we know tons of those people. It’s super refreshing. In progress world, every random person is like, “should I make money, or start a biohacking magazine that four people buy but I like doing?” — biohacking magazine. And I love it. We have a guy in Toronto, Ben Perry, who runs a sort of “Toronto society,” also in progress world — he holds talks on what makes a beautiful city. He asked me to give one related to my progress course, on idiosyncratic factors that lead to progress — a pretty out-there talk. I show up and we’ve sold out a concert hall. People paid 30 bucks a ticket, there was music beforehand, and afterward people are in the hallway chatting about what they’re building. These people are all over. [01:31:05] The remaining secret sauce of Silicon Valley is that everyone — all the way up and down the permission and capital structure — agrees the most ambitious people should have the power and the capital. That’s rare. 
During COVID, my university was closed and it was driving me crazy, so I went to teach in Senegal — the best university in French West Africa, teaching high-growth entrepreneurship. Great students. I asked what they wanted to do when they graduated, and they all wanted to work for the government. “You don’t want to start a company?” “If it doesn’t work out and I go bankrupt, I’m living on the street, and no one gives a 22-year-old money to start a company.” And that’s reasonable. But that societal structure makes growth impossible. That’s the thing you have to get right. Lightning Round [1:32:56 - 1:38:08] [01:32:56] Seth: How are we on time? Want to do a lightning round? And give yourself a chance to talk about All Day TA. Kevin: Let’s do All Day TA as part of the lightning round, so I don’t feel like a sales call. Seth: Lightning round, beginning. Favorite economist, living or dead? Kevin: Dead: Paul Samuelson — awesome work. Living: Bengt Holmström, because when I walk around on the street his ideas are in my head all day. Seth: All Day TA — what did you learn from being an entrepreneur? Kevin: This is my company — we sell ed-tech to universities, a hundred-plus now, all over the world. I learned that for AI diffusion, institutional sales is so hard in traditional industries, and so unrelated to product quality, that the people who already own the gates into big institutions — the Salesforces, the Microsofts — are going to clean up in the AI world. People who think they’ll sell a great product and get around those gates are deluding themselves. Seth: Did you learn a trick for selling to universities? Kevin: Did it a hundred times. The technical stuff matters very little. You have to figure out who has the decision rights — often the head of IT — and they often have some idiosyncratic thing they want. Going through the professor has no power. With any institutional sale, the secret is knowing who can write the check and getting to that person quickly. 
Seth: If you had to burn all of Kremer’s RCT work or his O-ring paper, which would you destroy? Kevin: I love the O-ring paper. But if your experimental papers probably saved a million lives, I have to let you keep those. So we burn the O-ring. Also, we probably could have figured out the O-ring without Kremer. Seth: But would it have been written that beautifully? Kevin: No. It’s such a nice paper. Seth: What advice do you have for folks in economics grad school today? Kevin: You’re five years out: read Situational Awareness chapter one, and believe it. Whatever you think you’re doing in your job-market paper, ask: is that consistent with creating value in the world of Situational Awareness chapter one? If not, literally do anything else. Seth: If you had a choice of joining a lab or going to econ grad school, what should someone choose? Kevin: I don’t think there’s necessarily a conflict. But when my most ambitious 22-year-old students ask what to do, I say: it’s like being a writer in 1920: get on a boat and go to Paris and don’t be stupid. You’re an ambitious 22-year-old: get in a van, drive to San Francisco, and don’t be stupid. Andrey: Seth, any more lightning rounds? Seth: No, I think we’ve covered it. Kevin, this was a completing-the-circle experience for me. Your blog was so inspirational on my economic journey, and getting to talk to you and be treated as an equal was a very special moment. Kevin: It’s nice you got to talk to me before I reached my full senescence, given whatever age you think I am. It’s really ruining my self-image. Seth: I always think it’s so beautiful when millennials can get along with Gen Xers. It’s a special thing. Kevin: You know the irony? The Gen Xer wouldn’t have cared: “who cares, man, don’t worry about it.” Only the millennial complains about being called the wrong generation. Seth: That’s true. Kevin: Thanks a lot, guys. Keep up the good work on the podcast. I’m looking forward to the next guests.
[01:37:53] Seth: And to listeners at home — keep your posteriors justified. Get full access to Justified Posteriors at empiricrafting.substack.com/subscribe [https://empiricrafting.substack.com/subscribe?utm_medium=podcast&utm_campaign=CTA_4]

June 1, 2026 - 1 h 37 min

Seb Krier on AGI, the Coasean Singularity, and EDM

Seb Krier on AGI, Scaffolding, and Coasean Bargaining at Scale

In this episode of Justified Posteriors, we welcome Seb Krier [https://x.com/sebkrier], policy lead for AGI at Google DeepMind and excellent Twitter poster. Speaking in his personal capacity, Seb walks us through his understanding of AGI, why AI alignment has gone better than expected, the potential and limitations of a world where agents constantly barter on our behalf, and, of course, electronic music. We also cover AI in London vs. New York, how Seb went from reading Marginal Revolution for 15 years to becoming a recurring character on it, and Seb’s side-splitting humor on mediocre AI conferences.

Related Links

* Seb Krier on X: @sebkrier [https://x.com/sebkrier]
* Seb’s Substack, Technologik [https://technologik.substack.com/]
* “Coasean Bargaining at Scale” [https://blog.cosmos-institute.org/p/coasean-bargaining-at-scale]: Seb’s essay at the Cosmos Institute (also republished here [https://www.aipolicyperspectives.com/p/coasean-bargaining-at-scale])
* “Musings on Recursive Self-Improvement” [https://technologik.substack.com/p/musings-on-recursive-self-improvement]: Seb’s essay separating model-side RSI from societal-side
* “The Cyborg Era: What AI Means for Jobs” [https://aleximas.substack.com/p/the-cyborg-era-what-ai-means-for]: Seb’s guest essay on Alex Imas’s Substack, defending the scaffolding view
* Anthropic’s Project Deal [https://www.anthropic.com/features/project-deal]: the agent-bargaining experiment among Anthropic employees
* Fradkin & Krishnan, “MarketBench” [https://andreyfradkin.com/assets/marketbench.pdf]: Andrey and Rohit’s experiment with LLMs bidding in procurement auctions, an investigation of the future of AI marketplaces. Companion writeup: Rohit Krishnan, “Agent, Know Thyself! (and bid accordingly)” [https://www.strangeloopcanon.com/p/agent-know-thyself-and-bid-accordingly]
* Edge Esmeralda [https://www.edgeesmeralda.com/]: Devon Zuegel’s pop-up village in Healdsburg, CA
* MATS [https://www.matsprogram.org/]: for junior economists looking to skill up on AI safety/governance
* Cosmos Institute [https://cosmos-institute.org/] and FIRE [https://www.thefire.org/]
* bianjie.systems [https://bianjie.systems/]: the art platform Seb is co-organizing a dinner with in NY (Seb’s announcement [https://x.com/sebkrier/status/2054941198406602861])
* Drexciya [https://en.wikipedia.org/wiki/Drexciya]: James Stinson, Gerald Donald, and the Detroit electro-afrofuturism canon

Timestamps

(00:00) Intro
(01:16) What is AGI?
(07:30) In defense of scaffolding: Hayek, division of labor, and why one giant model won’t do it
(13:00) Markets for cognition: will agents bid in procurement auctions?
(18:40) Recursive self-improvement: separating the model side from the societal side
(24:44) Alignment has gone better than 2017-Seb expected; prefer “intent following”
(31:14) What economists should actually work on to inform AI labs
(33:32) What does a DeepMind policy lead’s day look like?
(38:20) AI Conferences
(41:52) Coasean bargaining at scale: the positive vision
(55:00) Inequality, property rights, and who gets the initial allocation
(01:03:00) The Helldivers 2 “Managed Democracy” dystopia as Coasean bargaining gone wrong
(01:09:00) Sponsor: Revelio Labs
(01:09:30) Lightning round

Justified Posteriors is a reader-supported publication. To receive new posts and support our work, consider becoming a free or paid subscriber. You’re also invited to our Discord community at: https://discord.gg/b8VpPbBUt

Transcript

00:00:00,100 --> 00:00:20,480 [Seth] [upbeat music] Welcome to the Justified Posteriors podcast, the podcast that updates beliefs about the economics of AI and technology.
I’m Seth Benzell, the number two biggest fan, after Tyler Cowen, in the Seb Krier fan club. 00:00:20,480 --> 00:00:20,740 [Andrey] [laughs] 00:00:20,740 --> 00:00:24,660 [Seth] Coming to you from Chapman University in sunny Southern California. 00:00:24,660 --> 00:00:34,120 [Andrey] And I’m Andrey Fradkin, coming to you from San Francisco, California. And Justified Posteriors is sponsored by the fine folks at Revelio Labs. 00:00:35,560 --> 00:00:45,600 [Andrey] We’re very excited to have Seb Krier here with us today. He is the policy lead for AGI at Google DeepMind, and is, 00:00:46,840 --> 00:00:52,400 [Andrey] dare I say, a thought leader in this space. Welcome to the show, Seb. 00:00:52,400 --> 00:00:54,200 [Seb Krier] Thank you very much. It’s great to be here. 00:00:55,380 --> 00:00:58,160 [Seb Krier] Yeah, I’m Seb, calling in from New York. 00:00:58,160 --> 00:01:00,320 [Andrey] And we should remind our listeners that 00:01:01,340 --> 00:01:08,410 [Andrey] Seb is, during this podcast, expressing his personal opinions, and is not speaking on behalf of DeepMind. All right. 00:01:08,410 --> 00:01:09,740 [Seb Krier] Indeed. [laughs] 00:01:09,740 --> 00:01:11,060 [Andrey] [laughs] 00:01:12,780 --> 00:01:13,900 [Andrey] The usual caveat. 00:01:15,260 --> 00:01:16,760 [Andrey] Seb, what is AGI? 00:01:18,080 --> 00:01:19,450 [Seb Krier] What is AGI? [laughs] 00:01:19,450 --> 00:01:19,570 [Andrey] [laughs] 00:01:19,570 --> 00:01:19,580 [Seth] [laughs] 00:01:19,580 --> 00:01:19,780 [Seb Krier] Great question. 00:01:19,780 --> 00:01:21,900 [Andrey] We’re going to start with the big questions. 00:01:21,900 --> 00:01:22,880 [Seb Krier] Yeah, might as well. 00:01:24,259 --> 00:01:54,840 [Seb Krier] [sighs] I think there’s so many definitions out there of what AGI is, and I think most of them are kind of unsatisfactory in one way or another.
I’ve seen many definitions that are indexed on the societal transformations or economic impacts of the technology, which I don’t really like very much because it makes it very dependent on external factors whether or not we have AGI. If it’s banned, we don’t have AGI, and if it’s not banned, we have AGI. Is it? 00:01:54,840 --> 00:01:55,480 [Andrey] [laughs] 00:01:55,480 --> 00:02:04,670 [Seb Krier] And there are other tests, like if an AI makes $1 million or something, which I find is very weird because most humans do not make $1 million in the first place. 00:02:04,670 --> 00:02:05,080 [Andrey] [laughs] 00:02:05,080 --> 00:02:11,359 [Seb Krier] So the one I kind of like is actually Shane Legg’s definition- 00:02:11,360 --> 00:02:11,620 [Andrey] Mm 00:02:11,620 --> 00:02:12,420 [Seb Krier] ... who’s at DeepMind, which is 00:02:13,640 --> 00:02:16,980 [Seb Krier] more of a capability-based definition, which is something along the lines of 00:02:18,420 --> 00:02:20,960 [Seb Krier] an AI or a system that does most 00:02:22,380 --> 00:02:30,360 [Seb Krier] standard cognitive tasks that people typically do. [lips smack] So the bar isn’t too low, and it’s also not too high either. 00:02:32,220 --> 00:02:35,480 [Seb Krier] And so I think he’s got this definition of a minimal AGI, 00:02:36,580 --> 00:02:43,020 [Seb Krier] and I think that we’re not exactly there yet. I would disagree with people saying that we have AGI today because I think 00:02:44,220 --> 00:02:48,900 [Seb Krier] a lot of the systems we have, there’s many things that a human can do that they don’t really do very well. 00:02:48,900 --> 00:02:50,360 [Seth] What’s the biggest gap that we’re missing? 00:02:52,020 --> 00:03:47,740 [Seb Krier] I’d say there’s a few. One of them might be continual learning, or at least the ability to adapt and learn over time, and in different contexts and situations, just kind of update your own world model or whatever.
If I think of a new joiner in a company, they’re not super useful the first day, but their value goes up over time because they learn all sorts of things. And so [lips smack] that might be one of them. A lot of the systems we have today, I think, are not very good at software, and you’re using graphical user interfaces and software and whatnot. If I ask an agent right now to go and use a music production software and make a track, I think they’d generally struggle. That doesn’t mean it’s impossible to solve or anything like that, but I think, in many respects, they’re not as general as you’d want them to be. And then the other bit also is, [lips smack] and of course they still make some silly mistakes here and there, but I think that’s getting it fixed. But the creativity point is one that I’m really interested in as well, in that I think they’re really good at kind of 00:03:48,780 --> 00:04:02,700 [Seb Krier] exploiting maybe an existing paradigm or an existing knowledge and so on, and recombining knowledge and whatnot. But I think really coming up with new concepts and abstractions entirely is something I think humans can do, but I don’t see our current systems really doing either. 00:04:02,700 --> 00:04:10,060 [Andrey] How do you measure whether humans can do creative tasks? One of the things that 00:04:11,200 --> 00:04:15,940 [Andrey] strikes me as a bit of an unfair test in that, 00:04:17,060 --> 00:04:23,290 [Andrey] let’s say you ask an LLM to write a poem or to write a story. It’s very- 00:04:23,290 --> 00:04:23,290 [Seth] [laughs] 00:04:23,290 --> 00:04:32,050 [Andrey] ... times more entertaining than what a random human would write. So, do you have a benchmark for creativity? 00:04:32,050 --> 00:04:35,390 [Seth] This is the meme where the robot asks Will Smith if he can compose an opera. 00:04:35,390 --> 00:05:14,700 [Seb Krier] [laughs] Can you? Yeah, exactly. It depends, and you’re right. 
Obviously, most people aren’t creating new abstractions and concepts on a day-to-day level. But I imagine there’s still something qualitative about that kind of creativity that I think does get applied in everyone’s day-to-day life in various kinds of ways. Maybe they’re not as big or significant as creating a symphony. But I don’t really have a strong test. There’s actually an interesting podcast that had Ben Goertzel and Yoshua, I think a few years ago, where they were saying something like, if you had a model that was trained knowing only classical music and West African drumming, could it come up with jazz in the first place, or recreate jazz? 00:05:16,460 --> 00:05:27,880 [Seb Krier] And I quite like that test. And in principle, I can imagine it being possible. You could kind of decompose all sorts of different kinds of elements and variables here and just get something jazz-like. But it still feels a bit... 00:05:29,580 --> 00:05:40,580 [Seb Krier] It’s not the same as just coming up with the idea of jazz in the first place and saying, oh, I’m going to try these things out. And for whatever reason, I’m going to stick to that. And I don’t know. It’s- 00:05:40,580 --> 00:05:53,190 [Seth] Recombination versus paradigm shifting. I’ve also heard one test people would want for AGI is, can you train the model on the 1900s corpus and it comes up with Einsteinian physics? 00:05:53,190 --> 00:05:53,200 [Seb Krier] Yeah. 00:05:53,200 --> 00:05:54,720 [Seth] That would be really impressive. 00:05:54,720 --> 00:06:36,151 [Seb Krier] Yeah, I think actually Demis uses that test sometimes, or I think Pele Gritzer as well mentioned it before. And there are some people, I think David Duvenaud and Nick Levine, I think, had this recent kind of language model talky that was trained up to, I think, the 1930s or something. And I tried to play around with it a lot. It was like, let’s try to get it to create something new, and it’s pretty tricky. 
Although they have apparently recently, some people kind of fine-tuned it on a very few examples of coding and gotten it to be good at coding. But for some reason, that doesn’t impress me maybe as much as other things I would’ve expected. It’s like [laughs] there’s the-I agree that the goalposts also kind of move a little bit over time, and it’s also maybe unfair of me. It’s like, oh, well, can it create a new programming language from scratch or something? 00:06:37,272 --> 00:06:43,052 [Seb Krier] So it’s a tricky one to kind of square off, but it does still feel like there’s a lack of that kind of true creativity, at least in my 00:06:44,212 --> 00:06:45,072 [Seb Krier] interactions with them. 00:06:46,392 --> 00:06:57,342 [Andrey] I am really worried that it is a goalpost moving exercise here. We don’t have a benchmark for creativity and therefore, 00:06:58,432 --> 00:07:03,211 [Andrey] all these claims are not quantitative in a way that I’d like. And let- 00:07:03,212 --> 00:07:10,612 [Seth] Right. What about all those IS papers we see where one of the axes is creativity and we instrument for something? [laughs] 00:07:10,612 --> 00:07:11,032 [Andrey] Yes. 00:07:13,132 --> 00:07:13,592 [Seth] There’s a lot of bad measures of creativity. 00:07:13,592 --> 00:07:19,762 [Andrey] Those are not creative, to be clear. I’m sure I’ve offended a ton of people. Sorry. 00:07:19,762 --> 00:07:20,992 [Seth] It’s okay. 00:07:20,992 --> 00:07:56,432 [Seb Krier] I think it’s fair. I agree that it’s a bit like... But I still feel like there’s, at least if part of the reason you’re going to create these systems is to come up with kind of also new sorts of theories and so on. And I think you can probably get that through good search and a lot of inference compute and trying out lots of different things. And I think there are many low-hanging fruits there, to be clear. So it’s not like I think, oh, we’ve hit some sort of wall or something. 
And I think there’s a lot that you can kind of get in terms of new knowledge and new creative knowledge from that. But I feel like there’s maybe something more needed. It’s maybe not that kind of magical or anything, right? Maybe you just need better scaffolding or better multi-agent systems. But 00:07:58,992 --> 00:08:02,072 [Seb Krier] yeah, at least so far, I would say that I see a bit more creativity, say, in 00:08:03,652 --> 00:08:11,612 [Seb Krier] humans so far as a collective. And maybe that’s, again, an unfair comparison. You don’t have a culture of AIs and AGIs to compare that against. So- 00:08:11,612 --> 00:08:11,682 [Andrey] Yeah 00:08:11,682 --> 00:08:15,092 [Seb Krier] ... the right comparison is also a hard one to do. 00:08:15,092 --> 00:08:52,772 [Andrey] So, you mentioned scaffolding, and I guess a question, you recently wrote about a defense of scaffolding, and I think just to frame things, some people you talk with, especially very AGI-pilled people, are like, “Scaffolding, it’s an epiphenomenon. It doesn’t matter. In the end, we are going to train a smarter model with more parameters and more training data, and it’s just going to do it out of the box. And so all these scaffolding hacks are just very temporary.” And then other people like yourself, I guess, argue the opposite. So what do you think about scaffolding? 00:08:54,832 --> 00:08:55,052 [Seb Krier] Yeah. 00:08:56,572 --> 00:08:59,372 [Seb Krier] The first thing is I’m definitely not sure. This is kind of 00:09:00,532 --> 00:09:39,672 [Seb Krier] one of many hot takes, but I think, I guess there are a few reasons why I see it as, I think it’s going to stay over time. The first is that I think it’s plausible that as, I think scaling laws continue, I think you scale models and they get better over time and so on, but I think the inputs are expensive and grow over time. And I also think that it’s plausible that you might get more and more diminishing returns over time. 
And if that’s the case, I see the kind of utility of the scaffolding side and the harnesses as going up because you’re going to want to make more, you’ll want more bang for your buck kind of thing. You’re going to want to extract this intelligence and use this resource as efficiently as possible. 00:09:40,772 --> 00:09:51,532 [Seb Krier] So that’s maybe one reason. The other one is a bit more, I guess, Hayekian in nature or something, in that I see a lot of, I think there’s a lot of local knowledge, a lot of 00:09:53,212 --> 00:10:18,592 [Seb Krier] stuff that isn’t necessarily kind of codified. And I don’t really see one big giant AGI model now kind of perfectly guessing everything forever at infinite scales. And in a way, I see this as a little bit like a division of labor in that I think it’s actually more efficient to have this kind of integration layer that is closer to the local information or to the ground or to demand side that can better integrate this kind of cognitive resource 00:10:19,812 --> 00:10:23,632 [Seb Krier] to satisfy and create value and satisfy whatever consumers and businesses want. 00:10:25,552 --> 00:10:31,352 [Seb Krier] So to help with all the sorts of constraints and the context they’re dealing with, I think it’s very useful to have that. 00:10:33,712 --> 00:10:39,112 [Seb Krier] Of course, I don’t think this necessarily also implies or means that you’re going to get complete, full decentralization or something. 00:10:40,772 --> 00:10:42,212 [Seb Krier] Walmart gets huge 00:10:43,872 --> 00:10:48,872 [Seb Krier] returns from the scale that they have, and you don’t have loads of businesses downstream kind of reselling their stuff. 00:10:51,252 --> 00:10:53,932 [Seb Krier] But there’s two things. The first is that- 00:10:53,932 --> 00:10:56,812 [Seth] We have bodegas reselling stuff from Walmart on the corner. 00:10:56,812 --> 00:11:18,992 [Seb Krier] Actually, that’s a good point, yeah. 
And also, there are all sorts of other businesses kind of selling different things, right? If the task is generic and the demand is homogeneous, then sure, maybe you can do more of that. But also, even Walmart relies on all sorts of kind of suppliers, local labor, compliance systems, inventory systems, third parties, and whatnot, that help with this kind of integration and the delivery of these services. 00:11:18,992 --> 00:11:25,862 [Seth] So if I may summarize your answer, you’re very Hayek-pilled, but maybe not as Bitter Lesson-pilled as most. 00:11:25,862 --> 00:11:25,972 [Seb Krier] Well, 00:11:27,212 --> 00:11:31,052 [Seb Krier] I think I’m definitely Bitter Lesson-pilled in the sense that I don’t think you should 00:11:33,652 --> 00:11:48,992 [Seb Krier] try to kind of cement some sort of rules-based system you either devise or something and kind of hope that this just lasts forever. If anything, I think the scaffold needs to be a lot more adaptive and evolve over time. In the same way as if you have a small startup and they have all sorts of kind of rules and, 00:11:50,332 --> 00:12:02,772 [Seb Krier] sorry, not rules, different functions. When the startup grows and gets more capabilities, they also kind of change from the inside. So I think that, of course, if you have some sort of light GPT-type wrapper that kind of makes your system a little bit better, whatever, yeah, that was not going to 00:12:03,812 --> 00:12:23,652 [Seb Krier] work out over time. But I think there are kind of scaffolds that help better integrate the wider environment, private data, deal with permissions or liability regimes or user preferences and whatnot. And also, at a somewhat higher level, kind of more coordination-type scaffolds maybe in terms of market interfaces, like clearing house equivalents or something. 
00:12:24,516 --> 00:12:33,536 [Seth] The third example you gave is maybe it’s not the super frontier model that are going to these scaffolds, but simpler models that are still very useful and cheaper to run with a scaffold. 00:12:33,536 --> 00:12:46,176 [Seb Krier] Yeah, totally. Because I think you’re not going to need the enormous, super expensive brain for every single random task. And so it’ll make, for most kind of basic queries, people aren’t using Opus’s latent space or something as- 00:12:46,176 --> 00:12:46,186 [Seth] [laughing] 00:12:46,186 --> 00:12:48,236 [Seb Krier] ... it’s a big waste in some sense. 00:12:48,236 --> 00:12:50,036 [Seth] What toothbrush should I buy? [chuckles] 00:12:50,036 --> 00:12:51,196 [Seb Krier] Yeah. Exactly. 00:12:51,196 --> 00:12:53,896 [Andrey] Wait. That is an important question, Seth. 00:12:53,896 --> 00:12:54,516 [Seb Krier] I mean- 00:12:54,516 --> 00:12:56,536 [Andrey] I would definitely use Opus for that. 00:12:56,536 --> 00:12:57,385 [Seb Krier] It’s funny because I’ve actually- 00:12:57,385 --> 00:12:59,696 [Seth] Use all the collective intelligence of reality. [chuckles] 00:12:59,696 --> 00:13:02,266 [Seb Krier] I have actually used Opus for that exact question not long ago- 00:13:02,266 --> 00:13:02,626 [Seth] [laughing] 00:13:02,626 --> 00:13:06,256 [Seb Krier] ... in trying out this new electric toothbrush that I found out as a result. But, 00:13:07,636 --> 00:13:22,076 [Seb Krier] so yeah, I agree there’s that and also there’s all sorts of ways in which actually kind of using tools or specialized kind of tools is just more effective and more efficient. Why would you expect a large model or something to kind of calculate things innately or something when you can just access a calculator? It’s a much better use of tokens. 00:13:22,076 --> 00:13:36,856 [Andrey] But it should kind of know that the calculator is available and then use it when it’s there. 
So that’s the argument against scaffolding, or you’re giving it a general environment, but you’re not scaffolding it much. I think a curious thing is just, 00:13:38,376 --> 00:13:40,356 [Andrey] it seems like most people who are using 00:13:41,416 --> 00:13:49,156 [Andrey] scaffolded agents today are using them with essentially one of two scaffolds, with Claude Code or Codex. And 00:13:50,236 --> 00:14:00,475 [Andrey] those seem to be good enough maybe. I guess, do we see a lot of people customizing, a lot of people, whatever, companies customizing their scaffolds? 00:14:00,476 --> 00:14:03,856 [Seth] CladBot, do the CladBots count as that, I guess? 00:14:03,856 --> 00:14:04,236 [Andrey] Yeah. 00:14:05,396 --> 00:14:39,676 [Seb Krier] They are a form of it. I don’t know. I think a lot of power users and people in our immediate communities use a lot of Claude Code and Codex, and particularly software engineers. But I don’t think most legal departments and most kind of firms out there are necessarily using Claude Code either. And it’s not clear to me that this is necessarily the optimal interface or, there may be better systems that are Claude Code-like, or CLI-like perhaps in some way. But, so I don’t know, maybe they’re sufficient, but even these tools end up kind of calling on loads of other external APIs and tools and so on in how they 00:14:40,836 --> 00:14:57,576 [Seb Krier] function. So if anything, these are actually scaffolds. You’re not kind of calling the model directly. There’s all sorts of different sub-agents behind the scenes. It’s not just a one-shot call. There’s quite a lot going on, which is in fact this more, I don’t know, dynamic scaffolding thing I was mentioning earlier, I guess. 00:14:58,976 --> 00:15:06,736 [Andrey] Okay. The natural question here is, what is going to be the role of the market in coordinating- 00:15:06,736 --> 00:15:07,375 [Seb Krier] Mm 00:15:07,375 --> 00:15:11,276 [Andrey] ... AI here? 
And I’ll just very shamelessly plug- 00:15:11,276 --> 00:15:11,285 [Seb Krier] [chuckles] 00:15:11,285 --> 00:15:24,796 [Andrey] ... some recent work with Rohit Krishnan, where we’re kind of playing around with the idea of LLMs bidding in a procurement auction and seeing whether that results in more efficient use of AI. 00:15:26,696 --> 00:15:29,655 [Seb Krier] Well, first of all, I need to properly read that again. But the- 00:15:29,655 --> 00:15:30,476 [Andrey] [laughing] 00:15:30,476 --> 00:15:31,016 [Seb Krier] In terms of, 00:15:32,496 --> 00:15:32,916 [Seb Krier] I guess, 00:15:34,556 --> 00:15:46,396 [Seb Krier] at a very high level, markets are good at just coordinating in general, including AI. And so, assuming they function as intended in it, you’ve got the pricing mechanism to get... 00:15:47,556 --> 00:15:49,396 [Seb Krier] I don’t know. I expect that to kind of work as well with 00:15:50,476 --> 00:15:52,616 [Seb Krier] matching, I guess, supply and demand or something. 00:15:54,016 --> 00:15:55,196 [Seb Krier] The supply of this 00:15:56,216 --> 00:16:00,036 [Seb Krier] raw resource of cognition or something, and the demand of all sorts of different businesses and users. 00:16:01,696 --> 00:16:05,516 [Seb Krier] So maybe, at a very high level, I don’t know. What exactly do you mean by the role of the market or something here? 00:16:09,076 --> 00:16:21,356 [Andrey] Obviously the market is involved in many parts of the AI vertical supply chain, right? From competition in chips. There’s competition between models. There might be also competition between 00:16:22,516 --> 00:16:28,576 [Andrey] scaffolds, bundles of environments, scaffolds, and LLMs. 00:16:28,576 --> 00:17:06,496 [Seth] I guess maybe it would be useful to juxtapose this versus, so what Andrey, one of the things he’s imagining is, I have a job. I post it to some sort of Upwork-like future platform. Different companies that host different AI models bid to do that job. 
“Oh, I think I can do that job with $1 of electricity and tokens,” versus another model, and then we get efficient allocation of intellectual tasks to models, right? So do we think that that’s going to be important, or is it going to be more like I ask the super model what the best model is, and I just get allocated in a non-market way? Might be one version of this question. 00:17:08,156 --> 00:17:18,836 [Seb Krier] I guess intuitively, my mind goes to the former question. But, or there’s a little bit of both in some sense, because even in the former one, you’re going to be using the large model for some sort of 00:17:20,436 --> 00:17:26,686 [Seb Krier] cognitively demanding task or something. It kind of depends what kind of quality of output you also need and want. 00:17:26,686 --> 00:17:26,706 [Seth] [chuckles] 00:17:26,706 --> 00:17:27,056 [Seb Krier] But then 00:17:28,376 --> 00:17:49,636 [Seb Krier] you’re still going to be constrained by your own resources or something, and depending on what you have to spend, if you can get the output for cheaper by kind of relying on this kind of competitive marketplace of smaller models or something, not even smaller models, they might all just be big and kind of just scaffolded differently, offering a slightly different thing. Why wouldn’t you go for that, and why wouldn’t that exist in the first place? Unless the very first- 00:17:49,636 --> 00:17:52,216 [Andrey] Doesn’t exist yet, just to be clear. 00:17:52,216 --> 00:17:52,716 [Seb Krier] Um- 00:17:52,716 --> 00:17:58,416 [Seth] A, it doesn’t exist yet, and as Andrey proves, at least current models are bad at understanding their own capabilities. 00:17:58,416 --> 00:17:58,666 [Andrey] Oh, yeah. 00:17:58,666 --> 00:18:00,496 [Seth] Now maybe that’s going to be fixed. 00:18:00,496 --> 00:18:08,096 [Seb Krier] Yeah. Oh, no, I agree. I think that we’re not there yet, right? 
I think, again, and that goes back to the earlier AGI question, is there’s all sorts of, then again, what’s the right comparator? But, 00:18:09,476 --> 00:18:21,316 [Seb Krier] yeah, I don’t think we’re exactly there. Yeah, I think a lot of this will have to be built as well. The kind of an ability for a model to just better kind of operate in a more multi-agent environment, kind of have a better sense of 00:18:22,596 --> 00:18:32,556 [Seb Krier] delegation. I think the kind of, yeah, industrial intelligence or something seems to be maybe more neglected, as opposed to just single-agent intelligence or something, if that makes sense. 00:18:32,556 --> 00:18:34,776 [Seth] Do we need to bring the word cybernetics back? 00:18:34,776 --> 00:18:35,496 [Seb Krier] Yeah. 00:18:35,496 --> 00:18:36,116 [Andrey] [laughs] 00:18:36,116 --> 00:18:38,816 [Seb Krier] Somewhat. [laughs] 00:18:40,756 --> 00:18:51,256 [Andrey] All right. A little change in subject, but I know this has been in the discourse, the topic of recursive self-improvement, RSI. 00:18:51,256 --> 00:18:52,956 [Seth] Ooh, very scary. 00:18:52,956 --> 00:18:54,896 [Andrey] Jack Clark recently had an essay about it. 00:18:56,376 --> 00:18:58,876 [Andrey] Seb, what is your take? 00:18:58,876 --> 00:18:59,206 [Seb Krier] [chuckles] 00:19:00,316 --> 00:19:07,896 [Seb Krier] What is my take? I don’t know. I think it depends what exactly we mean by recursive self-improvement. 00:19:09,096 --> 00:19:50,336 [Seb Krier] I had a blog post not long ago, I guess, when trying to disentangle a little bit what I have in mind when I think about this. On the one hand, there’s the model getting recursively better through the usage of more AI and whatnot. And on the other hand, there’s the more kind of societal side of things, the transformation side, which I think very often, these two worlds are a little bit blurred in the discourse. It’s like, oh, you get RSI, and then X, Y, Z about the world or something. 
Things go really fast or they don’t go fast. And, I think these should be separated very neatly because on the model side, of course, I expect, already there’s a lot of AI being used everywhere to kind of create models. And I expect that to continue. 00:19:52,536 --> 00:19:55,976 [Seb Krier] But it’s not clear to me that this necessarily now leads to a dynamic by which 00:19:57,156 --> 00:20:16,596 [Seb Krier] the model now gets extremely or exponentially intelligent in a very short amount of time. It’s still kind of bottlenecked by all sorts of resources. And as I was saying earlier, I still see them as better at kind of paradigm exploitation than kind of exploration, which I think is the thing you might need to get to the next step. But, first of all, what do I know? But secondly, 00:20:17,616 --> 00:20:19,986 [Seb Krier] the other thing is, yeah, on the societal side of things, 00:20:20,996 --> 00:20:29,756 [Seb Krier] people sometimes talk about foom or hard takeoffs and whatnot, and these have very clear kind of real-life implications. It’s not just kind of a model getting better in a 00:20:31,216 --> 00:20:34,576 [Seb Krier] data center somewhere. And that side, I think, is where you have to think about 00:20:36,116 --> 00:21:27,056 [Seb Krier] [lip smack] all the kind of usual bottlenecks, adoption, deployment, diffusion, the kind of productive integration of all these systems at scale, both in terms of manufacturing and so on and so forth. And, I guess it’s not clear to me that the shift from GPT-2 to GPT-3 or coming up with kind of, were just very classic kind of software engineering, meat-and-potatoes type tasks that you can just easily automate away. It’s maybe one of these things that’s maybe easy to say ex post, but, I’m not sure. And certainly, my expectation is you’re going to get loads of gains in the coming years of kind of automating part of that pipeline. But that seems good. 
You just get better models, and that’s just overall helpful for all sorts of other things, even if you’re doing safety work and kind of governance work and whatnot, we benefit a lot from that cognitive resource, I guess. 00:21:27,056 --> 00:21:40,696 [Andrey] What would happen in the world for you to change your mind? Is there any, let’s say that recursive self-improvement is actually kind of this much more profound change than you’re painting. 00:21:41,816 --> 00:21:42,036 [Andrey] What 00:21:44,136 --> 00:21:45,696 [Andrey] signs would there be, I guess? Yeah. 00:21:45,696 --> 00:21:51,656 [Seb Krier] But to be clear, I’m not claiming it’s just business as usual, nothing to see here or whatever, right? I’m 00:21:52,796 --> 00:22:14,936 [Seb Krier] kind of just claiming that some of the stronger versions of the claim aren’t kind of self-evident. And so I see a lot of this happening in some sense. Certainly, in 10 years, I expect to have larger kind of more, again, acceleration of economic growth and whatnot and kind of faster diffusion across the board. I certainly don’t expect diffusion to take the same amount of time as, say, electricity or these other technologies. 00:22:16,576 --> 00:22:23,236 [Seb Krier] So it depends what exactly you mean, because what specifically am I looking to change my mind on? 00:22:23,296 --> 00:22:30,656 [Andrey] Well, let’s say the scenarios of AI 2027, right? Presumably, 00:22:31,996 --> 00:22:45,176 [Andrey] in 2027, you’ll see something that’s like, “Oh, wow, I was wrong. This is not going to be so gradual. This is going to be this sudden foom,” that you’re criticizing. Yeah. 00:22:45,176 --> 00:22:52,236 [Seb Krier] The original foom or hard takeoff definition literally talks about this change happening within hours or days. 00:22:52,236 --> 00:22:53,236 [Andrey] [chuckles] 00:22:53,236 --> 00:22:56,056 [Seb Krier] Which is not even, it’s not what the 2027 scenario, I think, predicts. 00:22:56,056 --> 00:22:56,296 [Andrey] Yes. 
00:22:57,556 --> 00:23:00,446 [Seb Krier] But the 2027 scenario, from what I remember, again, it’s been a bit of time now. 00:23:01,796 --> 00:23:08,816 [Seb Krier] One thing with the scenarios there is that there’s the kind of misalignment assumption, which I’m kind of uncertain about. 00:23:08,816 --> 00:23:09,255 [Andrey] Mm. 00:23:09,256 --> 00:23:17,296 [Seb Krier] And it also talks about a lot of progress in robotics, which I think is a bit further away. I think it’s close. We’re getting there, too. 00:23:19,116 --> 00:23:19,476 [Seb Krier] But 00:23:21,156 --> 00:23:25,916 [Seb Krier] I don’t know. Probably kind of AI, if in 2030, we start seeing AIs making all sorts of crazy 00:23:26,956 --> 00:24:06,196 [Seb Krier] inventions, innovations in fields other than just kind of perhaps math and coding across the board, and I’m like, okay, this is clearly-- And you get extremely fast adoption, too, right? You have entire businesses doing completely, it’s not business as usual, clearly, in the economy or something and wide adoption. But it’s hard to say because I expect all that to some degree, right? It’s not that I’m saying, “Oh, this is never going to happen.” I just think of it as a little bit more elongated and the implications of that being maybe not as like, we have Dyson spheres in five years or something like that, so. It’s more of a disagreement maybe on the extremes or the margins or something, but not so much at the core of the claim that yes, models are going to make models better and... 00:24:07,276 --> 00:24:27,536 [Seb Krier] But, again, even having-- In fact, actually, here would be a thing. If Anthropic or DeepMind or something in 2037 have fewer and fewer employees, fewer people kind of just doing AI research, engineers and so on, you’re clearly seeing kind of that profession. Because of course, I can imagine these jobs to change, right? Maybe you’re kind of managing more agents or something. 
That 00:24:28,616 --> 00:24:35,966 [Seb Krier] I expect. But the fact that you just need far fewer people to kind of do not only these large training runs, but the kind of 00:24:36,976 --> 00:24:43,476 [Seb Krier] large training runs that give you just much, much better systems, then I think I’d be like, okay, this is going a little bit faster than maybe expected or something. 00:24:44,656 --> 00:24:51,676 [Andrey] Okay. One thing you mentioned in that kind of hints at another hot take you have, which is about alignment. 00:24:51,676 --> 00:24:52,026 [Seb Krier] Uh-huh. 00:24:54,596 --> 00:24:55,926 [Andrey] What’s the deal with alignment? 00:24:57,196 --> 00:24:58,086 [Andrey] [laughs] 00:24:58,086 --> 00:24:58,136 [Seb Krier] [laughs] 00:24:58,136 --> 00:25:02,136 [Seth] Is it hard? Is it easy? Is it different than we would’ve expected going in? 00:25:02,136 --> 00:25:19,646 [Seb Krier] Yeah. It’s perhaps that. I think my take about alignment is something-- Well, first of all, I just don’t like the word. I think it’s a bit of an annoying word because it’s being used for all sorts of things. The AI says something that we just kind of don’t like, or you say, “Oh, it’s misaligned.” No one pre-registers what they expect the aligned behavior to be, and then just kind of tests. 00:25:19,646 --> 00:25:20,116 [Andrey] [laughs] 00:25:20,116 --> 00:25:35,626 [Seb Krier] But I think my general claim is maybe the fact that it’s been easier than we would’ve predicted a decade ago or so. Then when I first got into AI in 2017, that was partly as a result of reading things like “Superintelligence” by Bostrom. 00:25:35,626 --> 00:25:36,236 [Andrey] Mm-hmm. 00:25:36,236 --> 00:25:48,496 [Seb Krier] And you’d read these books, like Stuart Russell’s “Human Compatible” and others, that kind of had all these analogies like King Midas and you ask a system to optimize for goal X, and in pursuit of that goal, it does all sorts of other things that you don’t want it to do. 
00:25:48,496 --> 00:25:51,916 [Seth] Right. The paperclip maximizer, and we seem to not have those. 00:25:51,916 --> 00:25:57,476 [Seb Krier] Yeah. It’s like one version of it or one variant of it. And certainly at the time you didn’t really have language models. A lot of these intuitions were kind of based off 00:25:58,596 --> 00:26:48,236 [Seb Krier] reinforcement learning systems in very basic kind of game scenarios where they were actually given a single goal to optimize for. And this is not actually what we do, I think, with models. And you had these kind of examples, even the value loading problem was something discussed at the time where actually specifying these complicated nuanced human values in mathematical terms would be extremely hard. So even if you managed to tell a robot to clean the room, it would then just pick up a baby and put it in the trash or something. And I think it turns out a lot of this stuff is actually much easier. You have problems. You’ve got things like reward hacking. You’ve got AIs behaving in weird ways that we were not always kind of anticipating because of the ways they were post-trained. So my claim is not like, oh, again, it’s all fine, and safety is a scam or whatever. It’s more that it’s certainly much easier than, or at least we’re on a much better track than I would’ve at least guessed perhaps a decade ago. And secondly, I think it 00:26:49,916 --> 00:26:54,816 [Seb Krier] just seems tractable. There’s a lot of progress in terms of chain-of-thought monitoring and all these other things. And 00:26:56,696 --> 00:26:57,796 [Seb Krier] I also think that the 00:26:59,016 --> 00:27:05,825 [Seb Krier] hard part is maybe more the kind of normative question of whose values and when, and what and everything. That’s the kind of thing that we’re looking into more. But 00:27:07,096 --> 00:27:13,696 [Seb Krier] yeah, I prefer the word actually instruction following or intent following or something instead of alignment. 
And I think by and large, they’re actually pretty good at that. 00:27:14,796 --> 00:27:31,636 [Seb Krier] So again, that doesn’t mean you have to dismiss all sorts of theories and all the kind of power optimization stuff. But I guess my immediate outcome is this goes rather well. Or if I am more concerned by other things like misuse, if you’d like, than kind of the AI’s being innately, inherently kind of internally misaligned. 00:27:31,636 --> 00:28:03,676 [Seth] This really seems related to your take that intelligence is not at odds with being a tool, right? So a lot of people have this intuition where if you had a super-duper intelligent genie or oracle, it would develop even implicitly some sort of value or goal that orthogonality thesis might have nothing to do with what we want. But you’re more optimistic about the idea that the LLM doesn’t want anything. It’s incorrect to take the intentional stance towards an LLM. 00:28:03,676 --> 00:28:09,236 [Seb Krier] Not incorrect. It’s actually kind of descriptively useful, even functionally sometimes to use that language. 00:28:10,796 --> 00:28:18,836 [Seb Krier] But that’s the thing, right? I think we kind of lack the language to properly delineate and differentiate when it’s useful to use that or appropriately descriptive and when it’s not. 00:28:20,076 --> 00:28:41,496 [Seb Krier] And so I agree that, of course, I think the take I had on this was something like, and I can imagine a tool being an agent and an agent being a tool. Or in principle, I can imagine something being hyper-capable and still being broadly instruction following rather than at a certain level of capability, aha, that’s when the goals change and things get... And it kind of depends on the type of system as well. I imagine not all 00:28:42,656 --> 00:28:45,116 [Seb Krier] paths lead to the same kind of outcome. 
But, 00:28:46,256 --> 00:29:13,596 [Seb Krier] so again, I can see plausible versions of the world where Omohundro drives or something are a more salient feature of the way we kind of train models. Right now, it doesn’t seem to me very likely that this is a core feature that they have. But of course, it’s hard to kind of either prove or disprove, right? Because someone might just say, well, that’s because they’re very good at hiding this or something, or once they’re capable enough or whatever. So there’s always a bit of this kind of gotcha thing. It’s like deception. But 00:29:14,936 --> 00:29:39,896 [Seb Krier] yeah. So in principle, I guess I can totally conceive of at least a superintelligence that is controllable, that is benign, that is at least subservient to the goals of humanity or a user or principal or whatever. That could still be used to cause enormous harm, but it’s just I don’t necessarily think the analogies of, oh, I think Tegmark was thinking, look at the zoo where the monkey’s going. I think these are just not really 00:29:41,736 --> 00:29:43,136 [Seb Krier] helpful kind of analogies. 00:29:44,276 --> 00:30:02,396 [Seth] Monkey at the zoo, but you’ve also got the monkey’s paw, right? Maybe the reason some prefer alignment to instruction following is we all know the story of, be careful what you wish for. You wish for something, and it’s under-specified, and you get the bad version of it because the AI doesn’t understand the context. 00:30:02,396 --> 00:30:08,336 [Seb Krier] I think that’s why, yeah, I think maybe instruction following is maybe too... Intent following or something gets to it more. 00:30:09,936 --> 00:30:18,316 [Seb Krier] But of course, that problem doesn’t go, even if it follows intent or something, you could still have all the problems because your intent is nefarious or whatever. 
So 00:30:19,436 --> 00:30:19,816 [Seb Krier] I think the 00:30:21,356 --> 00:31:06,756 [Seb Krier] way you deal with that is all sorts of, I don’t know how to conceptualize it, but in fact scaffolds. It’s a bit more this outside of the model or something. I’m kind of almost indexing on a world that will indeed have agents that are trained to be bad or whatever, or someone going to be instructed to do bad things. But just like with humans, you come up with all sorts of kind of systems, rules, laws, norms, kind of protocols that either discourage the kind of bad behavior, or punishes it, or makes it just not worthwhile or something. But I’m not going to put all my bets on the, oh, it has to be pure-hearted, and that will be sufficient. And then you just scale it forever, and it’s going to be an amazing goal. I just think that the way of seeing or thinking about AI is that I just find kind of a bit 00:31:08,096 --> 00:31:12,656 [Seb Krier] too narrow, I guess. I think it’s important, it’s just insufficient, and it’s certainly not my main kind of a-- yeah. 00:31:14,946 --> 00:31:15,206 [Andrey] Okay. 00:31:16,666 --> 00:31:20,086 [Andrey] Our audience is very much composed of economists. 00:31:22,586 --> 00:31:30,506 [Andrey] If you’re an economist and you’re very interested in AI, what sort of work would you be trying to do? 00:31:30,506 --> 00:31:32,146 [Seth] Maybe to be useful to AI people- 00:31:32,146 --> 00:31:32,216 [Andrey] Yes 00:31:32,216 --> 00:31:37,466 [Seth] ... in particular. What would you want, what did the DeepMind team want to read from economists? 00:31:37,466 --> 00:32:20,766 [Seb Krier] I think kind of engaging with their assumptions or something, right? If you assume, let’s say, an AG-- and I think some do, to be fair. I actually think there’s a lot more, I think, discourse now going on between economists and AI people, whatever. 
But assuming that you do have AI systems that are interchangeable or almost quasi-fully substitutable with humans, that come up with good ideas, that are parallelizable and whatnot, what does that change to your kind of growth function and so on? So, maybe that’s useful. Right now, in the short term, at least, there’s all sorts of questions around labor, there’s questions around productivity or adoption. Clearly, there’s useful work to be done there. But I think in terms of AGI specifically, given that a lot of the field just thinks you’re going to get to AGI in the next five to 10 years, 00:32:22,746 --> 00:32:26,806 [Seb Krier] what are the implications for taxation? What are the implications for 00:32:28,626 --> 00:32:37,786 [Seb Krier] how that’ll affect different states across the world? I think I’m probably more worried about a call center in Hyderabad than I am about the white-collar worker in North America or something. So, 00:32:39,066 --> 00:32:57,306 [Seb Krier] yeah. I think all these kind of questions, but just indexing more and making fewer, I guess, assumptions around the limits of capabilities. Because sometimes you see them kind of being implicitly snuck in somewhere or something of like, well, because AIs can’t do XYZ, therefore... And yeah, fine, but maybe they will do XYZ. And then what? How does that change your thinking? Yeah. 00:32:57,306 --> 00:32:59,506 [Seth] Maybe more scenario planning than, 00:33:00,526 --> 00:33:04,746 [Seth] here’s my median projection, or here is one projection I think is plausible. 00:33:04,746 --> 00:33:22,846 [Seb Krier] Yeah. And embedding the kind of thoughtful models and thinking that economists have within these scenarios and making them more salient to the kind of computer scientists, right? Even when I brought up comparative advantage, people will be like, “Oh, but what if the AI is cheaper and better?” It’s like, well, that’s not the point.
The opportunity cost point of comparative advantage, there’s a difference. 00:33:22,846 --> 00:33:23,286 [Andrey] [laughs] 00:33:23,286 --> 00:33:31,786 [Seb Krier] And again, there are answers to that as well, but I think just kind of better translating, I think, some of these insights to the AI tribe, I think is useful. 00:33:32,846 --> 00:33:40,526 [Andrey] So that’s very naturally leading us to this question about yourself. And you do lots of different things. 00:33:41,946 --> 00:33:50,426 [Andrey] You’re prolific on Twitter, for sure. But also, you’re doing internal work for DeepMind. How do you allocate your time? 00:33:52,066 --> 00:33:52,166 [Seb Krier] I don’t know. 00:33:52,166 --> 00:33:53,266 [Seth] What percentage is Twitter? 00:33:53,266 --> 00:33:54,646 [Andrey] Yeah. [laughs] 00:33:54,646 --> 00:34:04,686 [Seb Krier] Twitter is actually not that much today. It must be an hour max or something, an hour and a half, two hours, maybe, something. But that is maybe much by others’ standards. But the- 00:34:04,686 --> 00:34:06,476 [Andrey] [laughs] What is the optimal amount of Twitter? [laughs] 00:34:06,476 --> 00:34:29,866 [Seb Krier] [laughs] Yeah. It’s the Pareto optimal. I guess, in my day-to-day work, it’s a mixture of proactive and reactive. Proactive in the sense that I think, oh, these questions of agents and cybersecurity and liability and whatnot, and biosecurity are kind of important things to look into, and therefore, there’s a lot of research that I do and colleagues do, and a lot of coordination across the org. 00:34:31,026 --> 00:34:39,486 [Seb Krier] But there’s also more reactive stuff because we’re a policy team, and so there’s things happening in the external world like SB 53, the preemption debates. 00:34:40,546 --> 00:34:48,386 [Seb Krier] So it’s a bit of a mix of that. And of course, all sorts of internal dynamics. But, yeah.
I guess I’m curious about all sorts of other things, and so when I do have time, and I’ve kind of 00:34:50,006 --> 00:34:58,106 [Seb Krier] completed the main quests, I try to keep some time for other stuff I’m interested in. I work with some research teams and kind of look into what they’re into. I’ll 00:34:59,266 --> 00:35:09,826 [Seb Krier] find topics or themes that I think are maybe kind of neglected or underrated or I just don’t see out there as much, and like, “Oh, cool. We’re going to try to find out about this more.” But I think it’s just very kind of curiosity driven, and the allocation of time is 00:35:11,566 --> 00:35:16,705 [Seb Krier] not super thought out. It’s more like, oh, I think these things are interesting, and I’m going to get into that for a bit. [laughs] 00:35:16,706 --> 00:35:22,306 [Andrey] So it wasn’t a deliberate strategy of getting Tyler’s attention and adoration. [laughs] 00:35:22,306 --> 00:35:25,126 [Seb Krier] No, not at all. Not at all. But I’m very- 00:35:25,126 --> 00:35:25,746 [Seth] The long play 00:35:25,746 --> 00:35:30,565 [Seb Krier] ... very grateful for his... [laughs] For the meme. But- 00:35:30,566 --> 00:35:41,766 [Seth] What kind of, but I know you can’t be specific, but for your sort of internal work, what does a work product look like? Are you participating in a meeting and giving hot takes? Are you writing internal memos? What is- 00:35:41,766 --> 00:35:42,026 [Seb Krier] Yeah 00:35:42,026 --> 00:35:42,276 [Seth] ... in- 00:35:42,276 --> 00:35:56,406 [Seb Krier] It’s a mixture. Obviously, meetings. Any large bureaucracy will have meetings. But I think a lot of analysis, memos to execs sometimes. Just research, managing researchers sometimes, depending on the project. 00:35:57,626 --> 00:36:04,106 [Seb Krier] We’ll have a lot of coordination. Actually, I’m realizing through a lot of these kind of meetings, a lot of it is just kind of coordination and information transfer, right? 
00:36:04,106 --> 00:36:04,146 [Andrey] [laughs] 00:36:04,146 --> 00:36:07,006 [Seb Krier] It’s maybe why I’m so obsessed with the Coasean bargaining thing. Just let- 00:36:07,006 --> 00:36:07,326 [Seth] Ah 00:36:07,326 --> 00:36:08,546 [Seb Krier] ... the agents do it. But, 00:36:09,806 --> 00:36:34,116 [Seb Krier] yeah. I think the day-to-day work is a lot of reading, a lot of meetings, a lot of writing, and distilling and translating information, I think, across different tribes also. So if I’m talking to legal people, like lawyers, about what’s going on in, say, the more technical side of the org, or if I’m speaking to the researchers about something that’s more... But yeah, there’s a lot of translating of concepts across different stakeholders, I guess. 00:36:34,116 --> 00:36:45,726 [Andrey] So how does that work in an org like Google? Because I think in a lot of orgs, they’re really obsessed with KPIs and output metrics. 00:36:45,726 --> 00:36:46,156 [Seb Krier] Mm-hmm. 00:36:46,156 --> 00:36:48,746 [Andrey] And what you’re describing sounds very- 00:36:48,746 --> 00:36:49,706 [Seth] Hot takes per meeting. [laughs] 00:36:49,706 --> 00:36:54,926 [Andrey] Yeah. Very much amorphous, very hard to measure. 00:36:56,066 --> 00:36:56,196 [Seb Krier] Yeah. 00:36:56,196 --> 00:37:00,606 [Andrey] Obviously, you have a lot of external visibility, but is that 00:37:02,786 --> 00:37:07,846 [Andrey] a problem? Or is that just it’s understood that that’s how this goes? Yeah. 00:37:07,846 --> 00:37:13,846 [Seb Krier] I think the external stuff is kind of almost just very separate from the kind of day-to-day work side of things. 00:37:14,986 --> 00:37:23,366 [Seb Krier] And yeah, internally, we do have KPIs or equivalents or whatever. I think they may be less numerical in nature. 
But you might still have some, develop a consistent position on 00:37:24,506 --> 00:37:30,819 [Seb Krier] X issue or something in the next two, three months. And that requires a lot of research work, coordinating. 00:37:30,819 --> 00:37:32,929 [Seth] Have 10 opinions. [laughs] 00:37:32,930 --> 00:37:38,100 [Seb Krier] No, ideally they just want one. I think 10 opinions, that’s the issue. There are a lot of opinions out there. You’ve got to find the good ones. 00:37:38,100 --> 00:37:39,530 [Seth] That’s the main problem with economists. 00:37:39,530 --> 00:37:42,350 [Seb Krier] But [laughs] yeah. Exactly. Who was that quote? 00:37:43,830 --> 00:37:44,290 [Seth] Truman. 00:37:44,290 --> 00:37:44,330 [Seb Krier] Yeah. 00:37:44,330 --> 00:37:46,210 [Seth] Truman begged for the one-handed economist. 00:37:46,270 --> 00:38:20,990 [Seb Krier] Yeah, exactly. But, so I think, yeah, I think internally it’s just a kind of analysis or something. Say you’re thinking about, oh, agents and legal liability. How do these things work? What does the existing legal environment say and prescribe? What happens if something goes wrong? What are relevant factors? There’s a lot of that kind of thing. And I guess particularly within the DeepMind side, because when we’re on the frontier side, we’re thinking about the next five years as opposed to what’s going on right now. But yeah, the other side stuff is really just kind of out of personal interest and just me writing stuff, and they seem fine with it so far. [chuckles] 00:38:20,990 --> 00:38:26,510 [Andrey] What about... So we’ll be at a conference together, the Post-AGI conference- 00:38:26,510 --> 00:38:26,830 [Seb Krier] Ooh 00:38:26,830 --> 00:38:28,370 [Andrey] ... at Lighthaven, Berkeley. 00:38:28,370 --> 00:38:30,110 [Seth] Ooh. Prestigious. 00:38:31,130 --> 00:38:32,990 [Andrey] I don’t know if it’s prestigious.
00:38:34,550 --> 00:38:34,629 [Seth] [laughs] 00:38:34,630 --> 00:38:45,730 [Andrey] But you’ve gone to a few of these conferences, like the Curve is another fairly well-known one. What’s your take on these? 00:38:45,730 --> 00:38:54,750 [Seb Krier] I think some are useful. The majority of conferences I go to, I don’t exactly find that life-transforming, I guess. 00:38:54,750 --> 00:38:57,610 [Andrey] [laughs] You’re going to the wrong conference. [laughs] 00:38:57,610 --> 00:39:09,290 [Seb Krier] I know. Can someone show me the... But I think, yeah, they obviously perform a social function to some degree, right? There’s a lot of meeting people, some networking or something, some kind of finding out new ideas. But 00:39:10,390 --> 00:39:20,310 [Seb Krier] my issue with conferences, very often they’re just very tame. They’re very risk-averse. They’re very the same ideas you’ve-- Already if you can read it online or something, it depends on the conference. But, 00:39:21,510 --> 00:39:24,190 [Seb Krier] although I have been to really good ones, too. There was this 00:39:25,570 --> 00:39:43,529 [Seb Krier] IMF conference with EconTAI, which I think Anton Korinek and others had organized. And that was great because that was a nice one where you had both the technologists and a lot of economists and loads of presentations, and you got to learn lots of new things. But, in general, I don’t see a huge... Beyond maybe showing, again, some hot takes here and there. 00:39:45,370 --> 00:39:49,990 [Seb Krier] Yeah, some I assume are good conferences. [chuckles] 00:39:49,990 --> 00:40:00,670 [Seth] I’m just the exception, but you had a great joke on your Twitter the other day about this, which is, Caveman panelist one, “Fire is bad.” Caveman panelist two, “Fire is good.” 00:40:00,670 --> 00:40:00,770 [Seb Krier] Yeah.
00:40:00,770 --> 00:40:02,100 [Seth] Caveman panelist three, 00:40:03,450 --> 00:40:07,120 [Seth] “We need to balance the upsides and downsides of fire and use it wisely.” 00:40:07,120 --> 00:40:07,320 [Seb Krier] Absolutely. 00:40:07,320 --> 00:40:09,620 [Seth] Wild applause. [laughs] 00:40:09,620 --> 00:40:09,650 [Andrey] [laughs] 00:40:09,650 --> 00:40:14,850 [Seb Krier] Exactly. There’s a lot of that. That’s the energy that I’m getting very tired of because it’s- 00:40:14,850 --> 00:40:15,050 [Seth] [laughs] 00:40:15,050 --> 00:40:21,700 [Seb Krier] And I like playing the role of the wise centrist opinion, whatever. But it does get very- 00:40:21,700 --> 00:40:23,150 [Seth] You do get wild applause. 00:40:23,150 --> 00:40:24,470 [Seb Krier] Yeah. All the time. [chuckles] 00:40:26,490 --> 00:40:29,770 [Seb Krier] But yeah, I think there’s a lot of that. I wish there were more 00:40:30,810 --> 00:40:35,090 [Seb Krier] almost private Chatham House-y conferences, where you had people who highly disagreed with each other- 00:40:35,090 --> 00:40:35,210 [Andrey] Mm 00:40:35,210 --> 00:40:36,770 [Seb Krier] ... but were polite and didn’t get at 00:40:37,950 --> 00:40:49,370 [Seb Krier] each other’s throats. And you had more setups that actually allowed ideas to clash a bit more, in a civilized way, of course. But that would be a bit hard, but also much more interesting, I think, than 00:40:51,490 --> 00:40:55,390 [Seb Krier] everyone broadly agreeing that it’s good to be good and it’s bad to be bad, and yeah. [chuckles] 00:40:55,390 --> 00:41:03,710 [Andrey] I do feel like the Lighthaven conferences are quite good for this, in that there’s an enormous amount of free time and- 00:41:03,710 --> 00:41:04,130 [Seb Krier] Mm-hmm 00:41:04,130 --> 00:41:07,770 [Andrey] ... free space that’s not where the talk is happening. 00:41:07,770 --> 00:41:07,940 [Seb Krier] Yeah. 00:41:07,940 --> 00:41:10,630 [Andrey] And so you do get a lot of this. 
00:41:10,630 --> 00:41:11,040 [Seb Krier] Well, yeah, I agree. 00:41:11,040 --> 00:41:21,090 [Andrey] But I agree that many conferences are not like that, where you’re just packed. You have a conference hall, and you don’t have anywhere else to go, and it’s packed with talks. Yeah. 00:41:21,090 --> 00:41:21,710 [Seb Krier] Yeah. No, totally. 00:41:21,710 --> 00:41:23,550 [Seth] NBER Summer Institute. [laughs] 00:41:24,750 --> 00:41:28,330 [Andrey] Seth, there is disagreement. Say what you will. At NBER- 00:41:28,330 --> 00:41:28,540 [Seth] There is fire 00:41:28,540 --> 00:41:29,430 [Andrey] ... people throw down. 00:41:30,450 --> 00:41:31,430 [Andrey] [laughs] 00:41:31,430 --> 00:41:37,720 [Seth] [laughs] I’ve never seen a meaner comment than I have seen from a discussant at NBER Summer Institute. [laughs] 00:41:37,720 --> 00:41:52,570 [Seb Krier] [laughs] The Progress Conference, for example, last year, was one that I thought was really good. That was at Lighthaven, in fact. I think the setup and the kind of people and the curation and so just made it something that I found quite engaging. [upbeat music] 00:41:52,570 --> 00:41:56,490 [Seth] So you brought up this idea, as we were talking, about you 00:41:58,330 --> 00:42:21,049 [Seth] think there are so many meetings in your organization because it’s so hard, yet so critical to transfer information. And there’s this Coasean idea that so much of why the economy works the way it does is just the idea of transaction costs, right? In addition to kind of this Hayekian idea of local information that’s hard to share. 00:42:21,050 --> 00:42:21,810 [Seb Krier] Mm-hmm. 00:42:21,810 --> 00:42:23,960 [Seth] You have a very influential essay 00:42:25,130 --> 00:42:30,230 [Seth] that kind of maybe stole some of Andrey’s thunder, but is still an excellent essay- 00:42:30,230 --> 00:42:31,040 [Seb Krier] [laughs] 00:42:31,040 --> 00:42:46,210 [Seth] ... 
about this idea of, well, what happens when AIs go out there and can micro-bargain costlessly with each other at high frequency over very, what might seem to us, small issues. 00:42:47,570 --> 00:42:57,440 [Seth] Tell us maybe in a few sentences, what’s that vision and what’s the positive vision for why that would be good for society, for us to have AI agents constantly bargaining for us over stuff? 00:42:59,130 --> 00:43:01,810 [Seb Krier] Yeah. I guess the idea is, as you mentioned, there’s all sorts of 00:43:03,990 --> 00:43:26,350 [Seb Krier] transaction costs that mean that we don’t get to bargain on things that we would otherwise bargain for. And instead, you get these blunt rules and these solutions that kind of work, but come with all sorts of externalities or aren’t super efficient. And so the idea is, if you can actually do this kind of negotiation at scale for very little, and that’s a big assumption. That’s not a given either, 00:43:27,850 --> 00:43:35,586 [Seb Krier] then you could solve all sorts of things that... And also just kind of problems that would otherwise not be even conceivable in the first place. 00:43:36,726 --> 00:43:41,186 [Seth] One example you give, just so we can be a little bit more specific, is noise standards, right? 00:43:41,186 --> 00:43:41,456 [Seb Krier] Right. 00:43:41,456 --> 00:43:57,226 [Seth] So you can’t throw a loud party after 10:00 PM in such and such a place. But you think that maybe AI agents could come to a less coarse rule that is, get us more to the grand coalition of allocative efficiency than a coarse rule like that. 00:43:57,226 --> 00:44:01,166 [Seb Krier] Yeah. To be fair, that’s probably a problem that no one really cares about except me because of like- [chuckles] 00:44:01,166 --> 00:44:02,086 [Seth] No. Dude. 00:44:02,086 --> 00:44:03,645 [Andrey] I care about it so much. 00:44:03,645 --> 00:44:04,626 [Seb Krier] Oh, really? Okay, cool. 00:44:04,626 --> 00:44:04,746 [Andrey] Yes.
00:44:04,746 --> 00:44:07,816 [Seb Krier] Maybe that’s a good example then. But yeah, the idea here is, 00:44:09,146 --> 00:44:17,006 [Seb Krier] my neighbor is throwing a party, and instead of there being some sort of rule that says you’re not allowed to throw parties after 11:00, he could maybe just compensate me for the noise or something. 00:44:18,326 --> 00:44:21,686 [Seb Krier] Or in fact, that’s one of the key crux of the whole Coasean thing is maybe 00:44:24,186 --> 00:44:36,085 [Seb Krier] I have to compensate him to stop his parties. And it kind of depends where the initial right is. But broadly, you could have these kind of, my whole neighborhood doesn’t want me to party, and they’re just giving me a small payment or the reverse, depending on where the initial allocation is. 00:44:37,226 --> 00:44:44,446 [Seb Krier] But I think you could have all sorts of micro ways in which these transaction costs at scale help you get much better beneficial outcomes. 00:44:45,486 --> 00:44:48,486 [Seb Krier] And so that would be the noise one would be like, okay. 00:44:50,406 --> 00:45:18,666 [Seb Krier] And it’ll probably just also let people kind of regroup into the party people just going into the neighborhood where that’s just generally more party tolerant or something, and the kind of peace and quiet preferring people just... Because I think one of the points with the piece was that AI also helps you coordinate better. You can use this stuff to find people who have the same interests and preferences as you or something, and just then bargain or negotiate or whatnot in that way as well. 00:45:20,626 --> 00:45:27,386 [Seth] So it’s not just bargaining over externalities that are negative, it’s maybe coordinating over positive externalities, right? 00:45:27,386 --> 00:45:27,526 [Seb Krier] Yeah. 00:45:28,766 --> 00:45:51,746 [Seth] What pieces do we need in the economy to make this a reality, and what time horizon are you thinking about? 
So obviously this is an idea that you could have a small version of, and then like the sci-fi, this is constantly, I’m allowed to speed in my car today because I really need to get to work because I’m late, and it’s bargaining with all the cars on the highway at ultra-high frequency. So what are the time horizons you have in mind, and what pieces do we need? 00:45:51,746 --> 00:46:21,786 [Seb Krier] Honestly, I haven’t even thought about the timelines really. [laughing] For me, this was mostly kind of an aspirational thing of like, well, it looks like we could unlock some cool things, and because there’s all these-- It’d be nice to have a positive vision of how things might pan out. It certainly doesn’t mean that everything has to be negotiated and bargained over. But I could see a large proportion of things, certainly in everyday life, like I could just tell my aunt, “You don’t have to worry about your parking issues anymore. It’s just sorted now,” whatever. The agents are taking care of that. And so it kind of depends on what scale you’re talking about. Certainly having democracy at scale and 00:46:23,626 --> 00:46:29,086 [Seb Krier] half automated and half made more efficient through these systems or something is something that I think is going to take a long time. 00:46:30,426 --> 00:46:47,986 [Seb

May 19, 2026 - 1 h 23 min

Avi Goldfarb on Prediction Machines, O-Ring Tasks, and How AI is Reshaping Economics

This week, we’re joined by Avi Goldfarb, one of the leading economists of artificial intelligence and co-author of Prediction Machines [https://www.google.com/search?sca_esv=bc87673d3ad1280f&rlz=1C1GCEA_enUS1209US1209&sxsrf=ANbL-n4AnrHPqrHiXM4Cb3oXCBXAennzbw:1777914708243&q=Prediction+Machines:+The+Simple+Economics+of+Artificial+Intelligence&stick=H4sIAAAAAAAAAONgFuLVT9c3NEwzqCw0q8wrU4Jw003S0pMLsnK1pLKTrfST8vOz9RNLSzLyi6xA7GKF_LycykWsLgFFqSmZySWZ-XkKvonJGZl5qcVWCiEZqQrBmbkFOakKrsn5efm5mclADWkKjkUlmWmZyZmJOQqeeSWpOTmZ6al5yakAebQ6E4MAAAA&sa=X&ved=2ahUKEwjFtIC1kKCUAxWiJkQIHRQiDEoQ9OUBegQIDRAD&biw=2183&bih=1080&dpr=1.75]. Avi has been thinking seriously about AI economics since long before the ChatGPT shock, so we asked him what he thinks the earlier framework got right, what it missed, and how economists should update their beliefs now. The conversation starts with Avi’s seminal book, Prediction Machines, and the idea that AI is best understood as a drop in the cost of prediction, which is a complement to judgment. We ask what that book got right and what it got wrong. From there, we interrogate Avi on the murky boundary between prediction and judgment. In our episode with Alex Imas [https://empiricrafting.substack.com/p/alex-imas-demand-collapse-bargaining], we had investigated the idea that judgment and prediction may not be as separable as economists like to believe. We also ask: if AI gets better at predicting human judgment, does judgment disappear, or do humans simply “move up the stack”? And what is taste exactly? Avi says that sometimes judgment becomes predictable, but humans still matter because goals, values, organizational politics, and “what matters” are often implicit, unstable, and hard to codify.
Avi shoots down Seth’s galaxy-brain suggestion that correct ontology choice — i.e., deciding what sort of natural kind [https://en.wikipedia.org/wiki/Natural_kind] a thing is, or understanding when a problem is out of context [https://theculture.fandom.com/wiki/Outside_Context_Problem] — is a uniquely separate skill (taste?), calling it just another prediction error. But he does concede that deciding how much to prepare for ‘Black Swan’ events may be an enduring role for judgment. We then revisit the O-ring theory of production and what it means for automation. We had covered Kremer’s article in a recent episode (see here [https://empiricrafting.substack.com/p/weak-links-strong-predictions-kremers]) and asked Avi about his new paper, riffing on the idea at the worker level [https://www.nber.org/papers/w34639]. Avi says that if tasks inside jobs are complements rather than substitutes, then automating one task may make the remaining human tasks more valuable, not less. Avi explains why workers may reallocate attention toward the tasks machines cannot yet perform (shooting down Seth’s suggestion that this is actually difficult in most jobs). The discussion also covers whether AI will augment or replace workers, whether governments should try to steer AI toward human-complementing technologies, and why that distinction may be much harder to define in practice than it sounds. Avi agrees with Andrey and Seth’s pushback on “augmentation good, automation bad” framings (e.g. friend of the show Erik Brynjolfsson’s “Turing Trap [https://digitaleconomy.stanford.edu/news/the-turing-trap-the-promise-peril-of-human-like-artificial-intelligence/]”). Then we get into forecasts: how fast AI capabilities might advance by 2030, what that means for GDP growth by 2050, whether GDP is still the right thing to forecast, and why even very powerful AI may run into bottlenecks in the real economy. 
We use the paper Forecasting the Economic Effects of AI [http://Forecasting the Economic Effects of AI] to ground the discussion. We close with lightning-round topics including AI’s impact on centralization, privacy/de-anonymization, peer review, and whether academic journals still serve the function they once did. Papers, books, and ideas mentioned * Avi Goldfarb’s seminal book with Ajay Agrawal and Joshua Gans — Prediction Machines [https://www.google.com/search?sca_esv=bc87673d3ad1280f&rlz=1C1GCEA_enUS1209US1209&sxsrf=ANbL-n4AnrHPqrHiXM4Cb3oXCBXAennzbw:1777914708243&q=Prediction+Machines:+The+Simple+Economics+of+Artificial+Intelligence&stick=H4sIAAAAAAAAAONgFuLVT9c3NEwzqCw0q8wrU4Jw003S0pMLsnK1pLKTrfST8vOz9RNLSzLyi6xA7GKF_LycykWsLgFFqSmZySWZ-XkKvonJGZl5qcVWCiEZqQrBmbkFOakKrsn5efm5mclADWkKjkUlmWmZyZmJOQqeeSWpOTmZ6al5yakAebQ6E4MAAAA&sa=X&ved=2ahUKEwjFtIC1kKCUAxWiJkQIHRQiDEoQ9OUBegQIDRAD&biw=2183&bih=1080&dpr=1.75#] * A black swan is the occurrence of a wildly unpredictable event, which Nassim Taleb argues, in his book by the same name [https://en.wikipedia.org/wiki/The_Black_Swan:_The_Impact_of_the_Highly_Improbable], is more common than we like to think * A New Riddle of Induction [https://en.wikipedia.org/wiki/New_riddle_of_induction] — by Nelson Goodman — is the source of Seth’s thought experiment about “grue” and “bleen”, colors that switch between green and blue in 2029 (grue is green until then and blue after; bleen is the reverse) * Michael Kremer — “The O-Ring Theory of Economic Development”, covered in this episode of the pod: * Daron Acemoglu and Pascual Restrepo’s task-based models of automation, especially “The Race Between Man and Machine [https://www.aeaweb.org/articles?id=10.1257/aer.20160696].” * Avi mentions David Autor and Neil Thompson on automation and skill scarcity, including their paper “Expertise [https://www.nber.org/papers/w33941]”, when Seth comments that you may not be able to reallocate effort between tasks as a worker * Erik Brynjolfsson in the “Turing Trap
[https://digitaleconomy.stanford.edu/news/the-turing-trap-the-promise-peril-of-human-like-artificial-intelligence/]” argues that automation technologies are less good than augmenting technology * Eric Topol’s book on AI in medicine — Deep Medicine [https://www.amazon.com/Deep-Medicine-Artificial-Intelligence-Healthcare/dp/1541644638] * John Markoff — Machines of Loving Grace [https://www.amazon.com/Machines-Loving-Grace-Common-Between/dp/0062266683] — The source of a title for an influential essay of the same name [https://www.darioamodei.com/essay/machines-of-loving-grace] by Dario of Anthropic. Both draw from an earlier poem about a Sci Fi utopia: https://allpoetry.com/All-Watched-Over-By-Machines-Of-Loving-Grace * Korinek and Stiglitz on AI, capital, and taxation; Lockwood and Korinek on optimal taxation and automation — We covered these topics at the end of our episode with Basil Halperin in the context of “Tax Policy at the End of History” around the 1:19:00 mark * We talk about de-anonymization, and Avi references this provocative paper [https://arxiv.org/abs/2409.15948] from Florian Ederer * Avi brings up Bob Gordon, and his argument, famously in the book The Rise and Fall of American Growth [https://www.amazon.com/Rise-Fall-American-Growth-Princeton/dp/0691147728], that the early 20th century was incredibly important for increases in US living standards, which digital technologies have not lived up to * Digital Hermits [https://www.nber.org/papers/w30920], by Jeanine Miklós-Thal, Avi Goldfarb, Avery M. Haviv & Catherine Tucker, is a paper by Avi thinking about how information spillovers, now from AI, drive some people to be more private than they would otherwise be. 
In our conversation, we speculate AI will make these hermits even more “hermetic” * We discuss this paper on new forecasts of AI and its impact on economic growth: Forecasting the Economic Effects of AI [http://Forecasting the Economic Effects of AI] * Refine and AI-assisted peer review are discussed in this pod. For more, see our episode with Ben Golub, founder of Refine [https://empiricrafting.substack.com/p/ben-golub-ai-referees-social-learning].

This episode is sponsored by Revelio Labs [https://www.reveliolabs.com/], a great source of labor economics data for academics and firms. Now available on WRDS.

Join our Discord community at this link: https://discord.gg/w3GSapx2d

Transcript

Introduction [00:00]

Seth: Welcome to the Justified Posteriors podcast, the podcast that updates beliefs about the economics of AI and technology. I’m Seth Benzell, your loyal non-fiction machine, coming to you from Chapman University in sunny Southern California.

Andrey: And I’m Andrey Fradkin, coming to you from San Francisco, California. And we are very happy that Justified Posteriors is sponsored by the fine folks at Revelio Labs. And we’re very delighted to have Avi Goldfarb, who is a leading thinker in the field of AI economics and has also been a personal mentor on the show. We’re very excited to hear his thoughts on a variety of topics. Welcome, Avi.

Avi: Thanks so much and thanks for having me on the show and looking forward to it.

Andrey: All right, let’s get started. I have in front of me this book that you might remember writing at some point.

Seth: Gaze into the soul of the man in the bookstore.

What Did Prediction Machines Get Wrong? [01:12]

Andrey: Now, I just think it’s a good cover. And I had to check: when was it released? It was released in 2018. And as I was skimming through it, you know, a lot of interesting points made there are still things that we’re talking about today, almost 10 years after it was released. So let me start off with the following question.
And then maybe we can work backwards more into the ideas in the book. But what do you think Prediction Machines got wrong?

Avi: I think prediction may... I’ll start with a hard question.

Seth: No softballs on Justified Posteriors.

Avi: So on the specifics of which industries and when, to the extent we tried, at least I did not anticipate how quickly language and coding would become prediction problems. And when we talk about disruption and industry disruption, a lot of the examples are things like driving, and we talk about radiology. And we still have plenty of radiologists around. Self-driving cars and trucks seem like they’re now imminent, but it certainly took a lot longer than we expected back in 2018.

Andrey: So is it a fair assessment to say that the large language models, even in 2018, weren’t on your radar? I guess they weren’t on many people’s radar.

The Three Ideas of Prediction Machines [02:45]

Avi: Not really. We have some discussion of machine translation. So that’s in there as a huge potential use case, but the arrival of ChatGPT and how it sort of changed how we interact with machines and how we think about AI was not really there. Another way to put it is Prediction Machines had three ideas. So idea number one is AI can be framed as a drop in the cost of prediction. So prediction, as in filling in missing information, statistical prediction, is getting better, faster, and cheaper. Idea number two is that when something gets cheap, you start using it for unanticipated uses. So when arithmetic got cheap, it wasn’t just that we used computers for accounting. We started to use computers for all sorts of things that we never used to think of as arithmetic problems, like imaging and mail and music. And then idea number three is what are the complements to machine prediction? And we talked about data and judgment.
The book, and certainly our attention to the book in the first three or four years after it was published, was on idea number one and idea number three. So identify prediction problems in your organization, and then think about what data you need to make those predictions better, and try to understand what matters to you in terms of judgment. And that second point kind of got lost. But in the last four years, what’s become clear to me is that that second point was maybe the biggest one, which is that this tool, which under the hood is still computational statistics, enables us to find all sorts of applications for computational stats that we didn’t really imagine before. Judgment and data are still gonna be useful, but that phase one, that step one, that first idea of identifying prediction problems, that’s not really how we think about using AI today. And in some sense, that was a missing emphasis throughout the book and throughout how we thought about that book, or at least how I thought about that book for the first few years.

Does Proprietary Data Still Matter? [04:59]

Andrey: Very interesting. You mentioned one kind of underlying idea there, which is that you should identify the data that’s going to make your predictions better. To what extent do you think that is still true, given that foundation models seemingly can be very smart without having any proprietary data?

Avi: Data is still central to the use of AI and the building of the models. In building a foundation model, at least in the pre-training stage, that data is essentially interchangeable. You just need more. It doesn’t really matter what, to build a structure of language, and then you can move from there. In later stages of using that model, at least the AI companies seem to think data is valuable to the model companies.
And then in terms of use cases within organizations, that’s more a matter of whether you want to delegate sort of the judgment of how to use the model and what the model should output to the vendor, or whether it’s something that you need to build in-house. And depending on the organization, some of them are very happy to delegate to the foundation model provider and some of them think they need to fine-tune in-house.

Andrey: Well, so there are kind of two little sub-ideas in there. One is you have a choice. You can fine-tune a worse model with your own data, and maybe that will outperform a frontier model. I think for many cases so far, that’s been a bad bet. But there’s a different idea here. Use whatever model you want, but you design the evaluation. And then you optimize via the prompting strategy or scaffolding towards that benchmark for your own use case. Is designing a benchmark proprietary? Should we think of that as proprietary data that an organization has?

Seth: Is that the judgment part in the judgment-prediction distinction?

Vendor Choice as Delegated Judgment [07:01]

Avi: Yeah, I think there’s a bunch of judgment. There’s judgment number one: which vendor do you use? Because you’re delegating a lot of values, as in, like, knowing what matters, to the maker of the model. And then there is judgment in how heavy-handed you want to be to make the outputs fit your needs. And then there’s judgment on, okay, you’ve decided to be heavy-handed. What exactly does that mean? Is it guardrails, or is it really making sure that the output from the prompts every time fits your organization’s values or what matters to you?

Andrey: Have you had an opportunity to kind of advise companies on this judgment decision? Like what has your experience been in these situations?

Avi: At a high level, yes.
I don’t want to exaggerate my experience, but the things I emphasize and the things that seem to resonate are, one, what I just said, which is recognizing that when you choose a vendor, you are delegating your understanding of what matters to that vendor. And then two, that means before you start thinking about choosing a vendor, you need to know what matters to you. So think through, you know, before you go talk to somebody, you should know what your KPIs are and what outcomes you want to see. Because otherwise, once you talk to them, they’ll convince you that their outcomes are the ones you want to see. I talked to someone who is running AI at a... let’s call it a big healthcare organization. And his job used to be, like five years ago, his job was building tools. He’s like, my job isn’t building tools anymore. There are all sorts of vendors building AI tools for healthcare. Okay. And what my job is now is every week, 20 or more people come in and say, I have a solution for you. And he chooses one or two of them.

Seth: Kind of seems like a good job for an AI.

Avi: Well, maybe, maybe not. I guess in theory that could happen. But he understands the individuals in his organization: what they’re willing to accept and what they’re not, which decisions they like to have control over, which ones they’re comfortable delegating. For the ones they like to have control over, he has a sense of what might be negotiable and what might not be. He knows where the power structures are and which changes might therefore face resistance from people who have the power to resist. He knows which things might not face resistance, because the people don’t have the power to resist, but they’re going to be really, really unhappy about it, and it’s going to be bad for the organization. And so there’s all these things that I guess in principle an AI could do, but we’re a long way away, I think, from that.

Can Prediction Eat Judgment? [10:16]

Seth: So let me just push down that line a little bit longer. Is the way to think about this sort of prediction and judgment distinction that, as the models get better, the prediction is eating more and more of the stack, right? You know, we give the information about our organizational structure to the AI, and then maybe it can make a couple more of these decisions for us. And you could either imagine that asymptoting to, you know, in 20 years, AI does everything, or you could imagine there are higher and higher levels of judgment that humans keep on getting promoted to. Is one of those two the way that you think about it?

Avi: Yes, Andrea Prat has a note in our first Economics of AI volume that covers that exact idea. I think actually it’s a comment on our paper, on the model behind the Prediction Machines book. It’s: well, in principle, with enough data, you can learn to predict judgment. And so you move up the stack. So absolutely. There are some limits to that. There are limits in that you may never get enough data on that kind of judgment. Judgment can change over time. To the extent that ultimately you’re trying to predict your tastes, then they can change over time. And there are some limits on causal inference and the impossibility of seeing the counterfactual, which creates a need for a model.

Andrey: But humans have that problem too.

Avi: Yeah, yeah, yeah, no, I agree. But on the need for a model, the question then is, well, how come LLMs and some of these models seem to be pretty good at doing that? And in the process of prediction, I suspect -- though I don’t know rigorous work on this, so I’m being cautious --

Seth: That’s what this podcast is for.

Avi: this is building some kind of model of the world that is embedded in the training data, like the language.

Taste, Values, and Human Wants [12:16]

Seth: So let’s go back to one of the examples you gave, which is this idea of taste, right?
Because I’ve had so many conversations with other economists about this idea that, well, taste will save us as scientists, right? Because the AI won’t have taste. I have some ideas about what taste might mean, but can you be a little bit more precise about what you think taste means and why it’s something worth saving?

Avi: So, okay, let’s operate under the assumption that whatever we want to call the machines, their goals are to help humans. Okay, not all humans. And we can debate about which humans, but like ultimately...

Seth: Well, the Anthropic Constitution says, you know, safety first, the idealized Anthropic researcher, then the guy that... then, like, virtue, and then, like, the customer, in some order like that.

Avi: All that matters for the point I’m about to make is that it’s not about the machine’s needs. So in that case, at the very limit, humans have wants and needs, and the machines need us, our judgment, to know what our wants and needs are.

Seth: So taste literally as in, this tastes good to me, I want more of this food.

Avi: That would be one specific example of it. Absolutely. Okay. Now, I think we’re a long way from that limit, but that’s what I would argue the limit is.

Seth: That’s the bailey, right? So now let’s go out to the motte.

Avi: So then it’s more like, okay, what matters to a set of humans, a group, an organization? What can we codify? If you can codify it and say, like, this is your goal, you’re not quite at that limit, but pretty close to it, then the machines can try to optimize on a goal. Goals have so much that is implicit. And so the machine would have to be able to infer the implicit part. Maybe it can, maybe it can’t, I don’t know. And then you can sort of ratchet back all the way to where we are now, which is you still need to tell your agent what you want. You still need to check on it every once in a while and guide it in the right direction. Prompting still has a role.
Ontology, Umbrellas, and Context Shifts [14:45]

Seth: Here’s another way of thinking about taste. And I’m curious whether you think this is in one of the categories you already listed, or a new idea, or you wouldn’t call this taste. It has to do with something like the idea of your ontology that is kind of built into the system, right? It’s your way of sort of dividing the world up into parts, and maybe a good tastemaker or a good judger might have a more refined or more adaptable ontology than the prediction machine. So I’ll give you an example of what I mean. I have a couple of examples in mind, but one example I have is, you know, historically in the data, it’s always been the case that if lots of people show up with umbrellas, it means that you can predict that it’s raining. But then we have these Hong Kong protests, and in the Hong Kong protests, they’re the umbrella protests, and people bring umbrellas to show that they’re protesting, right? And it seems like a human would do better at adapting to, like, the completely new context for why you would need umbrellas than, you know, a pre-trained system that was trained only on historical data. So you can say that that’s like a context switch problem. Is that one of your ideas of taste, or is that more of a judgment that’s not a taste?

Avi: Honestly, that seems like a prediction failure to me.

Seth: Right. That’s just we don’t have data on the context that we’ve moved to. The job is to understand when the context has changed, maybe.

Avi: The judgment, I would say, is like, what’s the consequential decision that’s going to be a function of: I look outside and I see a lot of people with umbrellas? Yeah. What am I going to do? And...

Seth: You know, I should water my plants. Should I water my plants?

Avi: No, I water my plants. Okay. So I look outside, a lot of people are carrying umbrellas and I think, no, I don’t need to water my plants. Okay. And then it turns out it’s a protest.
It’s a little bit of a weird context, but going with your example.

Seth: It’s gotta be a weird context. That’s the reason that the AI is going to make the wrong decision, because it’s out of context.

Avi: The automated sprinkler doesn’t go on and my plants die. Right. Okay. So the judgment is: is it then worth it for me to invest more, either in my prediction technology or in actually going outside to look and see if there’s rain, to overcome that downside? So what you described as an error in prediction, there are ways to reduce that error in prediction. The judgment is whether it’s worth the bother to reduce that error in prediction, or to create some kind of insurance system where you would say, you know what, I’m gonna water the sprinklers. I’m just gonna run the sprinklers anyway. That’s how I think about judgment. It’s sort of what goes wrong when your prediction fails, or it’s one important aspect of judgment.

Seth: Sorry, can I give you an even more abstract...

Andrey: Wait, wait, wait. No. I actually disagree with the premise of the example in many ways. I think a reasoning model would be able to handle the situation, especially with internet access, substantially better than many humans already, because you can call an API to get the weather forecast if you’re unsure. You can read the news. You can use reasoning traces. There’s this kind of implicit assumption in your question that, like, we’re just using a raw pre-trained model and asking it, like, if you had a gun to your head, what would you do? You know, and not using any reasoning.

Seth: Okay, but I can tell you a story, right? The weather API was always reliable in the data, but now there’s been a government takeover and I don’t trust the new government and you shouldn’t trust the API weather data anymore, right?
Avi: So Andrey, I actually agree that that seems unrealistic, but I think what you’re describing is how many resources you wanna put toward making it right, and I would view that as judgment.

Andrey: But I guess the model has that judgment, maybe, already. Yeah, that kind of goes up the stack of when judgment problems become prediction problems, I guess.

Avi: But then there’s going to be... well, there’s going to be some places where the model is imperfect. Okay. Yes. Still a prediction tool. It might be better than human. Actually, it doesn’t matter if it’s better than human. But to the extent the model is imperfect, how do you want to behave? Like, let’s say the model is right 99.99% of the time. Does your behavior change at that versus 99.9999% of the time, even if the human benchmark is 50? And that ultimately is going to be essential to judgment. We do this with self-driving cars. The models aren’t perfect, but they’re better than human. And yet, I still drove to work today, partly because that’s the law in Canada.

Andrey: Do you think there’s hope? I mean, maybe this is kind of too much in the weeds versus the abstract idea, but sometimes people implicitly assume that they’re anchoring on the current technology, where there’s an instance of an LLM that does something. But we might be able to design systems of LLMs that are interacting with each other to cover some of these shortcomings that we can think of. I mean, at a conceptual level, maybe it’s the same thing anyway...

Avi: So maybe another way to think through these trade-offs is to talk about whose judgment, okay? Seth’s example, or my example, was about my judgment, you know, the individual’s judgment and whether they should listen or not. Andrey, I think what you’re describing is the model builder’s judgment on which things are worth investing in to make the model better and when it’s okay not to.
Like they have choices on sort of rate and direction. And those require some understanding of what they think is going to matter in terms of the use cases of the model. And on that, yes, there is a limit where a small number of players have extraordinary power because AI scales their judgment, because they embed it into the models. But I do think there is then still a human or set of humans responsible. It’s not like, the AI did it. It’s humans making those kinds of decisions. And I understand, like, at the limit, that actually gets quite nuanced, especially once we have models with continuous learning. But that’s how I think about that problem.

Grue, Bleen, and Black Swans [21:41]

Seth: All right Andrey, can I ask my riddle of induction question?

Andrey: Do you need me to induce it?

Seth: You already know where I’m going with this. I’m curious if Avi knows where I’m going with this, but this goes back to the question of maybe where taste comes in is having a better or a more human ontology than the machine. All right. Have you ever heard of grue and bleen, Avi? These are colors that are different than blue and green. No? Okay, awesome. So briefly, we have this conceptual category, which is a thing that’s green. And a thing that’s green, we think that if you don’t do anything to it, it should be green indefinitely, right?

Avi: Okay, yeah.

Seth: All right. There’s this other thing that’s called bleen, and things that are bleen are green until the year 2029. And after 2029, they turn blue. Right. Here’s the issue: bleen and green things are observationally identical until 2029. Right. Yeah. So an inhuman ontology of an AI, one that is bad at forming natural kinds, might decide that something is bleen instead of thinking it’s green. Right? And a human’s role might be to say, no, that’s a bad definition of a natural kind. That’s a bad ontology. And that would be a role of either taste or judgment. Do you buy that? Is this way too abstract?
Avi: I think what you’re describing is a failure of prediction. I don’t think that’s taste or judgment. The taste or judgment is: if you or a machine aren’t sure if something is bleen or green, do you care?

Seth: Okay. Well here’s the thing, you didn’t even have the concept of bleen until I told you about bleen, right?

Avi: So this is just the difference, I think, between known unknowns and unknown unknowns. So in Prediction Machines, we have a whole chapter framed on Rumsfeld and his discussion of known unknowns and unknown unknowns. Look, sometimes you don’t have a prior on it, and it’s an unknown unknown. That doesn’t mean that it’s not a prediction failure. It was just off the support of your data, and you didn’t know what to do about it. And I think that happens all the time.

Seth: Sometimes you find a black swan.

Avi: Yes, exactly. And so like, there might be places where humans are better at that kind of prediction than machines. There might be places where both humans and machines are really awful at that kind of prediction. And if that’s the case, then you want to have robust systems to anticipate those kinds of things. And that’s where judgment comes in. Like, if you’re wrong about the existence of a black swan, you know, does that change anybody’s behavior? I think the answer is no, because black swans and white swans aren’t actually that different from each other. But if there were other examples, like financial crises, where he uses the metaphor of the black swan, then absolutely there are meaningful differences. And you should...

Andrey: Financial crises.

Seth: All right, so you’re saying that jobs that will survive TAI, number 7, should be black swan anticipator.

Andrey: Not an anticipator. Actually Seth, this is actually kind of the key point. The point is, anticipator of whether a black swan affects your utility enough that you should plan for it.
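
A toy illustration of the judgment discussed in the last two sections, namely whether an imperfect prediction is worth insuring against: the sketch below compares the expected loss from a wrong prediction with the cost of a safeguard. The function name, the loss of 1,000,000, and the safeguard cost of 10 are hypothetical values chosen for illustration and do not come from the conversation; only the two accuracy levels (99.99% and 99.9999%) are taken from Avi’s example.

```python
def worth_insuring(accuracy: float, loss_if_wrong: float, safeguard_cost: float) -> bool:
    """Return True when the expected loss from a wrong prediction exceeds the safeguard cost."""
    error_rate = 1.0 - accuracy
    expected_loss = error_rate * loss_if_wrong
    return expected_loss > safeguard_cost


# Hypothetical stakes, assumed for illustration only.
LOSS_IF_WRONG = 1_000_000.0   # cost of acting on a wrong prediction
SAFEGUARD_COST = 10.0         # cost of a backup such as running the sprinklers anyway

for accuracy in (0.9999, 0.999999):
    decision = worth_insuring(accuracy, LOSS_IF_WRONG, SAFEGUARD_COST)
    print(f"accuracy={accuracy}: insure={decision}")
```

With these made-up stakes the safeguard is worth buying at 99.99% accuracy but not at 99.9999%. The human benchmark never enters the rule, which matches the point that what matters is how costly the remaining errors are rather than whether the model beats a human.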
O-Ring Complementarities and Automation [25:22]

Andrey: I think next it will be awesome to talk about automation and some O-rings. Actually, the previous episode we did, we reread Michael Kremer’s classic O-ring paper because it’s been so inspirational for so many. It’s a great paper. They don’t write them like this anymore.

Seth: It’s so fun to read. They don’t like to do macro like that anymore, unfortunately.

Andrey: So we were wondering, so you have your own spin on the O-Ring paper. Maybe you can tell us a little bit about that.

Avi: The paper makes a pretty simple point. There may be two simple points. The first one is that when you think about tasks within a job, they’re not interchangeable and substitutable. So it’s not just like, okay, a machine comes in and takes tasks. Sometimes tasks are complements. Now that isn’t... I’m gonna be a little cautious. We talk about that in our O-Ring automation paper. It’s not necessarily a new idea. It’s implicit in the constant elasticity models. You can have a Leontief production function.

Seth: We’re talking about the Daron-style task-based models. But if you actually read the papers, everything immediately goes Cobb-Douglas. It’s always immediately weird. All the tasks are substitutes and then Cobb-Douglas over all the tasks.

Avi: Yes, but it’s possible, within the canonical model, to have that. So our point number one is tasks can be complements. And I just wanted to be cautious because I don’t want to claim that that’s necessarily our idea. But it’s an emphasis maybe that the existing literature hasn’t had. And then the second is, well, once you have tasks that are complements, if a machine starts doing some of those tasks, the human can move their attention to the other tasks that are not yet automated.
And when that happens, the human gets better at those tasks, which then makes automation of those remaining tasks even harder, because the machine has to be better than the human who is now spending all of their time focused on the remaining few tasks.

Skills Versus Tasks [27:40]

Seth: So let’s pause right there, because I have a couple of questions right there immediately. So one way to think about automating part of your job is you’ve automated part of your job and now I can reallocate to the stuff that’s not automated. Also, another way to think about tasks within a job that are complementary is to think about them as sort of like innate skills or abilities. So think about the job of being a basketball player. The job of being a basketball player involves being tall and being agile. If you somehow automated being tall, I can’t reallocate my skill points into being agile, right? If we think about my performance more as a combination of my skills, then automating part of it or taking part of it away, it’s not necessarily obvious to me that I can get better at the thing that’s not automated.

Avi: The way we, okay, so first, the way the literature usually thinks about jobs is generally at the task level, not the skill level. Okay. So a worker does a bunch of tasks. Okay. Those tasks require skills, but the worker does a bunch of tasks, and the machine comes along and can do the task and not the skill. So I’m not sure what it means for a machine to be tall. What it means for a machine to slam dunk.

Seth: Well, let’s think about being a doctor. You might imagine being a doctor involves bedside manner and judgment about diagnosis, right? It’s not clear to me that if you automate my diagnosis I can reallocate more effort into bedside manner. Some people are just level five at that and some people are level one at that.

AI Doctors and the Future of Medical Work [29:25]

Avi: It is obvious to me that there’s a bunch of tasks in a doctor’s workflow.
Some of them involve diagnosis. Some of them involve talking to patients and making the patients feel better. And within those, there are skills in being good at filling in the missing information of what’s wrong with the patient and skills of making the patient feel comfortable. And actually, for some of those tasks, you might even need both. A machine comes along and automates the diagnosis skills. Okay. That means medical professionals are going to be spending more time on the other skills. This is actually in Eric Topol’s Deep Medicine book. I’m not sure if you’ve read it. It’s pre-ChatGPT, but it’s about how AI might transform medicine. And that is his core thesis. The idea is that AI is going to make healthcare human again, because doctors are going to spend less time looking at screens and focused on diagnosis and more time interacting with patients and making patients feel better. So in that sense, we get the automation of the diagnosis task and some of the computer tasks, and that should lead exactly to reallocation toward the human part. But then you brought up something else, which is, do our current doctors, if they spend that much more time interacting with patients, are they the right people for this job? Or alternatively, could we have a different set of medical professionals who we could train, because now the machine can do some of those tasks, who would be way better than our current doctors at the remaining tasks? I suspect if the machines get good enough at diagnosis and identifying appropriate treatments, there is an enormous opportunity for a new kind of medical professional who is focused on essentially interacting with patients.

Seth: Yeah, so you’re making the occupational reorganization point, and that’s obviously essential, and we’re going to come back to that in a second. Yeah, I’m just pointing out that maybe my example of basketball wasn’t so good. Maybe my medical example wasn’t so good.
But I bet you I could pick out some domains where task output is very inelastic to effort.

Avi: Okay, trying to think. You’ve switched from skills to task and that makes me much, much happier.

Seth: Well, I mean, you would only need to worry about skills if you were inelastic to effort, right? Then it’s just the skill.

Rare Skills, Common Skills, and Wages [32:04]

Avi: So there’s the new Autor and Thompson paper on automation, which I think gets at some of the things you’re talking about, which is if the things the machine does are tasks that involve relatively rare skills, to be precise, then what happens is we get entry into that profession. More people can do it and very likely wages go down. And if the things that the machine does are things that many people can do, things that require less specialized skill, then there’ll be fewer remaining humans in that job and they’ll likely be higher paid.

Seth: Right, I think that’s right, but I think maybe a missing component here is, within the job already, what is the correlation in abilities between people who are good at the automatable and non-automatable parts of the task, right?

Avi: Yeah, but I think that’s the statement: in the short run, we’ll get the Autor and Thompson results, and in the long run, we’ll get a reallocation of jobs, right? There’s a system of professions and the system of professions will change.

Are Tasks More Complementary Than Cobb-Douglas? [33:23]

Seth: In the long run, you get the reorganization of jobs. Maybe one other thing I want to talk about before we get into reorganization of jobs is just this question: are tasks more complementary or less complementary than Cobb-Douglas? Do you have a sense of that with tasks within a job? I mean, it seems like it would vary a lot from occupation to occupation. I think we all have this intuition that they should have some kind of complementarity.
That’s why they’re a job in the first place. That’s why they’re bundled. But you might bundle them and they still might just be, you know, gross substitutes that have a little bit of complementarity.

Avi: I suspect there’s a lot of heterogeneity across jobs, and I don’t think we have good data on that yet, because we sometimes haven’t been looking, because our model is a substitutes model and so our papers are fundamentally focused on substitution.

Seth: And I think this is an example of how the theory is sometimes a little bit downstream of the data, right? We just have so little data on people reallocating effort across tasks within a job that of course it makes sense to aggregate up, to just add up all of the tasks done by all of the workers. That’s my guess of why Acemoglu gets there.

Avi: So of the task papers, the Eloundou et al. paper, Dan Rock’s paper, is incredibly careful on every page.

Seth: This is not an automation measure. Do not use this to measure automation.

Avi: This could be a complement, it could be a substitute. These are just jobs that change. So like kudos to them, the four of them, for being super, super careful. Nevertheless, when that paper is cited both in the academic literature and in the press, that idea seems to get lost. I’m not exactly sure why, maybe that’s because of the model.

Seth: It’s the question people want to answer, right? People don’t want to know what job’s going to change. People want to know what job should I get, right? And so...

Avi: Well, okay, but if it’s a question people want to answer, then the complements matter just as much as the substitutes. I wonder if it’s the answer that people want, and then they just...

Andrey: I actually think, and I think my take has always been, that most people are pretty... they’re very sophisticated users of this data, but a lot of people don’t have a sophisticated economics model.
And therefore to them, it’s just obvious that what’s going to happen is the machines are going to take our jobs. They don’t have a more nuanced model of economic activity, and therefore that’s how they interpret it. Now there are more sophisticated readers, I think, and we know some of them, who really just think that AI is going to be able to do everything in a very short period of time, and then it all kind of becomes moot, you know, if you think that every single task can be done by an AI.

Why the Impact of AI Was Ambiguous in Earlier Work [36:15]

Seth: Yeah. Well, I guess this kind of brings us to your 2019 Journal of Economic Perspectives paper, which is where you guys kind of throw your hands up -- that’s not, that’s a positive part -- and say there’s an ambiguous impact. So I guess what I want to push you on there is: is the impact ambiguous because we just don’t know all of the relevant elasticities, right? We need to know the elasticity within tasks within a job. We need to know the elasticity across jobs within an organization, the elasticity of demand across sectors. And if we could put all of those together, we would be able to answer the question. Or is it more ambiguous than even that?

Avi: No, I think you need to understand when that paper was written in order to understand the paper, which is in 2019 or late 2018 when we were writing it, we had no concept of anything but a task-based model with substitutes. Okay, maybe that was on us. We should have. But Acemoglu and Autor and Restrepo were the dominant... paradigm... working in the literature, especially Acemoglu.

Seth: Are you saying our ontology was limited?

Avi: I’m not exactly sure what you mean by that, but...

Andrey: You forgot about the O-ring, which was the black swan of papers.

Avi: Yeah, yeah. So like, we did.

Seth: I mean in Kremer, I mean, presumably you looked at Kremer again before writing your paper. You can almost see he’s almost there.
He’s almost at, and this is within workers too. He doesn’t exactly say it.

Avi: Exactly. So when we wrote that paper, we were thinking task-based substitution. That was the model that we had. And actually, in the process of writing that paper, in some sense, we learned what was wrong with that model and ended up with, we just don’t know. And part of that is, we wrote it in 2018, 2019. We were looking for new tasks from AI. So this is before ChatGPT, like four years before ChatGPT. So new tasks hadn’t really come up yet. All we had was identifying space junk and treatment for complex disease, which actually wasn’t our idea. It was Tim Taylor’s idea, our editor.

Andrey: Well, you already had AlphaFold, right?

Avi: Yeah, but it’s not clear what the new task is because of AlphaFold. Yeah, fair enough. In terms of... So, and actually that paper in some sense directly led to our work on system change and GPTs, because Tim Bresnahan pulled me aside that summer at the Summer Institute and told me he hated our GPT paper. I’ve told you guys this before. Because it was a task-based model and that’s not how meaningful change happens. That then led to all this work on trying to understand, well, if it’s not a task-based model, how does the system change?

Andrey: Okay. And we’ve covered that Bresnahan paper on this podcast.

Reorganizing Jobs Around AI [39:22]

Seth: I guess let’s talk about reorganization of tasks. Obviously that seems to be the best case answer. The best case answer is you split off the... I guess from the perspective of a firm trying to boost productivity, maybe not necessarily from a worker’s perspective. From the firm’s perspective, you want to slice off the automatable thing, let that rip, and then figure out what you have to leave behind for humans. Is there any good research about how you do that? What industries are better at that than others? Like, what’s the next research frontier on that question?

Avi: I think you just defined it.
There are two. One is, like, within the firm, how do we think about where the complements are and what’s left for humans, and how does that vary across organizations? The second part, and Alex Imas has highlighted this recently, is it also depends on the elasticity of demand for the... Seth: Products. Avi: Like, you know, even if within an organization workers reallocate and they become hard to automate because they’re more productive, but then the organization is producing more, well, someone has to want that more, or else, you know, at least that organization or its competitors are going to go out of business. Seth: Well, its factor... well, its price will come down. You know, there’s kind of a nebulous connection between price and profitability. Avi: Right. Price goes down. It’s got to go down like, well, quantity has to go up enough that we still need the workers. Andrey: There might be a paradox in there that’s not really a paradox. The misnamed Jevons paradox. Avi: Maybe. Should We Want Less Automation? [41:05] Andrey: Following up on this idea, I think several prominent economists have called for a government push or ideological push to make AI that complements humans rather than substitutes for humans. Seth: Friend of the show Erik Brynjolfsson has written about the Turing Trap. Is the Turing Trap misnamed? Is it not a trap? Should we embrace the Turing? Avi: Okay, so this is our Science paper. Seth: Let’s get the hot takes. This is where we brought you on. Avi: Do we want more automation? Yeah, so Erik has said it. Daron has said it. There’s lots of policy. We should complement humans, not replace them. And John Markoff is a journalist. He has this book called Machines of Loving Grace, same title as Amodei’s essay, but an older book. It is about the history of computing. Seth: When you’re a tech billionaire, you’re allowed to use cool phrases uncited. I’ve noted this. Augmenters, Automaters, and Inequality [42:10] Avi: Well, they’re both referencing a poem. 
And in Markoff’s book, there’s these two streams of computer science. There’s the, I forget exactly how he labels them, but essentially there’s the augmenters and the automaters. And at least from my perspective, the augmenters seem like the heroes of his story. And the automaters, who start to become prominent as this book is getting written around 2014-2015... Seth: They’re trying to trap us. They’re trapping us. Avi: But we also know that the rise of computing and the internet massively increased inequality. They generated enormous wealth, but they massively increased inequality. And I hypothesize that the reason for that is, yes, they were augmenting what humans do, but they weren’t augmenting what all humans do. They were augmenting what a set of humans who are good at abstract thinking do. And those people were already doing pretty well. And so in the process of augmenting humans, right, because no human can do what the internet does or what a computer can do, they augmented folks at the top and left others with relatively stagnant incomes. Seth: Is this story there really at the task level? The way I think about that inequality story is that it’s kind of at the firm level, right? It’s that we’ve now put the corner store into competition with Amazon, and so Amazon wins and whatever Amazon takes as input wins. Avi: There’s a bunch of different pieces. The one I’m emphasizing is like the Autor, Katz, and Kearney framework, which is about skills. Andrey: I mean, it has to be both, right? There’s a set, right? Like, the humans who are now able to market their unique skills match with the firms that are larger, but you kind of need both to create the inequality, or some of the humans become superstars without, like, needing the firm in the first place, right? Avi: I think in principle you could get within-firm inequality without getting across-firm inequality. We ended up getting both. Seth: Yeah, both. Both happened. Andrey: Fair enough. 
Avi: But I’m thinking like Autor, Katz, and Kearney with computing, and then Shane Greenstein, Chris Forman and I have some work on sort of internet inequality, same kind of idea. So on the other hand, automation technology, if it’s automating things that folks at the top do, could superpower everybody else. Okay. And this is a could, because it hasn’t really happened. So what we hypothesize, so the question, the paper is called, Do We Want Less Automation? And our answer isn’t no. Our answer is, here are reasons why it’s not obvious. Okay? It’s very economist-like. And the essence of it is, we were just talking about this medical example. Well, if what doctors are paid for is 10 years of post-secondary schooling that essentially is about prediction, diagnosis and treatment, then someone potentially with two to four years of post-secondary schooling who was much better at managing patient stress and all these other things, trained like a social worker, combined with a diagnosis machine, could be superpowered. And so their productivity goes up. And there’s a bunch of industries where what people at the top do seems a lot like filling in missing information. Are Intellectuals Giving Biased Advice About AI? [45:58] Seth: One might even cynically say that these thought leaders who have been so augmented by the internet are maybe not giving the populace the best advice. Avi: Maybe. So I had an undergrad RA write an essay for me. She’s a philosophy major. You know, a couple summers ago. It’s Amelia Agarwal. I feel like I should call her out. Seth: Love undergraduate research on the pod. Avi: Yeah, the opening of her essay was, part of her assignment was to read and hear about all these people who said AI is going to automate work. And so I’m going to have to have leisure, like, essentially. And she’s like, that doesn’t strike me as bad. 
And then she dug into it, and her framing was essentially that the people whose identity was driven by their, you know, intellectual abilities, public intellectuals, are exactly the people most threatened by AI. And so anyway. Andrey: You know, it’s very interesting. I actually disagree. Yeah, I think lots of intellectuals are threatened by AI, but not public intellectuals, and that’s because humans are going to want other humans to communicate to them in many ways. So the role of the public intellectual is not going to go away. The role of maybe the scientist toiling away on their research, that is in my opinion much more under threat. If you’re... One might even deduce that Seth and I have started this podcast as a hedge for that world. Seth: Well, what I say is, as the price of writing papers goes down, the return to reading papers goes up. But maybe this goes back to the taste idea, right? Which is, one way you might think of taste is, let’s be cynical for a minute: the public intellectual, the public art critic, doesn’t actually know art better than anybody else, but they serve a role as a coordination mechanism. Right. Everybody trusts Andrey. So when Andrey points at the thing and says it’s good, everybody converges to that. And then maybe that’s one notion of taste that will be preserved. Avi: Yes, and so you started in science and moved to art. There’s probably differences between them, but in the sciences, there’s a question for scholars: what’s our goal? What are we trying to accomplish? And I think different disciplines have different goals. And depending on the goal, the role of the human curator changes. If the goal is so that humans understand the world and have sort of a consistent model, then there’s a real role for a curator. If the goal is to build a better spaceship, then maybe there’s not such a role for a curator. 
And so I haven’t been following that literature, so I don’t really know what the formal academic take on what I just described is. Can Policy Steer AI Toward Augmentation? [49:27] Andrey: Yeah, I agree. I haven’t seen much formalization. So listeners, if you know of any, send it along. Yeah, I mean, sorry, I just want to make a final point, which is that I like your criticism of this augmentation idea. But to me, there’s a much deeper criticism, which is that there’s just kind of a whiff of central planning involved in it. Like, how do you know what technologies are going to automate versus augment? This is very hard to predict, in my mind. And to think that the government is going to somehow implement a system of taxes on technologies that are augmentation versus substitution, it’s ridiculous in my opinion. Avi: So I was taking as given that you can understand what is automation and what’s augmentation. I agree it’s a very hard challenge. There, I think the narrative, I’m gonna be careful. I think the argument is that even without choosing winners, we might be able to tax capital relative to labor or something like that in order to push things in a particular direction. I think that’s it. Andrey: Yeah, that’s the most plausible. Seth: That’s pretty plausible, but when you actually hear versions of the Turing Trap articulated, it’s really like, go and burn down the houses of the people who want to automate you. Avi: Okay. So Korinek and Stiglitz have a chapter that’s really about taxing capital that’s in our economics of AI book. And I think, like, the Acemoglu-Johnson argument is really about taxing capital. I’m not enough of a macroeconomist to have a strong opinion one way or the other, but that, I agree, seems more... Seth: Right, and then there’s a deeper, deeper argument there about whether or not you want to tax capital, right? 
There’s the old Chamley-Judd result about, well, you know, labor is inelastic and capital is elastic, so really you don’t want to tax it. There’s obviously international considerations: if you have a fully automated technology, isn’t that just going to locate itself in the lowest-tax jurisdiction? And so it might be very hard to tax capital. And then of course there’s the Iván Werning follow-up research kind of complicating the original Chamley-Judd results. So this gets in the weeds really fast. Andrey: And it’s also very blunt in many ways, right? A lot of capital is not about automation. It’s a... I don’t know. Avi: Yeah, and there’s all sorts of questions in public finance about how that all plays out. There’s, under the names Trammell and Korinek... I think it’s Trammell. No, it’s not. Andrey: That’s Lockwood. Avi: Lockwood and Korinek, thank you, have a relevant paper there. AI Growth Scenarios Through 2030 [52:36] Andrey: Next topic. Yeah. So there was a very well-circulated survey of economists about their expectations of economic growth in different AI scenarios. Seth: Now Avi, I understand you have intentionally not read this so as to have an unbiased take, so you will not be contaminated by the opinions of everyone else. Is that right? Avi: That is absolutely right. Andrey: Excellent. You’re definitely not in the same university as many of the authors. Avi: I probably will, but we’ll see. Andrey: All right. So the first conceit is that there are three scenarios for AI progress that they want us to consider. The first one is slow progress, where by the end of 2030, the AI can do PhD-student-level assistance, half of eight-hour-long coding tasks, passable stories and songs. Robotics navigate homes with some help. So that’s kind of the slow. Moderate is you have semi-autonomous labs, five-day coding tasks, high-quality novels and hit songs. Robotics can perform basic tasks. 
And then rapid progress outperforms top humans in research, coding and leadership, with award-winning creative works and nearly all physical tasks. So those are the three scenarios by 2030. So the first question is, how do you allocate the probabilities between slow, moderate, and rapid by 2030? Avi: So, okay, so with the exception of the statement about hit songs and award-winning, those are all about the models and not about the outcomes. So I’m going to ignore the hit song and award-winning part because I think that’s... Andrey: It’s of the quality that could win it. Avi: Okay, because at a high level, what I think is the technology is going to accelerate rapidly, but there are all sorts of meaningful barriers to widespread diffusion and having an impact on the economy. And sometimes I think we’re already in the slow, and for aspects of the medium versus the fast, I feel like I should call it 50-50, because I’m skeptical of the robotics stuff, but the five-day coding task seems very likely. And so just... Andrey: Yeah, there’s some other things. CEO-level agency, you know, is one of the criteria. Seth: I don’t know whether or not they can run a vending machine. Avi: But I don’t... like, part of it, so much of what a CEO does is charisma and creating followers, right? And I’m not sure that’s a mission. Seth: Is charisma a judgment task? Is charisma judgment? Avi: It’s a skill. I’m not sure it’s a prediction or judgment. It’s more like an action. Andrey: Yeah. But okay, fair enough. Just to give you a sense of where economists came in, and they took this in the fall: 39% that we’re still in slow by 2030, 47% that we’re in moderate, and 14% that we’re in rapid. So you are more bullish than a typical economist. Avi: I’m more bullish. I probably shouldn’t have said zero for slow. In retrospect, it would be something like five to 10 or something like that. GDP Growth by 2050 [56:22] Andrey: Okay, great. 
Now, and I think this is the question that there was really a lot of controversy about. So, the question was, by 2050, what is the annual change in GDP on average? Avi: GDP or GDP per capita? Andrey: This is GDP. Avi: I feel like I have to make a population assumption. Somewhere between 2 and 3%. Andrey: All right. You are well within the economists’ answer here: 2.5%. Avi: duplicate. And so we’ll be a little above that. Andrey: So 0.5%, that’s all we get? Okay. Extra from AI, over and above. Avi: Well, no, I don’t think you want to say that, because the reason we have 2% is because of innovation in the past. Andrey: Okay, so fair. I agree, I completely agree with you. Avi: Like, it’s possible, especially with, you know, it’s possible we would have gotten zero. Seth: 5% better than the historical rate of technological growth. Avi: Yes, something like that. Andrey: Now, what if you knew for sure we were in the fast scenario by 2030? How would that change your predictions? Seth: It’s hard to get to above three. Avi: Like, yeah, I just think there’s a lot of bottlenecks in the economy. I think that, and we’re going to figure out what they are. Seth: We’re gonna find out fast, and that guy is gonna be rich. Avi: Yes. Andrey: So you’re once again, like, very much the median economist. Avi: On growth. Yeah, okay. Seth: Can I ask you, you think that’s mostly about bottlenecks? You don’t think that’s mostly about people taking leisure? Avi: I think it’s mostly about bottlenecks. What Are the Bottlenecks? [58:36] Seth: So gun to your head, what’s the biggest bottleneck in that high-growth, robots-are-awesome scenario? Avi: I feel like my best answer is we’ll find out. Andrey: Okay. I guess the pushback that folks gave is this is a scenario where by 2030 robots can do nearly all home and industrial tasks, and faster than humans, right? 
So you might say, well, manufacturing and physical tasks are a tiny, not tiny, but they’re not that big of a portion of GDP already. Maybe... Avi: It’ll be essentially zero, is the point. If they’re that efficient and that cheap, then they won’t... I mean, like, I guess it depends on how we calculate the deflator. Agriculture is way more productive. GDP hasn’t grown by that much. Andrey: But what if we have, you know, robot doctors that can do, you know, like... Avi: Great, then medicine will be cheap. It’ll be less of GDP. Andrey: I guess, all right, so here’s a hypothetical. Let’s say we had a cure for cancer as a result of this, which is very plausible in the rapid scenario, and that we also, at least in principle, have the technologies to administer it through robots very efficiently, because we are in a world of just true abundance. My sense is that people would value that medical care extremely highly. And if one were to properly deflate the existing cost of cancer treatment, wouldn’t that imply a very large GDP effect? Now you can say maybe we’re not going to calculate that correctly. GDP, Consumer Surplus, and Health Breakthroughs [1:00:25] Avi: Now I feel like I’m going to, you know, it’s sort of the Bob Gordon sense. I don’t think we deflated antibiotics properly. I don’t think we deflated flush toilets properly. So if you’re talking about consumer surplus, then maybe consumer surplus will be found, especially, you know, to the extent that it’s health outcomes, then a huge increase in consumer surplus, much more than the argument that we’ve had for digital. Because that debate on whether digital really made us better off compared to what was happening in the 20th century, I think reasonable people can be on both sides of that debate. What you’re describing is cancer cured, people living wonderfully and healthy to 100. There might be some limits to how long, but that would be wonderful and great for consumer surplus. 
But if that happens, and I guess it might, and it’s that easy, it might become so cheap that it’s like agriculture. Because food is pretty essential too. And food is so cheap that we don’t worry about it so much anymore. Seth: Inelastically demanded. I think people will elastically demand years of life in a way that they won’t elastically demand calories, right? Avi: Potentially. Seth: You think people will get sick of it. I thought you were going to go to... maybe you’ll recall, in Daron’s The Simple Macroeconomics of AI, a favorite paper of this podcast, he actually predicts that consumer surplus might rise by less than is implied by the GDP growth rate, because we’ll invent evil jobs like social media manipulator. Are you still convinced that consumer surplus growth will be faster than GDP growth? Or are you open to this idea of the invention of evil tasks? Avi: I feel like we are not in my area of expertise. Seth: Turn it up. Andrey: Seth is really trying to get the hot takes. Avi: I don’t like to judge what particular products, a particular... Seth: Well, you can’t judge, you can’t predict. Avi: Yeah, you know, what am I in a... Andrey: Then you become an economist. Avi: Actually, let me give... So I think it’s reasonable for people to say some roles, some jobs, some products are better than others. I don’t think that has a meaningful role in GDP calculation. And I also worry if, in our consumer surplus calculations, we economists say some things are better and some things are worse, because then... So much of it is just obviously to the taste of the... Seth: It’s such a normative can of worms, right? GDP we can measure; consumer surplus... I mean, we do things at the Stanford Digital Economy Lab around trying to do willingness-to-accept experiments, but obviously those are highly limited too. Avi: So consumer surplus as in figuring out the area under the demand curve, that’s the kind of task I think we’re good at. It’s within our domain. 
Whether the demand curve is morally right or wrong, that’s not something I’m going to be finding out this day. Andrey: I wanted to just close off that loop a little bit by saying that you just gave me an answer that said that, for our evaluation of how good of a world we’re gonna get in 2050, GDP is no longer the correct sufficient statistic, which obviously makes me question, like, why is this such a bench... Why are people so interested in forecasting GDP in 2050 if we think it’s going to get pretty uncoupled from consumer surplus in these scenarios? Avi: Well, I’m not sure it’s more or less uncoupled than it has been in the past. I think reasonable people can disagree on that. I think the debate between Bob Gordon and Erik Brynjolfsson, or Bob Gordon and others over the years, is really informative about how hard it is to say, you know, what’s better today versus the past. What happened in the early 20th century is pretty amazing. Okay, that’s point one. Point two is it’s not obvious to me that GDP... like, GDP tells you your national capacity. That’s what it tells you. Seth: That’s useful for things like wars and public finance. Avi: If I remember my first-year econ, and I haven’t taught first-year econ for a long time, that was the idea. What’s the industrial capacity of the country? Or what’s the economic capacity of the country? It turns out it’s highly correlated, as I understand it, with lots of welfare measures. You guys know this. And so we use it for that. Once you start deviating, then that’s fine, but you’re now embedding a whole other set of values. At least with GDP, we know what the values are. It’s not that it’s not value-laden, but we at least know what the values are that we’re embedding in that measure. Andrey: But I guess I’m not sure we know. Just in many conversations with economists, this question of deflators has come up, and most of us haven’t spent much time thinking about what actually goes into that and how well that’s don

May 4, 2026 - 1 h 20 min