Cheaper Tokens, Bigger Bills

8 min · Yesterday

Description

The strategic AI question is no longer "which model do we use." It's "where does the model run, and who pays for the tokens." This week the AI inference startup Baseten raised roughly $1.5 billion at up to a $13 billion valuation for the unglamorous business of running other companies' models. Meanwhile token prices are collapsing about 10x a year, and enterprise AI bills are going up anyway. In this episode, Stephen Forte unpacks the inference economy and what it means for your business:
* The inference gold rush: why investors value the company that runs models more than many that build them, and why inference is 80-90% of a model's lifetime cost.
* The land grab: Amazon selling its Trainium chips to challenge Nvidia, and Alphabet's $84.75 billion raise to fund AI capex.
* The pricing paradox: "LLMflation" makes tokens ~10x cheaper a year, yet the Jevons paradox and the new "thinking tax" of reasoning models send total bills higher.
* The counter-move: open-weight models running locally on your own hardware, Apple's new "zero token cost" Core AI, and how to think about cloud vs. local as a cost-structure decision.
* Two concrete moves for the quarter: build multi-model routing, and budget for usage growth, not the falling unit price.
Sources:
* Baseten ~$1.5B round at up to $13B (TechCrunch) [https://techcrunch.com/2026/06/18/ai-inference-startup-baseten-reportedly-raising-1-5b-months-after-its-last-mega-round/]
* Amazon to sell Trainium chips externally (TechCrunch) [https://techcrunch.com/2026/06/18/amazon-hopes-to-challenge-nvidia-more-directly-by-selling-its-ai-chips/]
* Alphabet $84.75B equity offering for AI (Intellectia) [https://intellectia.ai/news/stock/alphabet-announces-8475-billion-equity-offering-to-fund-ai-investments]
* LLMflation and falling inference costs (a16z) [https://a16z.com/llmflation-llm-inference-cost/]
* Jevons paradox and rising enterprise AI spend (GUUTs / FinOps) [https://guutit.com/the-ai-inference-cost-crisis-part-5-the-jevons-paradox-why-cheaper-inference-wont-save-you/finops/]
* Apple Core AI at WWDC26 (Let's Data Science) [https://letsdatascience.com/news]

The AI Brief from the YPO Technology Network is a daily executive briefing on the AI developments that matter to business leaders. Hosted by Stephen Forte.
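The pricing paradox in the description comes down to simple arithmetic: a bill is unit price times tokens consumed, so it rises whenever token volume grows faster than the price falls. The sketch below is one rough way to check that for a budget. The function name, the starting bill, and both growth factors are hypothetical illustrations, not figures from the episode; only the ~10x yearly price decline comes from the description.

```python
def next_year_bill(bill_now, price_decline, request_growth, tokens_per_request_growth):
    """Project next year's inference bill from this year's.

    bill_now: this year's spend in dollars.
    price_decline: factor by which the per-token price falls (episode cites ~10).
    request_growth: HYPOTHETICAL factor by which request volume grows.
    tokens_per_request_growth: HYPOTHETICAL factor for extra tokens per request,
        e.g. the "thinking tax" of reasoning models.
    """
    # Total tokens grow by the product of more requests and more tokens each.
    token_growth = request_growth * tokens_per_request_growth
    # The bill scales up with token growth and down with the price decline.
    return bill_now * token_growth / price_decline


# Assumed inputs: 5x more requests, each using 4x more tokens, prices down 10x.
print(next_year_bill(100_000, 10, 5, 4))  # 200000.0
```

With these assumed inputs, a 10x price cut still doubles the bill, because token volume grew 20x. If usage only doubles and tokens per request stay flat, the same function gives 20,000, which is why the budget should be driven by the usage forecast rather than the unit price.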

All episodes

92 episodes

The AI Jobs Story Just Flipped

Three second-order effects of the AI buildout are landing on business leaders at the same time: on your people, on what gets built next, and on who's allowed to use any of it. In this episode, Stephen Forte covers:
* The AI-jobs story flips: Gallup finds tech workers who rarely use AI are about 3x more likely to be laid off (~18% vs 6%), while Forrester says 55% of companies that restructured around AI now regret it and Gartner expects half of AI-driven cutters to rehire by 2027. Plus Stephen's own playbook: why one-on-one, workflow-specific training beats lunch-and-learns every time.
* Capital rotates to world models: General Intuition (~$300M at ~$2B, having turned down a ~$500M OpenAI offer) and Odyssey ($310M at $1.45B, optimizing for Amazon's Trainium chips) both raise nine-figure rounds days apart, betting on AI that understands the physical world.
* The rules harden: JPMorgan and Goldman restrict Claude for overseas staff while a bipartisan bill moves to mandate government vetting of frontier models. The era of self-policing AI safety is ending.

Sources:
* AI fluency vs. layoff risk (Gallup) [https://www.gallup.com/workplace/711287/workers-continue-report-downsizing.aspx]
* General Intuition ~$300M at ~$2B (TechCrunch) [https://techcrunch.com/2026/06/18/general-intuition-in-talks-to-raise-300m-at-around-2b-valuation/]
* Odyssey $310M at $1.45B, Trainium-optimized (TechCrunch) [https://techcrunch.com/2026/06/17/world-model-maker-odyssey-nabs-1-45b-valuation-backed-by-amazon-and-other-big-names/]
* JPMorgan/Goldman restrict Claude overseas (US News) [https://money.usnews.com/investing/news/articles/2026-06-18/jpmorgan-chase-cuts-off-anthropic-access-for-its-hong-kong-staff-ft-reports]
* Gottheimer frontier-model vetting bill (Politico) [https://www.politico.com/live-updates/2026/06/17/congress/gottheimer-readies-ai-bill-to-vet-powerful-ai-models-for-risk-00966586]

The AI Brief from the YPO Technology Network is a daily executive briefing on the AI developments that matter to business leaders. Hosted by Stephen Forte.

Jun 23, 2026 · 9 min

Cheaper Tokens, Bigger Bills

Yesterday · 8 min

The Model Is Not the Moat

A weekend deep dive away from the news cycle. The question underneath this week's "who controls AI" headlines isn't the supplier's question — it's yours: if every company on earth can buy the exact same foundation model you can, where does durable advantage actually come from? Efficiency from "using AI" is real but not durable, because everyone gets it. This episode braids three expert frameworks into one CEO thesis — the model is the commodity; the moat is everything you build around it. Benedict Evans (independent tech analyst, on Lenny Rachitsky's newsletter): we're in the "1997 phase" of AI — "as big a deal as the internet or mobile, and only as big." When software gets trivially easy to build, distribution becomes the moat, and the right workforce question is "task or job?" not "what percent can AI do?" Dr. Wael Salloum (MIT Technology Review): advantage isn't model access — it's owning the operating layer, capturing every expert correction into compounding, proprietary judgment your competitors can't buy. Ethan Mollick (Wharton): efficiency creates no lasting edge; durable advantage needs a "crowd and lab" — empower employees to experiment, and a small team to scale what works. Culture is the bottleneck, and the CEO sets it. The synthesis: distribution, a compounding feedback loop, and an experimentation culture are three walls of the same fortress — and the model is just the standard brick everyone buys from the same yard. Three things to do Monday: map where you own vs. rent distribution; instrument one decision loop to capture expert corrections; and stand up a crowd-and-lab rhythm that rewards the reinventors. 
Sources:
* Lenny's Newsletter: Benedict Evans on where AI is actually going [https://www.lennysnewsletter.com/p/a-rational-conversation-on-where]
* MIT Technology Review: Treating enterprise AI as an operating layer [https://www.technologyreview.com/2026/04/16/1135554/treating-enterprise-ai-as-an-operating-layer/]
* Ethan Mollick: The frontiers of corporate innovation [https://neuron.expert/news/the-frontiers-of-corporate-innovation/16204/en/]

Hosted by Stephen Forte. The YPO Technology Network AI Brief: daily AI news for CEOs and senior business leaders.

Jun 20, 2026 · 13 min

The Transformer's Author Just Defected

The week that asked who controls AI ends by zooming all the way in — to the individual. On June 17, Noam Shazeer, co-author of the 2017 paper that introduced the transformer (the architecture under ChatGPT, Gemini, and Claude) and co-lead of Google's Gemini, announced he is leaving Google for OpenAI — less than two years after Google paid a reported $2.7 billion to bring him back from Character.AI. Peers call it the most significant AI talent move of the year, and the lesson for leaders is sharp: in a field where the scarcest input is talent, retention of your two or three irreplaceable people is a board-level risk, not an HR matter. The AI-jobs story also flipped twice. New Gallup research finds US tech workers who use AI less than monthly are about three times more likely to have been laid off than at-least-monthly users (~18% vs 6%), even though only ~1% of laid-off workers name AI as the reason — AI fluency has quietly become baseline job security. At the same time, Forrester found 55% of companies that restructured around AI now regret it, and Gartner projects half of AI-driven job cutters will rehire by 2027. Fund the upskilling before the restructuring, and be skeptical of any AI case whose entire ROI is a headcount line. And the money rotated toward AI that understands the physical world: world-model startup Odyssey raised $310M at a $1.45B valuation (Amazon, AMD, GV), optimizing for Amazon's Trainium chips rather than Nvidia — a quiet crack in the Nvidia-only era. We close with Ben Thompson's Stratechery argument that the AI labs' safety posture is also their commercial moat: the controls justified by safety conveniently gather your data, keep the lab in your workflow, and slow rivals. Control is the product — so evaluate frontier labs as partners who are also potential competitors. 
Sources:
* Bloomberg: Star Google researcher jumps to OpenAI [https://www.bloomberg.com/news/articles/2026-06-18/star-google-researcher-jumps-to-openai-in-coup-for-startup]
* Gallup: U.S. Workers Continue to Report Downsizing [https://www.gallup.com/workplace/711287/workers-continue-report-downsizing.aspx]
* TechCrunch: Odyssey nabs $1.45B valuation for world models [https://techcrunch.com/2026/06/17/world-model-maker-odyssey-nabs-1-45b-valuation-backed-by-amazon-and-other-big-names/]
* Stratechery: Anthropic's Safety Superpower [https://stratechery.com/2026/anthropics-safety-superpower/]

Hosted by Stephen Forte. The YPO Technology Network AI Brief: daily AI news for CEOs and senior business leaders.

Jun 19, 2026 · 9 min

Who Controls AI Just Got Three Answers

All week the question was who controls AI. Today it got three answers, and none of them is "the market." First: four days after the largest IPO in history, SpaceX agreed to acquire Anysphere — maker of the AI coding tool Cursor — for $60 billion in stock, folding a leading agentic coding product (reportedly ~$2B in annual recurring revenue) into the same house as xAI's Grok models and Colossus supercomputer. If your engineers live in Cursor, your core development tool now sits inside SpaceX and xAI — a vendor-concentration question worth asking out loud. Second: Chinese lab DeepSeek closed its first external round, more than $7.4 billion at a valuation north of $50 billion, on mostly domestic capital and structured so founder Liang Wenfeng keeps full control. While Washington restricts who may use US models and Paris rips out US software, Beijing is funding a fully independent frontier champion — sovereignty expressed as a cap table. Third: the US Department of Justice intervened in a Clean Air Act suit to argue that xAI should keep running the 57-plus unpermitted gas turbines powering its Memphis-area data center, because Grok supports Department of War operations — classifying one company's compute as critical national infrastructure worth overriding pollution law to protect. The durable lesson: power, not chips, is now the gating constraint on AI scale, and the politics of who gets to build and energize data centers is turning combative. 
Sources:
* NYT: SpaceX to buy Cursor maker Anysphere for $60B [https://www.nytimes.com/2026/06/16/business/spacex-cursor-aquisition-ipo.html]
* WSJ: DeepSeek becomes China's most valuable AI startup [https://www.wsj.com/tech/ai/deepseek-becomes-chinas-most-valuable-ai-startup-after-over-7-4-billion-fundraise-78ef64c0]
* The Verge: DOJ: xAI's gas-powered data center is necessary for national security [https://www.theverge.com/policy/950722/xais-gas-powered-data-center-is-necessary-for-national-security-doj-argues]

Hosted by Stephen Forte. The YPO Technology Network AI Brief: daily AI news for CEOs and senior business leaders.

Jun 18, 2026 · 9 min
