YPO Technology Network AI Brief
The strategic AI question is no longer "which model do we use." It's "where does the model run, and who pays for the tokens." This week the AI inference startup Baseten is reportedly raising roughly $1.5 billion at up to a $13 billion valuation for the unglamorous business of running other companies' models. Meanwhile, token prices are collapsing about 10x a year, and enterprise AI bills are going up anyway.

In this episode, Stephen Forte unpacks the inference economy and what it means for your business:

* The inference gold rush: why investors value the company that runs models more than many that build them, and why inference is 80-90% of a model's lifetime cost.
* The land grab: Amazon selling its Trainium chips to challenge Nvidia, and Alphabet's $84.75 billion raise to fund AI capex.
* The pricing paradox: "LLMflation" makes tokens ~10x cheaper a year, yet the Jevons paradox and the new "thinking tax" of reasoning models send total bills higher.
* The counter-move: open-weight models running locally on your own hardware, Apple's new "zero token cost" Core AI, and how to think about cloud vs. local as a cost-structure decision.
* Two concrete moves for the quarter: build multi-model routing, and budget for usage growth, not the falling unit price.
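
The pricing paradox described above comes down to simple arithmetic: total spend is tokens consumed times price per token, so a falling unit price only lowers the bill if usage grows more slowly than the price falls. The Python sketch below is illustrative only. The ~10x yearly price drop is the figure quoted in the episode; the starting volume, the starting price, the 20x usage growth, and the 3x "thinking tax" multiplier are hypothetical numbers chosen for the example, not figures from the episode or its sources.

```python
def projected_spend(
    base_tokens: float,
    base_price_per_million: float,
    years: int,
    price_drop_per_year: float = 10.0,    # ~10x cheaper per year, as quoted in the episode
    usage_growth_per_year: float = 20.0,  # hypothetical: usage grows 20x per year
    reasoning_multiplier: float = 1.0,    # hypothetical "thinking tax": extra tokens per request
) -> float:
    """Return projected annual spend in dollars after `years` years."""
    # Tokens consumed grow with adoption and with longer reasoning traces.
    tokens = base_tokens * usage_growth_per_year ** years * reasoning_multiplier
    # Unit price falls by a constant factor each year.
    price_per_million = base_price_per_million / price_drop_per_year ** years
    return tokens / 1_000_000 * price_per_million


# Hypothetical starting point: 1 billion tokens a year at $10 per million tokens.
today = projected_spend(1e9, 10.0, years=0)
next_year = projected_spend(1e9, 10.0, years=1)
next_year_reasoning = projected_spend(1e9, 10.0, years=1, reasoning_multiplier=3.0)

print(f"Today:                      ${today:,.0f}")
print(f"Next year:                  ${next_year:,.0f}")
print(f"Next year, reasoning model: ${next_year_reasoning:,.0f}")
```

Under these assumed inputs, the bill doubles even though each token costs a tenth as much, and triples again once reasoning overhead is included. This is the case for budgeting against usage growth rather than the unit price.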
Sources:

* Baseten ~$1.5B round at up to $13B (TechCrunch) [https://techcrunch.com/2026/06/18/ai-inference-startup-baseten-reportedly-raising-1-5b-months-after-its-last-mega-round/]
* Amazon to sell Trainium chips externally (TechCrunch) [https://techcrunch.com/2026/06/18/amazon-hopes-to-challenge-nvidia-more-directly-by-selling-its-ai-chips/]
* Alphabet $84.75B equity offering for AI (Intellectia) [https://intellectia.ai/news/stock/alphabet-announces-8475-billion-equity-offering-to-fund-ai-investments]
* LLMflation and falling inference costs (a16z) [https://a16z.com/llmflation-llm-inference-cost/]
* Jevons paradox and rising enterprise AI spend (GUUTs / FinOps) [https://guutit.com/the-ai-inference-cost-crisis-part-5-the-jevons-paradox-why-cheaper-inference-wont-save-you/finops/]
* Apple Core AI at WWDC26 (Let's Data Science) [https://letsdatascience.com/news]

The AI Brief from the YPO Technology Network is a daily executive briefing on the AI developments that matter to business leaders. Hosted by Stephen Forte.
92 episodes