Breakthrough AI Operators
Open source AI models are now just 3-5% behind the best closed source models on benchmarks, which is about six months of lag, not five years. If you're building an AI infrastructure company on the assumption that OpenAI or Anthropic will maintain a permanent lead, your moat is disappearing faster than your revenue projections assume.
Most founders at the $3M–$20M stage are still over-indexed on model selection and under-indexed on inference economics. They're obsessed with training costs and model access, but the real cost explosion is coming from running models at scale. A model that trains for a year but only runs for a month is a terrible investment, and yet that's how most AI budgets are still structured.
Nikola Borisov spent a decade building backend infrastructure for a chat app with 200 million monthly users before launching Deep Infra. He's CEO and co-founder of the company, an AI inference platform that owns its own GPU clusters and is one of the largest token suppliers on OpenRouter.
The episode centers on two bets Nikola made that most infrastructure founders won't: first, that open source models would catch up to closed source models faster than anyone expected, and second, that inference, not training, would dominate AI budgets within five years. Both bets are paying off. The gap has narrowed to 3-5%, and as Deep Infra lowers costs, customers aren't just consuming more tokens: they're jumping to better, bigger models.
The conversation also surfaces a less obvious pattern: the economics of AI inference mirror the economics of CDNs more than they mirror cloud compute. Walmart and Target don't care if their images are served from the same CDN; it's just an efficient way to deliver content. Likewise, Deep Infra runs the same model for multiple companies in parallel on the same GPUs, and those companies don't care either. It's neutral infrastructure that scales horizontally without requiring every company to build its own.
Roland sees this pattern constantly in his advisory work with SaaS companies scaling from $1M to $50M: founders model their AI spend around closed source API access and per-token pricing, but they don't account for what happens when open source closes the gap and inference costs drop 20x. The companies that move early to open source inference infrastructure will have a cost structure their competitors can't match in 18 months, and cost structure at scale, not model access, is the actual competitive wedge.
Key Moments:
3:01 - Why the gap between closed source and open source models has narrowed to 3-5%, and what that percentage actually measures
5:00 - Inference explained for a five-year-old: training is school, running the model is work
6:41 - Why Anthropic's compute conflict (training vs. serving customers) reveals the real economic wedge
10:39 - The CDN analogy: why Walmart and Target don't care if their requests run on the same infrastructure
16:12 - How lowering costs changes customer behavior: they jump to bigger models, not just more tokens
18:51 - Why Nikola believes inference will dominate company budgets in 5-10 years
20:29 - What a math Olympiad medalist and programming competitor learned about certainty that still drives how he builds
22:31 - Nikola's advice to younger founders: focus on what's most important today, not what's interesting
---
If navigating AI infrastructure economics (balancing model access, inference costs, and long-term vendor lock-in) is something you're working through right now, the Midstage Accelerator helps SaaS founders at the $1M–$50M stage model these decisions with real unit economics and stage-specific benchmarks. mdstg.ac/drag-erase
#AIInfrastructure #OpenSourceAI #InferenceEconomics #SaaSScaling #ScalingWithoutBreaking