How Tavily Got to 1M Users Without Sending One Cold Email

22 min · June 23, 2026

Description

Tavily hit one million users without a single outbound cold email. Not because of a viral moment, but because the founder understood earlier than most that the unit economics of agentic search are fundamentally different. When you’re returning context to an AI agent rather than links to a human, token waste, latency, and information density become the variables that determine whether your product survives in production. Tavily built its architecture around that constraint from day one, and developers noticed. After Tavily was mentioned in 200+ research papers and 1,000+ YouTube videos, Fortune 500 enterprises were already running it internally before a sales conversation ever happened.

In this episode, Markiesha Patrice (Head of Startup Platform & Community at Nebius) sits down with Rotem Weiss, Founder and CEO of Tavily, to get into the technical and strategic decisions behind building the search layer for the agentic web. Rotem breaks down why optimizing for snippet retrieval over link retrieval changes the entire architecture of a search product, how Tavily's prediction engine pre-processes likely queries to cut latency without sacrificing accuracy, and why he believes the most important GTM decision an AI infrastructure company can make in 2026 is getting integrated into the coding assistants autonomously selecting tools on behalf of engineers.

0:00 - Intro teaser: the buyer isn't a human anymore
0:25 - Guest intro & what Tavily does
0:47 - Rotem's origin story & the first check
1:40 - The real-time data gap & launching GPT Researcher
4:30 - What is agentic search? Snippets, not links
6:55 - The prediction engine that kills latency
7:55 - Why 2026 is the year of token optimization
10:05 - Inside the Nebius acquisition
11:25 - The team & the "second CEO"
12:45 - "The buyer is the AI assistant": winning coding tools
14:30 - GPUs as rocket fuel
16:30 - One million users, zero cold emails
19:20 - "I'm the worst at everything in the company"
20:30 - Israel vs. the US founder mindset

What You'll Learn:
- Why returning the right snippet rather than the right link changes the entire architecture of a search product.
- The prediction engine Tavily uses to pre-process likely queries at the website level, reducing processing overhead without sacrificing accuracy.
- Why coding assistants autonomously selecting tools in a stack are becoming the real GTM lever for AI infrastructure companies.
- The open-source to enterprise conversion motion Rotem actually used.
- How the migration from large closed-source models to fine-tuned open-source models changes the token economics for AI companies trying to get to production.

Learn more at: https://nebius.com/startups/podcast

All episodes

22 episodes

The End of Monolithic AI and What Comes Next

What if AI’s next breakthrough isn’t a larger model, but a different way of thinking about intelligence? In this episode, Markiesha Patrice, Head of Startup Platform & Community at Nebius, sits down with Sudip Roy, co-founder and CTO of Adaption. After helping build AI infrastructure during the scaling era at Google and Cohere, Sudip now argues that the industry’s future won’t be defined by ever-larger models alone. He explains why leading foundation models are converging, why enterprises should think in terms of AI systems instead of individual models, and how continuous adaptation could become the next major shift in artificial intelligence.

0:00 - Intro teaser: The days of monolithic AI are over
0:47 - Challenging scaling laws: how Adaption Labs started
3:18 - The three pillars: data, continual learning, interfaces
6:52 - Adaptation is all you need: the future of AI models
8:40 - Proportional compute and the AI energy gap
11:00 - Solving the 5% reliability gap for enterprises
11:57 - Advice for builders: take contrarian positions
14:36 - Adaptive Data: synthetic data in 240+ languages
15:19 - Outro

What You'll Learn:
- Why leading foundation models are beginning to converge and what that means for the future of AI.
- Why thinking about tasks instead of models changes how enterprises should build AI systems.
- Why enterprise data remains AI’s biggest untapped resource.
- How continuous learning closes the 5% enterprise reliability gap caused by domain-specific organizational data.
- Why challenging industry consensus can create breakthrough startup opportunities.

Learn more at: https://nebius.com/startups/podcast

July 14, 2026 · 18 min

How to Build a Consumer AI App That 10 Million People Can't Put Down

Henry Wang built a consumer app that 10 million people use for 150 minutes a day. For most of that time, the assumption in the industry was that you needed a frontier model to deliver a great AI experience. Henry found a different answer.

In this episode, Joshua Liss, Head of Media and Entertainment at Nebius, sits down with Henry (Lifan) Wang, Co-founder of Kaon Labs, to get into how you actually scale a consumer AI product. Not just find product-market fit, but build the infrastructure that lets you serve tens of millions of users, test 100 models a week, and generate a trillion tokens a day without it costing you everything. Henry explains why personalization at consumer scale isn't a recommendation system anymore, how Kaon Labs replaced two-week A/B tests with Elo ranking that resolves in two hours, why entertainment is the right beachhead for consumer AI right now, and what in-experience advertising could look like. He also shares his contrarian take on what actually separates the consumer AI companies that break out from the ones that spike and die.

0:00 - Intro teaser: 10M users, 115 minutes a day & 1 trillion tokens
0:20 - How Kaon Labs built AI infrastructure for 10 million users
3:02 - Why AI roleplay is changing entertainment forever
5:10 - Hyper-personalization vs. recommendation algorithms
8:18 - Why scaling consumer AI is harder than enterprise AI
11:33 - Building and evaluating 100 AI models every week
16:08 - Monetizing AI with subscriptions & immersive advertising
19:43 - Rapid fire: The AI-native leaders every startup needs
22:08 - Lessons from building a 10 million user consumer AI company
25:22 - Outro

What You'll Learn:
- Why testing 100 models a week against real users is the core of Kaon Labs' competitive advantage.
- How Elo ranking replaced A/B testing: evaluating models in hours, not weeks.
- Why AI replaces recommendation systems with true personalization.
- Why entertainment is the ideal proving ground for consumer AI.
- Product-market fit matters, but scale is what separates breakouts from blips.
- What in-experience advertising looks like when it lives inside the story rather than interrupting it.
- The three things to look for in every key hire.

Learn more at: https://nebius.com/startups/podcast

July 7, 2026 · 25 min

How to Spot a $25 Billion Company Before It Has Revenue

Brian Zhan doesn't want your business plan. In fact, if you hand him one at the seed stage, that might raise some red flags. In this episode, Anu Maheshwari, Head of Venture Ecosystem at Nebius, sits down with Brian Zhan, partner at Striker Venture Partners. They unpack what it actually means to back people, not products, at the frontier of AI. From reinforcement learning's generalization advantages to recursive self-improvement loops in chip design, Brian makes the case that the VC consensus is missing the most important early bets by waiting for revenue.

Brian has been on the boards of Reflection, Skilled, and Periodic Labs. With a personal history in robotics research dating back to 2014, he maps out where the next trillion-dollar companies will emerge, why most VCs are too focused on revenue timelines to see them, and how Striker approaches seed investing with 10% of fund concentration per bet.

0:00 - Intro teaser: Why the best founders don't need a business plan
0:27 - Why Striker invests before there's a business plan
5:04 - Is AI in a bubble? Separating hype from opportunity
7:45 - How Brian identifies world-changing founders
9:24 - Reinforcement learning and the path to superintelligence
14:05 - Robotics, world models, and the future of physical AI
17:55 - Why most VCs think too small
20:18 - What Brian looks for in a founder pitch
23:37 - The OpenAI playbook & why compute partnerships matter
26:05 - Outro

What You'll Learn:
- Why a detailed business plan at seed is unnecessary.
- The precision behind frontier AI investing and why it's fundamentally different from SaaS.
- Why reinforcement learning underpins recursive self-improvement and why most investors don't actually understand it.
- How world models and simulation will accelerate robotics toward its ChatGPT moment.
- Solving energy supply by unlocking insight for everything else.
- How Striker evaluates founder-market fit in practice.

June 30, 2026 · 26 min

How Tavily Got to 1M Users Without Sending One Cold Email

June 23, 2026 · 22 min

Real-Time World Models, Explained by a Founder Building Them

In October 2024, Alberto Taiuti was trying to solve a 3D asset problem and ran into a wall. The labeled data didn't exist. What he realized next reshaped the trajectory of his company. In this episode, Josh Liss (Head of Media & Entertainment at Nebius) sits down with Alberto Taiuti, CEO and Co-founder of Reactor, the developer platform for world models. They unpack what real-time world models actually are, why they're fundamentally different from the AI video tools most people have encountered, and why he believes the era of prompting is already winding down.

Drawing on his background at Apple, where he worked on the Vision Pro, Alberto makes the case that interactive video changes what a developer platform has to be, why video models already understand 3D space (even though no one trained them to), and how shows, ads, and even software interfaces could be generated live, per user, in real time.

0:00 - Intro teaser: writing code, not prompts
0:25 - Guest intro & Reactor overview
0:47 - What is a world model?
2:19 - World models are regressive like LLMs
5:21 - Alberto's background & origin story
8:08 - The "aha" moment that changed everything
9:58 - How world models will impact media & entertainment
12:48 - "We only hire unreasonable people": hiring philosophy
13:54 - Building in stealth: lessons from Apple
17:48 - GTA San Andreas shaped his entire career

What You'll Learn:
- What a world model is, in plain language, and how it differs from standard AI video generation.
- Alberto's philosophy for building in stealth mode and how that impacts his team.
- Why Alberto believes interactive, real-time video brings code back to the center of AI development.
- The insight about video models and 3D that became the foundation of Reactor.
- Reactor's hiring philosophy and why "unreasonable" is the highest compliment at the company.
- Alberto's three-year prediction for how applications themselves get built.

June 16, 2026 · 20 min
