The Nebius for Startups Podcast
Tavily hit one million users without a single outbound cold email. Not because of a viral moment, but because the founder understood earlier than most that the unit economics of agentic search are fundamentally different. When you’re returning context to an AI agent rather than links to a human, token waste, latency, and information density become the variables that determine whether your product survives in production. Tavily built its architecture around that constraint from day one, and developers noticed. After being mentioned in 200+ research papers and 1,000+ YouTube videos, Fortune 500 enterprises were already running Tavily internally before a sales conversation ever happened.

In this episode, Markiesha Patrice (Head of Startup Platform & Community at Nebius) sits down with Rotem Weiss, Founder and CEO of Tavily, to get into the technical and strategic decisions behind building the search layer for the agentic web. Rotem breaks down why optimizing for snippet retrieval over link retrieval changes the entire architecture of a search product, how Tavily's prediction engine pre-processes likely queries to cut latency without sacrificing accuracy, and why he believes the most important GTM decision an AI infrastructure company can make in 2026 is getting integrated into the coding assistants autonomously selecting tools on behalf of engineers.

0:00 - Intro teaser: the buyer isn't a human anymore
0:25 - Guest intro & what Tavily does
0:47 - Rotem's origin story & the first check
1:40 - The real-time data gap & launching GPT Researcher
4:30 - What is agentic search? Snippets, not links
6:55 - The prediction engine that kills latency
7:55 - Why 2026 is the year of token optimization
10:05 - Inside the Nebius acquisition
11:25 - The team & the "second CEO"
12:45 - "The buyer is the AI assistant": winning coding tools
14:30 - GPUs as rocket fuel
16:30 - One million users, zero cold emails
19:20 - "I'm the worst at everything in the company"
20:30 - Israel vs. the US founder mindset

What You'll Learn
- Why returning the right snippet rather than the right link changes the entire architecture of a search product.
- The prediction engine Tavily uses to pre-process likely queries at the website level, reducing processing overhead without sacrificing accuracy.
- Why coding assistants autonomously selecting tools in a stack are becoming the real GTM lever for AI infrastructure companies.
- The open-source to enterprise conversion motion Rotem actually used.
- How the migration from large closed-source models to fine-tuned open-source models changes the token economics for AI companies trying to get to production.

Learn more at: https://nebius.com/startups/podcast
19 episodes