Convo AI World

Podcast by Agora

English

News and politics

More about Convo AI World

Explore voice-first conversational AI through honest conversations with practitioners. Hear from AI builders, infra engineers, product strategists, and more for the latest insights on what it takes to build best-in-class conversational AI experiences. Check out video episodes and subscribe to the Convo AI Newsletter at podcast.convoai.world

All episodes

28 episodes

How to Catch AI failures

In this episode, host Hermes Frangoudis interviews Faraz Siddiqi, co-founder and CTO of Bluejay, about testing and monitoring voice AI agents. Faraz explains how Bluejay uses "digital humans," synthetic customers with accents, background noise, and even emotions, to run hundreds of parallel simulations. What used to take 10–12 hours of manual calling to test 200 menu items now takes just five minutes, letting you grab coffee while Bluejay finds failures before real users hit them. He shares why latency over 1.5 seconds breaks trust, how a transcription error turned "Dr. Pham" into "Dr. Fan," and why founders should listen to every call manually before automating. A must-watch for anyone building production-grade voice agents. Check out video episodes and subscribe to the Convo AI Newsletter at convoai.world

May 28, 2026 - 44 min

Speech to Text Masterclass: The Tech Behind the Modern Voice Pipeline

In this special montage episode of the Convo AI World Podcast, host Hermes Frangoudis brings together leading researchers and founders from across the speech‑to‑text space to unpack the voice pipeline from the ground up. The conversation covers how cascading architectures stack up against real‑time speech‑to‑speech systems, why Voice Activity Detection acts as the critical traffic controller, and how enterprises can eliminate model pathologies like hallucinations and omissions through modular “Lego block” integration. Guests from Deepgram, Agora, Soniox, and Rime share hard‑won lessons on achieving near‑native accuracy across 60 languages with self‑supervised learning, taming unpredictable pronunciations in LLM‑driven agents, and why truly human‑like emotional understanding is still around the corner. The episode confronts the persistent myth that speech recognition is a solved problem, spotlighting the long tail of accents, rare words, and noisy real‑world conditions that still break most systems, and makes the case that for regulated, high‑stakes industries the auditable text backbone of cascading pipelines remains essential even as speech‑to‑speech models race toward a more natural future. Check out video episodes and subscribe to the Convo AI Newsletter at convoai.world

May 14, 2026 - 57 min

The Hidden Complexities of Enterprise Voice

In this episode, host Rishi Ahluwalia interviews Ankur Edkie, CEO of Murf AI, about the hidden complexities of scaling enterprise voice AI. Ankur discusses the journey from pre-ChatGPT text-to-speech to building hyper-efficient, human-like voice systems. He unpacks why voice demos feel magical but production deployments often fail, highlighting the critical gap between lab settings and the variability of real-world acoustics and devices. The conversation covers the challenges of the cascading stack (ASR, LLM, TTS), the importance of turn-taking and latency consistency over raw speed, and Murf’s "compute acquisition" approach with the Falcon architecture that drives down costs and enables global data residency. Ankur concludes with a hard lesson for builders: enterprises are buying holistic outcomes and trust, not just isolated API calls. Check out video episodes and subscribe to the Convo AI Newsletter at convoai.world

April 16, 2026 - 43 min

Why Developers Are Abandoning GPU Farms for APIs

In this episode, host Derek interviews Zeyi, Founder & CEO of WaveSpeedAI, about simplifying AI media generation. Zeyi explains how WaveSpeed provides a unified, high-concurrency API that lets developers easily switch between models like Flux and Wan without managing complex GPU infrastructure. He highlights unique offerings like the 10-minute lip-sync tool InfiniteTalk and video extension, discusses regional preferences in AI-generated faces, and shares cost-saving strategies like low-res generation with upscaling. The episode wraps with Zeyi's advice for AI builders: leverage AI coding tools, structure projects wisely, and use API platforms instead of optimizing models from scratch. Check out video episodes and subscribe to the Convo AI Newsletter at convoai.world

April 9, 2026 - 29 min