Convo AI World
In this episode, host Rishi Ahluwalia interviews Ankur Edkie, CEO of Murf AI, about the hidden complexities of scaling enterprise voice AI. Ankur traces the journey from pre-ChatGPT text-to-speech to building highly efficient, human-like voice systems. He unpacks why voice demos feel magical while production deployments often fail, pointing to the gap between lab settings and the variability of real-world acoustics and devices. The conversation covers the challenges of the cascading stack (ASR, LLM, TTS), why turn-taking and latency consistency matter more than raw speed, and Murf’s "compute acquisition" approach with the Falcon architecture, which drives down costs and enables global data residency. Ankur concludes with a hard lesson for builders: enterprises are buying holistic outcomes and trust, not isolated API calls. Check out video episodes and subscribe to the Convo AI Newsletter at convoai.world
27 episodes