Talking Machines by SU PARK

Podcast by Su Park

English

Technology & science

About Talking Machines by SU PARK

Join Su Park as she invites various guests to unpack the hottest Artificial Intelligence papers off the press. Each episode dives into the newest discoveries in AI and the sci-fi-slowly-becoming-our-reality era we’re living in.

All episodes

9 episodes

LLM as a Judge: Evaluating AI with AI

In this episode of "Talking Machines by Su Park," we explore "LLM-as-a-Judge": using large language models to provide scalable assessments across various domains. LLMs are increasingly used not only to evaluate other AI systems but also to improve the evaluation process itself, bringing consistency to an area that often suffers from human bias and variability.

Key insights from the conversation include the potential for LLMs to combine the strengths of expert evaluation with the speed and scalability of automated assessment. The episode also covers the challenges of building reliable LLM-as-a-Judge systems, emphasizing the need to address the judges' own biases and ensure consistent evaluations.

"A Survey on LLM-as-a-Judge": https://arxiv.org/abs/2411.15594

19 Apr 2025 - 19 min

How to Pick the Best Pretraining Data

In this episode of "Talking Machines by Su Park," the hosts explore how to select pretraining datasets for Large Language Models, a decision that significantly affects model performance and cost-efficiency. The discussion centers on a recent paper from the Allen Institute for AI, which introduces an approach to choosing datasets without extensive computational resources.

The episode highlights two major insights from the paper. First, the proposed suite of models, known as DataDecide, lets researchers use small-scale experiments to predict which datasets will yield the best results for larger models. This method achieves approximately 80% accuracy in predicting which dataset will perform better at the larger scale, reducing the need for costly trial and error. Second, the research reveals which benchmarks correlate with high performance, offering guidance for future dataset selection in AI training.

"DataDecide: How to Predict Best Pretraining Data with Small Experiments" by Allen Institute for AI: https://arxiv.org/abs/2504.11393

18 Apr 2025 - 17 min
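One way to read the roughly 80% figure in the DataDecide episode above is as a pairwise agreement rate: for each pair of candidate datasets, does the small-scale experiment pick the same winner as the large-scale run? The sketch below assumes that reading. The function name `decision_accuracy` and all scores are hypothetical and made up for illustration; they are not taken from the paper.

```python
from itertools import combinations


def decision_accuracy(small_scores, large_scores):
    """Fraction of dataset pairs whose small-scale winner matches the large-scale winner.

    Both arguments map a dataset name to a benchmark score (higher is better).
    Pairs that tie at the large scale are skipped, since they have no winner.
    """
    agree = 0
    total = 0
    for a, b in combinations(sorted(small_scores), 2):
        small_diff = small_scores[a] - small_scores[b]
        large_diff = large_scores[a] - large_scores[b]
        if large_diff == 0:
            continue
        total += 1
        # Same sign means both scales prefer the same dataset.
        if small_diff * large_diff > 0:
            agree += 1
    return agree / total if total else float("nan")


# Hypothetical scores for four candidate pretraining datasets.
small = {"A": 0.40, "B": 0.42, "C": 0.38, "D": 0.45}
large = {"A": 0.55, "B": 0.53, "C": 0.50, "D": 0.60}

accuracy = decision_accuracy(small, large)
print(f"{accuracy:.2f}")
```

With these made-up numbers, the small and large runs disagree only on the pair A versus B, so five of the six pairs match.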

How AI Learns Mid-Conversation

In this episode of "Talking Machines by Su Park," the discussion centers on the Dynamic Cheatsheet (DC) for language models. This framework gives AI systems a memory during inference, enabling them to retain and apply insights from previous interactions. Instead of treating each query as a standalone task, a model can reuse what it has already worked out, which can improve both efficiency and problem-solving ability.

Key insights from the conversation include the performance improvements observed with DC. For instance, the accuracy of Claude 3.5 Sonnet on algebraic tasks more than doubled as it retained relevant insights, while GPT-4o's success rate on the Game of 24 puzzle rose from 10% to 99% after it began reusing a Python-based solution. The episode highlights how well-structured memory helps a model tackle similar challenges, akin to having a toolbox of solutions ready for recurring problems.

"Dynamic Cheatsheet: Test-Time Learning with Adaptive Memory": https://arxiv.org/abs/2504.07952

16 Apr 2025 - 17 min
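To make the Game of 24 result in the episode above concrete: the puzzle asks whether four numbers can be combined with +, -, * and / to reach 24. A brute-force solver is short enough to keep around and reuse, which is the kind of artifact the description refers to. The sketch below is an illustrative solver written for this listing, not the code GPT-4o produced in the paper; the names `solve_24` and `_search` are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def solve_24(numbers, target=24):
    """Return an expression string that evaluates to target, or None if none exists."""
    # Fractions keep division exact, so 8 / (3 - 8/3) is recognised as 24.
    items = [(Fraction(n), str(n)) for n in numbers]
    return _search(items, Fraction(target))


def _search(items, target):
    # Base case: one value left, so check whether it hits the target.
    if len(items) == 1:
        value, expr = items[0]
        return expr if value == target else None
    # Pick any two values, combine them in every possible way, and recurse.
    for i, j in combinations(range(len(items)), 2):
        (a, ea), (b, eb) = items[i], items[j]
        rest = [items[k] for k in range(len(items)) if k not in (i, j)]
        candidates = [
            (a + b, f"({ea} + {eb})"),
            (a * b, f"({ea} * {eb})"),
            (a - b, f"({ea} - {eb})"),
            (b - a, f"({eb} - {ea})"),
        ]
        if b != 0:
            candidates.append((a / b, f"({ea} / {eb})"))
        if a != 0:
            candidates.append((b / a, f"({eb} / {ea})"))
        for value, expr in candidates:
            found = _search(rest + [(value, expr)], target)
            if found is not None:
                return found
    return None


print(solve_24([4, 7, 8, 8]))   # prints some expression equal to 24
print(solve_24([1, 1, 1, 1]))   # prints None: no combination reaches 24
```

Because the search is exhaustive, the same function answers every instance of the puzzle, so storing it once removes the need to reason through each new set of numbers from scratch.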

Alone Together: The Emotional Cost of Chatting with AI

In this episode of "Talking Machines by Su Park," the focus is on the emotional impact of chatbot interactions on mental health and social dynamics. The discussion examines how AI-driven chatbots can influence users' feelings of loneliness and social connection, a topic that has become increasingly relevant in our digitally connected yet often isolated lives.

Key insights from the episode reveal that while chatbots can initially alleviate feelings of loneliness, excessive interaction, particularly with voice-based bots, can paradoxically lead to heightened loneliness and emotional dependence. The researchers conducted a four-week study involving nearly a thousand participants, analyzing over 300,000 messages to assess how different types of conversations, especially personal versus non-personal topics, affect psychosocial outcomes. The findings underscore the complex relationship between human emotions and AI interactions.

"How AI and Human Behaviors Shape Psychosocial Effects of Chatbot Use: A Longitudinal Randomized Controlled Study" by MIT Media Lab & OpenAI: https://www.media.mit.edu/publications/how-ai-and-human-behaviors-shape-psychosocial-effects-of-chatbot-use-a-longitudinal-controlled-study/

10 Apr 2025 - 19 min

Tom, Jerry, and the Neural Net: AI’s Leap in Video Storytelling

In this episode of "Talking Machines by Su Park," the hosts explore a paper on generating one-minute videos using Test-Time Training (TTT) layers. The work addresses a limitation of current video generation models, which typically produce only short clips, often around 20 seconds. By adding TTT layers, the researchers aim to increase both the length and the narrative complexity of generated videos, showcasing their method on Tom and Jerry cartoons.

Key insights from the discussion include the use of TTT layers to make hidden states more expressive: the hidden state can itself be a neural network that is updated while the sequence is processed. This leads to a notable improvement in the coherence of the generated stories, with the researchers reporting a lead of 34 Elo points over baseline methods in a human evaluation. The work points toward richer and more complex visual storytelling with AI-generated video.

"One-Minute Video Generation with Test-Time Training" by NVIDIA: https://arxiv.org/abs/2504.05298

9 Apr 2025 - 23 min
