Talking Machines by SU PARK

Podcast by Su Park

English

Technology and science

Join Su Park as she invites various guests to unpack the hottest Artificial Intelligence papers off the press. Each episode dives into the newest discoveries in AI and the sci-fi-slowly-becoming-our-reality era we’re living in.

All episodes

9 episodes

LLM as a Judge: Evaluating AI with AI

In this episode of "Talking Machines by Su Park," we explore "LLM-as-a-Judge": using large language models to provide scalable assessments across various domains. As AI continues to evolve, understanding how these models can bridge the gap between human insight and algorithmic efficiency becomes increasingly significant. The discussion highlights the growing trend of using LLMs not only to evaluate other AI systems but also to improve the evaluation process itself, bringing consistency to an area that often suffers from human bias and variability.

Key insights from the conversation include the potential for LLMs to combine the strengths of expert evaluation with the speed and scalability of automated assessment. The episode also covers the challenges of building reliable LLM-as-a-Judge systems, emphasizing the need to address biases and ensure consistent evaluations. Together, these points suggest how integrating LLMs into evaluation could lead to more effective and nuanced assessments.

"A Survey on LLM-as-a-Judge": https://arxiv.org/abs/2411.15594

April 19, 2025 - 19 min

How to Pick the Best Pretraining Data

In this episode of "Talking Machines by Su Park," the hosts explore the critical topic of selecting pretraining datasets for large language models, a decision that significantly affects model performance and cost-efficiency. The discussion centers on a recent paper from the Allen Institute for AI, which introduces an approach to choosing datasets without extensive computational resources, addressing a key challenge in AI research.

The episode highlights two major insights from the paper. First, the proposed suite of models, known as DataDecide, lets researchers use smaller-scale experiments to predict which datasets will yield the best results for larger models. This method has been shown to achieve approximately 80% accuracy in predicting which datasets perform best at the larger scale, reducing the need for costly trial and error. Second, the research reveals which benchmarks correlate with high performance, offering guidance for future dataset selection in AI training.

"DataDecide: How to Predict Best Pretraining Data with Small Experiments" by Allen Institute for AI: https://arxiv.org/abs/2504.11393

April 18, 2025 - 17 min

How AI Learns Mid-Conversation

In this episode of "Talking Machines by Su Park," the discussion centers on the innovative concept of the Dynamic Cheatsheet (DC) for language models. This framework enhances the memory capabilities of AI systems during inference, enabling them to retain and apply insights from previous interactions. The significance of this development lies in its potential to transform how language models operate, moving away from treating each query as a standalone task to a more integrated approach that can lead to improved efficiency and problem-solving capabilities.

Key insights from the conversation include the remarkable performance improvements observed with the implementation of DC. For instance, the accuracy of Claude 3.5 Sonnet in algebraic tasks more than doubled as it retained relevant insights, while GPT-4o's success rate on the Game of 24 puzzle soared from 10% to 99% after leveraging a reusable Python-based solution. This episode highlights how effective memory structuring in AI can enhance its ability to tackle similar challenges, akin to having a toolbox of solutions readily available for diverse problems.

"Dynamic Cheatsheet: Test-Time Learning with Adaptive Memory": https://arxiv.org/abs/2504.07952

April 16, 2025 - 17 min

Alone Together: The Emotional Cost of Chatting with AI

In this episode of "Talking Machines by Su Park," the focus is on the emotional impact of chatbot interactions on mental health and social dynamics. The significance of this discussion lies in understanding how technology, specifically AI-driven chatbots, can influence feelings of loneliness and social connection in users, a topic that has become increasingly relevant in our digitally connected yet often isolated lives.

Key insights from the episode reveal that while chatbots can initially alleviate feelings of loneliness, excessive interaction, particularly with voice-based bots, can paradoxically lead to heightened loneliness and emotional dependence. The researchers conducted a comprehensive four-week study involving nearly a thousand participants, analyzing over 300,000 messages to assess how different types of conversations, especially personal versus non-personal topics, affect psychosocial outcomes. This nuanced understanding underscores the complex relationship between human emotions and AI interactions.

"How AI and Human Behaviors Shape Psychosocial Effects of Chatbot Use: A Longitudinal Randomized Controlled Study" by MIT Media Lab & OpenAI: https://www.media.mit.edu/publications/how-ai-and-human-behaviors-shape-psychosocial-effects-of-chatbot-use-a-longitudinal-controlled-study/

April 10, 2025 - 19 min

Tom, Jerry, and the Neural Net: AI’s Leap in Video Storytelling

In this episode of "Talking Machines by Su Park," the hosts explore a paper on generating one-minute videos using Test-Time Training (TTT) layers. The topic is significant because it addresses a limitation of current video generation models, which typically produce only short clips, often around 20 seconds. By leveraging TTT layers, the researchers aim to increase both the length and the narrative complexity of generated videos, demonstrating their method on Tom and Jerry cartoons.

Key insights from the discussion include the use of TTT layers to make hidden states more expressive: the hidden states of these layers can themselves be neural networks. This leads to a notable improvement in the coherence of the generated stories, with the researchers reporting a lead of 34 Elo points over baseline methods in a human evaluation. The work suggests a more advanced capability for AI in video generation, paving the way for richer and more complex visual storytelling.

"One-Minute Video Generation with Test-Time Training" by NVIDIA: https://arxiv.org/abs/2504.05298

April 9, 2025 - 23 min