
Women in AI Research (WiAIR)

Podcast by WiAIR

English

Technology & Science


About Women in AI Research (WiAIR)

Women in AI Research (WiAIR) is a podcast dedicated to celebrating the remarkable contributions of female AI researchers from around the globe. Our mission is to challenge the prevailing perception that AI research is predominantly male-driven. Our goal is to empower early career researchers, especially women, to pursue their passion for AI and make an impact in this rapidly growing field. You will learn from women at different career stages, stay updated on the latest research and advancements, and hear powerful stories of overcoming obstacles and breaking stereotypes.

All episodes

30 episodes


100% Jailbreak Success? The Hard Truth About AI Safety, with Dr. Saadia Gabriel (Part 2)

What actually happens when AI systems fail in the real world? In this final part of our conversation with Saadia Gabriel (UCLA), we unpack one of the most urgent challenges in modern AI: why even the most advanced models remain vulnerable to manipulation - and what that means for safety, fairness, and society. From multi-turn jailbreaking attacks with near 100% success rates to misinformation shaping human beliefs, this conversation goes beyond surface-level concerns and dives into how harms actually emerge in deployed systems.

We explore:
  • Why current guardrails are not enough
  • How realistic attack scenarios differ from academic benchmarks
  • The connection between model vulnerabilities and societal harm
  • What AI can (and cannot) do about misinformation and persuasion
  • The open research problems that still don’t have solutions

Resources & Links:
  • Generative AI in the Era of 'Alternative Facts': https://aclanthology.org/2024.emnlp-main.487/
  • ModelCitizens: Representing Community Voices in Online Safety: https://aclanthology.org/2025.emnlp-main.1571/
  • Translation as a Scalable Proxy for Multilingual Evaluation: https://arxiv.org/abs/2601.11778

Connect with Dr. Saadia Gabriel:
  • https://x.com/GabrielSaadia
  • https://bsky.app/profile/skgabrie.bsky.social

Apr 17, 2026 - 33 min

From Hate Speech to Best Paper: Building Safer AI Systems, with Dr. Saadia Gabriel (Part 1)

What does it mean to build AI systems we can actually trust? In this first part of our conversation with Saadia Gabriel (UCLA), we explore the deeply personal and technical journey behind her work on AI safety, misuse, and responsible NLP. From experiencing targeted hate speech firsthand to receiving a best paper nomination, Saadia shares how her lived experience shaped her research, and why language models must be designed with both capability and risk in mind.

🧠 In this episode, we cover:
  • How personal experiences influence AI research directions
  • The intersection of NLP, security, and privacy
  • Why LLMs can be both powerful and dangerous
  • What it means to build trustworthy AI systems
  • Lessons from working across multiple research paradigms
  • How to pursue high-impact research as a PhD or early-career scientist

Resources & Links:
  • X-Teaming: Multi-Turn Jailbreaks and Defenses with Adaptive Multi-Agents: https://arxiv.org/abs/2504.13203

Connect with Dr. Saadia Gabriel:
  • https://x.com/GabrielSaadia
  • https://bsky.app/profile/skgabrie.bsky.social

Apr 15, 2026 - 29 min

EACL 2026: LLMs Can Hear… But Can They Reason? A New Benchmark for Audio Intelligence

What does it actually mean for a model to understand audio?

Paper: https://arxiv.org/abs/2601.19673

In this episode, I talk with Iwona Christop, a PhD student at Adam Mickiewicz University, about her recent EACL paper introducing ART (Audio Reasoning Tasks), a new benchmark designed to evaluate whether multimodal LLMs can truly reason over audio, not just transcribe or classify it. Most existing benchmarks test audio skills in isolation (like ASR or classification). But real-world intelligence requires something deeper: combining signals, comparing sounds, tracking context, and making decisions.

This work takes a different approach:
  • No text-only shortcuts: tasks can’t be solved via transcription alone
  • Reasoning-first design: models must combine multiple audio cues
  • No expert knowledge required: anyone can verify correctness

We also dive into the diverse task design, including:
  • Audio arithmetic (counting and comparing sounds)
  • Cross-recording speaker & language identification
  • Sound-based reasoning (e.g., inferring properties from audio)
  • Speech feature comparison (accents, variations)
  • Multimodal reasoning across text and sound

The dataset includes 9 tasks, 9,000 samples, and 30+ hours of audio, all generated in a scalable way using templates and TTS.

👉 If you care about multimodal reasoning, evaluation, or the limits of current LLM capabilities, this conversation is for you.

Iwona Christop: https://www.linkedin.com/in/iwona-christop/

👍 Like & subscribe for more deep dives into cutting-edge AI research
🔔 New episodes from EACL 2026 coming soon
#WiAIR #EACL2026

Apr 13, 2026 - 18 min

EACL 2026: LLMs Can Call Tools -- But Can They Understand Them?

LLM-based agents are everywhere, but most research focuses on just one step: getting the model to call the right tool. What happens after that?

Paper: https://arxiv.org/abs/2510.15955

In this talk, Kiran Kate (IBM Research) presents new findings from their EACL 2026 paper on a largely overlooked problem: 👉 Can LLMs actually understand and use the outputs returned by tools? As tool-augmented systems become more complex, this question becomes critical. The work dives into how current models handle non-trivial, real-world tool responses, and where they break down.

💡 Key ideas covered:
  • Why tool calling is only half the story in LLM agents
  • The challenge of processing complex tool outputs
  • Failure modes in current LLM-based systems
  • What this means for building robust, real-world AI agents

This talk is especially relevant if you're working on:
  • LLM agents and tool use
  • Evaluation of LLM capabilities
  • Real-world deployment of AI systems
  • Agentic workflows and reasoning pipelines

Kiran Kate:
  • https://www.linkedin.com/in/kiran-kate-8b98672/

👍 Like & subscribe for more deep dives into cutting-edge AI research
🔔 New episodes from EACL 2026 coming soon

Apr 12, 2026 - 22 min

EACL 2026: Reasoning Can Hurt LLM Safety?! Rethinking Accuracy in AI Systems

In this episode of #WiAIRpodcast, we dive into a subtle but critical question: Does adding reasoning actually make LLMs safer and more reliable?

Paper: https://arxiv.org/abs/2510.21049

Atoosa Chegini (University of Maryland, Apple) presents Reasoning's Razor (EACL 2026), where she and her collaborators examine how reasoning impacts high-stakes binary classification tasks, including safety filtering and hallucination detection. Their findings highlight an important nuance: while reasoning can improve overall accuracy, it may degrade performance at low false positive rates, which is exactly where real-world systems need to operate.

This conversation covers:
  • Why accuracy is a misleading metric for safety-critical LLM applications
  • The importance of evaluating models at fixed false positive rates (FPR)
  • How two models with identical accuracy can behave completely differently in deployment
  • The impact of "think-on" (with reasoning) vs "think-off" (no reasoning) settings
  • Practical implications for RLHF, SFT, and post-training pipelines

If you're working on any of the following, this discussion offers a perspective that is both technically grounded and immediately actionable:
  • LLM evaluation & reliability
  • AI safety or hallucination detection
  • Production deployment of language models

Atoosa:
  • https://www.linkedin.com/in/atoosa-chegini-6713741a3/
  • https://scholar.google.com/citations?user=5nY9tagAAAAJ&hl=en&oi=ao

👍 Like & subscribe for more deep dives into cutting-edge AI research
🔔 New episodes from EACL 2026 coming soon

Apr 10, 2026 - 21 min
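
The Reasoning's Razor episode above mentions that two models with identical accuracy can behave very differently at a fixed false positive rate. A minimal Python sketch of that idea follows. It is not the paper's code or data: the scores, the labels, and the helper names `accuracy` and `tpr_at_fpr` are all hypothetical and chosen only to illustrate the metric.

```python
def accuracy(scores, labels, threshold=0.5):
    """Fraction of examples classified correctly at a fixed score threshold."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def tpr_at_fpr(scores, labels, max_fpr):
    """Best true positive rate achievable while keeping FPR <= max_fpr."""
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    positives = [s for s, y in zip(scores, labels) if y == 1]
    best = 0.0
    # Candidate thresholds: every observed score, plus +inf (flag nothing).
    for t in sorted(set(scores)) + [float("inf")]:
        fpr = sum(s >= t for s in negatives) / len(negatives)
        if fpr <= max_fpr:
            tpr = sum(s >= t for s in positives) / len(positives)
            best = max(best, tpr)
    return best


# Hypothetical toy data: 5 benign examples (label 0), 5 harmful ones (label 1).
labels = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
# Model A's single false positive is only mildly confident (0.6).
scores_a = [0.1, 0.2, 0.3, 0.4, 0.6, 0.45, 0.7, 0.8, 0.9, 0.95]
# Model B's single false positive is extremely confident (0.99).
scores_b = [0.1, 0.2, 0.3, 0.4, 0.99, 0.45, 0.7, 0.8, 0.9, 0.95]

# With only 5 negatives, an FPR budget of 0.1 means zero false positives.
for name, scores in [("A", scores_a), ("B", scores_b)]:
    print(name, accuracy(scores, labels), tpr_at_fpr(scores, labels, 0.1))
```

On this toy data both models reach the same accuracy at the default threshold, yet under the strict false positive budget model A still catches most harmful examples while model B catches none, because its one confident mistake forces the threshold above every true positive.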