Women in AI Research (WiAIR)

Podcast by WiAIR

English

Technology and science

More about Women in AI Research (WiAIR)

Women in AI Research (WiAIR) is a podcast dedicated to celebrating the remarkable contributions of female AI researchers from around the globe. Our mission is to challenge the prevailing perception that AI research is predominantly male-driven. Our goal is to empower early career researchers, especially women, to pursue their passion for AI and make an impact in this rapidly growing field. You will learn from women at different career stages, stay updated on the latest research and advancements, and hear powerful stories of overcoming obstacles and breaking stereotypes.

All episodes

30 episodes

100% Jailbreak Success? The Hard Truth About AI Safety, with Dr. Saadia Gabriel (Part 2)

What actually happens when AI systems fail in the real world? In this final part of our conversation with Saadia Gabriel (UCLA), we unpack one of the most urgent challenges in modern AI: why even the most advanced models remain vulnerable to manipulation - and what that means for safety, fairness, and society. From multi-turn jailbreaking attacks with near 100% success rates to misinformation shaping human beliefs, this conversation goes beyond surface-level concerns and dives into how harms actually emerge in deployed systems.

We explore:
* Why current guardrails are not enough
* How realistic attack scenarios differ from academic benchmarks
* The connection between model vulnerabilities and societal harm
* What AI can (and cannot) do about misinformation and persuasion
* The open research problems that still don’t have solutions

Resources & Links:
* Generative AI in the Era of 'Alternative Facts' [https://aclanthology.org/2024.emnlp-main.487/]
* ModelCitizens: Representing Community Voices in Online Safety [https://aclanthology.org/2025.emnlp-main.1571/]
* Translation as a Scalable Proxy for Multilingual Evaluation [https://arxiv.org/abs/2601.11778]

Connect with Dr. Saadia Gabriel:
* https://x.com/GabrielSaadia
* https://bsky.app/profile/skgabrie.bsky.social

17 April 2026 - 33 min

From Hate Speech to Best Paper: Building Safer AI Systems, with Dr. Saadia Gabriel (Part 1)

What does it mean to build AI systems we can actually trust? In this first part of our conversation with Saadia Gabriel (UCLA), we explore the deeply personal and technical journey behind her work on AI safety, misuse, and responsible NLP. From experiencing targeted hate speech firsthand to receiving a best paper nomination, Saadia shares how her lived experience shaped her research, and why language models must be designed with both capability and risk in mind.

🧠 In this episode, we cover:
* How personal experiences influence AI research directions
* The intersection of NLP, security, and privacy
* Why LLMs can be both powerful and dangerous
* What it means to build trustworthy AI systems
* Lessons from working across multiple research paradigms
* How to pursue high-impact research as a PhD or early-career scientist

Resources & Links:
* X-Teaming: Multi-Turn Jailbreaks and Defenses with Adaptive Multi-Agents [https://arxiv.org/abs/2504.13203]

Connect with Dr. Saadia Gabriel:
* https://x.com/GabrielSaadia
* https://bsky.app/profile/skgabrie.bsky.social

15 April 2026 - 29 min

EACL 2026: LLMs Can Hear… But Can They Reason? A New Benchmark for Audio Intelligence

What does it actually mean for a model to understand audio?

Paper: https://arxiv.org/abs/2601.19673

In this episode, I talk with Iwona Christop, a PhD student at Adam Mickiewicz University, about her recent EACL paper introducing ART (Audio Reasoning Tasks), a new benchmark designed to evaluate whether multimodal LLMs can truly reason over audio, not just transcribe or classify it. Most existing benchmarks test audio skills in isolation (like ASR or classification). But real-world intelligence requires something deeper: combining signals, comparing sounds, tracking context, and making decisions.

This work takes a different approach:
* No text-only shortcuts: tasks can’t be solved via transcription alone
* Reasoning-first design: models must combine multiple audio cues
* No expert knowledge required: anyone can verify correctness

We also dive into the diverse task design, including:
* Audio arithmetic (counting and comparing sounds)
* Cross-recording speaker & language identification
* Sound-based reasoning (e.g., inferring properties from audio)
* Speech feature comparison (accents, variations)
* Multimodal reasoning across text and sound

The dataset includes 9 tasks, 9,000 samples, and 30+ hours of audio, all generated in a scalable way using templates and TTS.

👉 If you care about multimodal reasoning, evaluation, or the limits of current LLM capabilities, this conversation is for you.

Iwona Christop: https://www.linkedin.com/in/iwona-christop/

👍 Like & subscribe for more deep dives into cutting-edge AI research
🔔 New episodes from EACL 2026 coming soon
#WiAIR #EACL2026

13 April 2026 - 18 min

EACL 2026: LLMs Can Call Tools -- But Can They Understand Them?

LLM-based agents are everywhere, but most research focuses on just one step: getting the model to call the right tool. What happens after that?

Paper: https://arxiv.org/abs/2510.15955

In this talk, Kiran Kate (IBM Research) presents new findings from their EACL 2026 paper on a largely overlooked problem: 👉 Can LLMs actually understand and use the outputs returned by tools? As tool-augmented systems become more complex, this question becomes critical. The work dives into how current models handle non-trivial, real-world tool responses, and where they break down.

💡 Key ideas covered:
* Why tool calling is only half the story in LLM agents
* The challenge of processing complex tool outputs
* Failure modes in current LLM-based systems
* What this means for building robust, real-world AI agents

This talk is especially relevant if you're working on:
* LLM agents and tool use
* Evaluation of LLM capabilities
* Real-world deployment of AI systems
* Agentic workflows and reasoning pipelines

Kiran Kate:
* https://www.linkedin.com/in/kiran-kate-8b98672/

👍 Like & subscribe for more deep dives into cutting-edge AI research
🔔 New episodes from EACL 2026 coming soon

12 April 2026 - 22 min

EACL 2026: Reasoning Can Hurt LLM Safety?! Rethinking Accuracy in AI Systems

In this episode of #WiAIRpodcast, we dive into a subtle but critical question: Does adding reasoning actually make LLMs safer and more reliable?

Paper: https://arxiv.org/abs/2510.21049

Atoosa Chegini (University of Maryland, Apple) presents Reasoning's Razor (EACL 2026), where she and her collaborators examine how reasoning impacts high-stakes binary classification tasks, including safety filtering and hallucination detection. Their findings highlight an important nuance: while reasoning can improve overall accuracy, it may degrade performance at low false positive rates, exactly where real-world systems need to operate.

This conversation covers:
* Why accuracy is a misleading metric for safety-critical LLM applications
* The importance of evaluating models at fixed false positive rates (FPR)
* How two models with identical accuracy can behave completely differently in deployment
* The impact of "think-on" (with reasoning) vs "think-off" (no reasoning) settings
* Practical implications for RLHF, SFT, and post-training pipelines

If you're working on any of the following, this discussion offers a perspective that is both technically grounded and immediately actionable:
* LLM evaluation & reliability
* AI safety or hallucination detection
* Production deployment of language models

Atoosa:
* https://www.linkedin.com/in/atoosa-chegini-6713741a3/
* https://scholar.google.com/citations?user=5nY9tagAAAAJ&hl=en&oi=ao

👍 Like & subscribe for more deep dives into cutting-edge AI research
🔔 New episodes from EACL 2026 coming soon

10 April 2026 - 21 min
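
The Reasoning's Razor description above mentions that two models with identical accuracy can behave very differently at a fixed false positive rate. A minimal Python sketch of that idea follows. It is only an illustration: the labels, the scores, and the function names `accuracy` and `tpr_at_fpr` are hypothetical and are not taken from the paper or the episode.

```python
def accuracy(scores, labels, threshold=0.5):
    """Fraction of correct predictions when score >= threshold means 'positive'."""
    preds = [s >= threshold for s in scores]
    return sum(p == bool(y) for p, y in zip(preds, labels)) / len(labels)


def tpr_at_fpr(scores, labels, max_fpr):
    """Best true positive rate over all thresholds whose FPR stays <= max_fpr."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    best = 0.0
    # Candidate thresholds: each observed score, plus +inf (flag nothing).
    for t in sorted(set(scores)) + [float("inf")]:
        fpr = sum(s >= t for s in neg) / len(neg)
        tpr = sum(s >= t for s in pos) / len(pos)
        if fpr <= max_fpr:
            best = max(best, tpr)
    return best


# Hypothetical toy data: four positives followed by four negatives.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores_a = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]    # false positive scored 0.6
scores_b = [0.9, 0.6, 0.55, 0.4, 0.95, 0.3, 0.2, 0.1]  # false positive scored 0.95

for name, scores in [("A", scores_a), ("B", scores_b)]:
    print(name, accuracy(scores, labels), tpr_at_fpr(scores, labels, max_fpr=0.1))
```

Both toy models reach the same accuracy at the default threshold, but model B gives its highest score to a negative example, so no threshold keeps its FPR low while still catching any positives. That is the kind of gap a single accuracy number hides.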