Week 44, end of the week episode

Description

Welcome to AI & Consciousness, where we dissect the latest AI breakthroughs and ethical dilemmas. Today, we’re navigating a whirlwind of news. We have groundbreaking academic research on AI reasoning and efficiency, alongside major industry moves like OpenAI’s new security agent and Nvidia’s staggering $5 trillion valuation. We'll explore the rise of autonomous AI agents, the high-stakes economics of the AI boom, and the urgent question that looms over it all: As these systems grow more powerful, who is held accountable? Let’s dive in.

Week 43, mid-week episode

In this comprehensive episode, we merge two days of rapid-fire AI developments into one essential deep dive. We explore the groundbreaking new frameworks that are finally helping to define and measure Artificial General Intelligence (AGI), and that reveal the surprising limits of "reflection" in today's most advanced models. We'll cover major breakthroughs in healthcare, where AIs like DeepSomatic are finding previously missed cancer variants in children, and in law, where new systems are redefining corporate accountability in the age of algorithms. From clever 'jailbreaking' attacks that expose critical vulnerabilities to the fun side of AI playing Dungeons & Dragons, this episode covers the innovations, the risks, and the societal shifts shaping our world. The references from this episode are listed below.

References and Further Reading

* A Definition of AGI: Proposes a quantifiable framework to define and measure Artificial General Intelligence based on human cognitive abilities.
* DeepSomatic: Details Google's AI model that identified 10 previously missed genetic variants in pediatric leukemia cells.
* Distractor Injection Attacks: Reveals how top LLMs can be distracted by irrelevant tasks, cutting task accuracy by up to 60%.
* DTKG: A dual-track knowledge graph framework that improves complex multi-hop question answering in RAG systems.
* From Local to Global (GISP): Introduces GISP, a structured pruning method making LLMs up to 50% smaller without losing performance.
* FST.ai 2.0: An explainable AI system to assist Taekwondo referees, reducing decision review times by 85%.
* Illusions of reflection: Shows that frontier LLMs lack functional, goal-driven reflective reasoning, a key gap in current AI capabilities.
* Is Multilingual LLM Watermarking Truly Multilingual? (STEAM): Presents STEAM, a method using back-translation to fix fairness issues and ensure watermarking works in low-resource languages.
* Na Prática, qual IA Entende o Direito?: A study finding that a specialized legal AI (JusIA) significantly outperforms general models like ChatGPT on legal tasks.
* Operationalising Extended Cognition: Proposes a legal framework for holding corporations accountable for decisions made by their AI systems.
* The Right to Be Remembered: Argues for a digital right to combat the erasure of minority voices and cultural memory by LLMs.
* Team-Phi: A multi-agent framework that automatically evaluates and selects models for anonymizing patient health data.
* VERA-V: A framework that automates the discovery of 'jailbreak' vulnerabilities in multimodal AIs like GPT-4o.
* What Limits Agentic Systems Efficiency? (SpecCache): Introduces SpecCache, a method to speed up web-based AI agents by up to 3.2x via intelligent caching.

22 Oct 2025, 8 min
