Holly and Ewan Are Working On It Podcast

Episode 8: Inside the AI Agent Revolution

26 min · 20 Nov 2025

Description

In this episode, Holly and Ewan explore one of the most hyped (yet deeply misunderstood) topics in AI today: AI agents. Holly opens with the big question: what actually is an AI agent? Ewan explains why definitions vary wildly, but broadly defines an AI agent as any system that can operate independently on your behalf to complete tasks. That could be a coaching assistant, a financial helper, or even a household or education agent.

Ewan shares real-world stories, such as trying to buy a dishwasher using ChatGPT Agent Mode [https://openai.com/index/introducing-chatgpt-agent/], only to find that Amazon actively blocks agent-based access. When he switched to AO.com [https://ao.com/], the agent succeeded instantly - a perfect illustration of today’s fragmented ecosystem. He also discusses experimenting with agents to manage LinkedIn connection acceptance, with mixed results, highlighting how even simple point-solution tasks can quickly fall apart.

The discussion then moves into the wider implications:

* Why agents are transformational in theory, but fragile and unreliable today
* How browser-based agents actually work using “computer use” screenshot loops
* Why traditional RPA (Robotic Process Automation) remains far safer and more predictable
* Early signs of agent-powered cyberattacks, referencing the first reported case of agentic hacking [https://www.anthropic.com/news/disrupting-AI-espionage]
* The Carnegie Mellon “Agent Company [https://agent.company]” benchmark, which evaluates how well different agents perform real office tasks, with current leaderboards showing DeepSeek’s Matrix agent at ~43%, Google Gemini around 41%, and Claude Sonnet 4 around 33%

The conclusion? The vision is exciting, but today’s agents are nowhere near enterprise-ready. Expect rapid evolution, more experiments, and many more failures as this technology matures.

If you've got feedback, we'd love to hear it. We reply to every single message!
Find us at Working On It Podcast [https://www.workingonitpodcast.com/], or follow our LinkedIn Page [https://www.linkedin.com/company/wereworkingonit/]. Or talk to Holly [https://www.linkedin.com/in/digitalholly/] or Ewan [https://www.linkedin.com/in/ewanmacleod] on LinkedIn.
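
The “computer use” screenshot loop mentioned in the notes above can be pictured as a simple observe, decide, act cycle. What follows is a minimal sketch under stated assumptions, not how ChatGPT Agent Mode or any other product is actually built: `run_screenshot_loop`, `FakeBrowser` and `toy_policy` are hypothetical names invented for illustration, the "screenshot" is just a page name, and the model call is replaced by a hard-coded rule.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Action:
    kind: str          # "click" or "done" in this toy example
    target: str = ""   # what the agent wants to interact with


def run_screenshot_loop(
    take_screenshot: Callable[[], str],
    choose_action: Callable[[str, List[Action]], Action],
    perform: Callable[[Action], None],
    max_steps: int = 10,
) -> List[Action]:
    """Observe the screen, ask the model for one action, act, repeat."""
    history: List[Action] = []
    for _ in range(max_steps):
        screen = take_screenshot()               # 1. observe
        action = choose_action(screen, history)  # 2. decide (a model call in practice)
        history.append(action)
        if action.kind == "done":
            break
        perform(action)                          # 3. act, then loop back to observe
    return history


class FakeBrowser:
    """Hypothetical stand-in for a real browser: a fixed sequence of pages."""

    def __init__(self) -> None:
        self.pages = ["search", "product", "checkout"]
        self.index = 0

    def screenshot(self) -> str:
        return self.pages[self.index]

    def click(self, action: Action) -> None:
        # Any click simply advances to the next page in this toy model.
        self.index = min(self.index + 1, len(self.pages) - 1)


def toy_policy(screen: str, history: List[Action]) -> Action:
    """Hypothetical stand-in for the model: stop once checkout is visible."""
    if screen == "checkout":
        return Action("done")
    return Action("click", target=f"next button on {screen} page")


browser = FakeBrowser()
steps = run_screenshot_loop(browser.screenshot, toy_policy, browser.click)
print([step.kind for step in steps])  # ['click', 'click', 'done']
```

The `max_steps` cap is a design choice in this sketch: if a site blocks the agent and the screen never changes, the loop stops at the step limit instead of running forever.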

All episodes

21 episodes

Episode 21: AI Fluency for Executives - Learn by Playing

It's a slightly different episode today. In this one, we're talking about how we're helping executives develop AI fluency with one of the services we're offering, AI Fluency Coach [https://aifluencycoach.com]. Here's the overview:

Senior leaders are being asked to make enormous decisions about AI, on investment, direction and ethics, while quietly admitting to themselves that they don't really understand the technology. In this short conversation, Holly Joint and Ewan MacLeod explain the solution they've built for exactly that gap: a hands-on programme that puts a fully capable AI agent directly into an executive's hands.

The premise is refreshingly simple. Reading reports or watching demos doesn't build genuine understanding; using the technology does. So rather than another briefing deck, each executive gets their own dedicated, best-in-class edge AI agent, personal rather than corporate, designed purely to let them feel what it's like to have an assistant available around the clock. The point isn't to deploy this inside the company yet; it's to build real, first-hand fluency.

The examples bring it to life. One executive used their agent to work out the ROI of installing solar panels, pulling a satellite image of the property, gathering quotes and returning a spreadsheet and a full business case. Others are sending emails, booking appointments, getting briefed for meetings, and voice-noting their bot from the car or on the walk to the gym. Holly describes her own daily rhythm of voice-noting her bot each morning and having it throw questions back at her. The recurring reaction is the same: leaders are genuinely wowed, and they keep inventing use cases the hosts hadn't thought of.

There's useful substance on how it works, too. Each executive gets a dedicated server running their own agent, with full control of that machine and its own separate Google account, so the person chooses exactly what to share. Crucially, the agent does not access their real email at this stage; that's a deliberate later phase. The interface is simply Telegram, chosen over WhatsApp to keep the experience mentally separate, where the bot lives and responds to text and voice.

What stands out is how fast it works. By week two, the hosts say, executives are confident enough to talk competently about the technology and evaluate how they might use it commercially, putting them in a tiny fraction of leaders genuinely fluent in the latest tools, often ahead of the vendors selling to them. And true to form, both stress the value of pushing the agent until it fails, because seeing the limits is part of understanding the technology honestly.

Key Topics

* Why senior leaders struggle to build real AI understanding
* Learning by doing: a personal agent over theory and demos
* A dedicated, private edge AI agent for each executive
* Real use cases, from solar-panel ROI to meeting briefings
* Voice-noting an AI assistant into your daily routine
* How the setup works: dedicated server, separate Google account, Telegram
* Reaching confident AI fluency within weeks
* The value of testing the technology until it fails

Links & References

* AI Fluency Coach: https://aifluencycoach.com
* Telegram: https://telegram.org

15 May 2026 · 5 min

Episode 20: OpenClaw: The Always-On AI Agent

After listeners said they felt a little short-changed by an earlier mention of OpenClaw, Holly Joint hands Ewan MacLeod the floor to properly explain what it is, why people are so excited about it, and where the real dangers lie. The result is the season's most practical deep-dive into agentic AI, grounded in how Ewan and even his wife are actually using it day to day.

Stripped to its essentials, OpenClaw is the familiar power of a model like Claude or ChatGPT, but running continuously on a server or spare machine rather than waiting for you to open an app. It wakes on a schedule, around every fifteen minutes by default, and you talk to it through WhatsApp, Telegram or Discord. Over time it becomes a genuine assistant: check my email, always flag messages from this person, research this, remind me of that. Because it sits on top of an LLM and can be given a browser and real credentials, its capability is striking. Ewan describes an agent reasoning its way to phoning a restaurant by chaining together a Twilio account and a text-to-speech service, entirely on its own initiative.

That autonomy is exactly where the caution comes in. The hosts revisit the cautionary tale of the Meta researcher who had to physically pull the plug before her agent deleted her emails, and Ewan is emphatic about discipline: keep it air-gapped from your real life, give it a clean machine without your iCloud or passwords, run it on a separate email account, and never let it near a corporate network. They cover the practical hygiene too: why a Mac's Unix foundations make control easier than Windows, the Mac Mini fascination, the option of a local LLM versus an API key, the terms-of-service reasons not to point it at Claude Code, and the very real token costs. Ewan candidly puts his own experimentation at around $500 a month, much of it his wife's "George" busily researching holidays and pinging him itinerary ideas.

There's a lighter thread running throughout: the named agents (Ewan's Claudia, his wife's George, his chief-of-staff Marvin) and Holly's joke that she could just let OpenClaw manage her marriage. But the serious payoff lands at the end, where Ewan explains the AI-fluency programme he runs for senior executives: a carefully controlled, six-week introduction with a sandboxed instance, designed so leaders experience both the magic and, crucially, the failures. His argument is that strategic AI decisions are not technical-domain questions, and you cannot make them well without having felt the technology yourself, including the moments it disappoints.

Key Topics

* What OpenClaw is and how it differs from Claude Cowork and Dispatch
* Always-on, scheduled agents you talk to via Telegram or WhatsApp
* How an agent chains tools together to act autonomously
* Safety first: air-gapping, clean machines, separate accounts
* Practical setup: Mac versus PC, local LLMs, API keys, token costs
* Real-world use, and the roughly $500-a-month reality of experimenting
* Why naming agents reveals how human they feel
* A six-week executive programme built around experiencing failure

Links & References

* Anthropic (Claude, Cowork, Dispatch): https://www.anthropic.com
* Telegram: https://telegram.org
* Brave Search: https://search.brave.com
* OpenClaw: https://openclaw.ai/
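
To make the "wakes on a schedule" idea from these notes concrete, here is a small simulation of an agent that wakes every fifteen minutes and flags new messages from a chosen sender. It is only an illustrative sketch and makes no claim about how OpenClaw itself is implemented: the function names, the inbox tuples and the example.com addresses are all assumptions, and time is simulated rather than slept through.

```python
from datetime import datetime, timedelta

# Assumption: the interval mirrors the "around every fifteen minutes" default
# mentioned in the episode notes.
WAKE_INTERVAL = timedelta(minutes=15)


def wake_times(start, end, interval=WAKE_INTERVAL):
    """Yield each scheduled wake-up in the half-open window [start, end)."""
    now = start
    while now < end:
        yield now
        now += interval


def flagged_messages(inbox, vip_senders, start, end):
    """At each wake-up, flag messages from VIP senders that arrived since the last one."""
    flagged = []
    previous = None
    for now in wake_times(start, end):
        for arrived, sender, subject in inbox:
            is_new = arrived <= now and (previous is None or arrived > previous)
            if is_new and sender in vip_senders:
                flagged.append((now, sender, subject))
        previous = now
    return flagged


# Hypothetical inbox: (arrival time, sender, subject).
day = datetime(2026, 5, 8)
inbox = [
    (day.replace(hour=9, minute=5), "boss@example.com", "Budget"),
    (day.replace(hour=9, minute=20), "newsletter@example.com", "Weekly digest"),
    (day.replace(hour=9, minute=40), "boss@example.com", "Follow-up"),
]

flags = flagged_messages(
    inbox,
    vip_senders={"boss@example.com"},
    start=day.replace(hour=9),
    end=day.replace(hour=10),
)
for woke_at, sender, subject in flags:
    print(woke_at.strftime("%H:%M"), sender, subject)
```

In this toy run the 09:05 message is only noticed at the 09:15 wake-up, which illustrates the trade-off of a polling schedule: a shorter interval reacts faster but, with a real model behind each wake-up, would also spend more tokens.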

8 May 2026 · 22 min

Episode 19: The Board You Can Actually Afford

In this hands-on episode, Ewan MacLeod turns interviewer again to find out exactly what Holly Joint has been building with AI, and the answer has moved well beyond the games and chief-of-staff tool from earlier in the season. What emerges is a practical picture of how a non-engineer is now creating real, deployable software simply by describing what she wants in plain language.

Holly's flagship build is what she calls a "leadership bench." Rather than the familiar gimmick of stacking an imaginary board with Steve Jobs and Bill Gates, hers lets you select genuine executive roles (a CFO, a CMO, a CTO), feed in a problem statement, and have each perspective argue, counter and vote across rounds. The aim is to surface what real leadership teams so often leave unsaid. Drawing on her coaching work, Holly explains that not all voices carry the same volume in a room, and politics and groupthink keep people from speaking honestly. The tool strips the emotion out of charged decisions, from new-market entry to return-to-office, and is already being piloted with clients and used inside her own business.

The conversation is refreshingly practical about how this is done. Holly built it by describing the problem and the experience she wanted in natural language; Ewan's useful framing is that the "science bit" isn't the coding anymore but the thinking: the careful briefings, the honing, the packaging of judgement into something reusable. They also dig into why it matters that the tool runs on a server rather than a laptop, and the serious-business considerations many casual users miss: data residency laws that require client data to stay in-region, and enterprise or API keys that keep confidential inputs out of model training. Holly's UAE-based server lets her run lean, confidential pulse surveys for clients during a tense period, a genuinely commercial use case.

The episode's second big idea is context. Holly has built a simple tool that interviews you about your career, ambitions and preferences, then generates a portable context document, ideal for newcomers, for people switching tools for ethical reasons, or for anyone who has used AI organically without ever intentionally teaching it who they are. Stored locally and kept private, it has even doubled as a reflective, goal-setting exercise. The payoff, she argues, is an AI that finally challenges you in the right register rather than being relentlessly polite, and that becomes dramatically more powerful when paired with agentic tools like OpenClaw, knowing your life and work from the outset like an old friend rather than a stranger at a party. Her closing advice is blunt: the free tools won't give you this, so invest in yourself and upgrade.

Key Topics

* A "leadership bench" tool that simulates executive perspectives
* Using AI to break groupthink and surface unspoken views
* Building deployable software through natural language alone
* Why server hosting, not a laptop, makes tools shareable and secure
* Data residency and enterprise keys for confidential client work
* A context tool that teaches AI who you are
* Portable context documents when switching between AI tools
* How rich context supercharges agentic tools like OpenClaw
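
The "argue, counter and vote across rounds" pattern described for the leadership bench can be sketched in a few lines. This is purely illustrative and is not Holly's tool: in a real build each role would presumably be a prompted model call, whereas here `cfo`, `cmo` and `cto` are hypothetical hard-coded rules, and the `cost` and `budget` fields are invented so the example has something to disagree about.

```python
from collections import Counter
from typing import Callable, Dict, List, Optional

Votes = Dict[str, str]
Role = Callable[[dict, Optional[Votes]], str]


def cfo(problem: dict, previous: Optional[Votes]) -> str:
    # Hypothetical rule: the CFO only backs proposals that fit the budget.
    return "yes" if problem["cost"] <= problem["budget"] else "no"


def cmo(problem: dict, previous: Optional[Votes]) -> str:
    # Hypothetical rule: the CMO always favours growth.
    return "yes"


def cto(problem: dict, previous: Optional[Votes]) -> str:
    # Hypothetical rule: abstain at first, then react to the CFO's last vote.
    if previous is None:
        return "abstain"
    return previous["CFO"]


def run_bench(problem: dict, roles: Dict[str, Role], rounds: int = 2) -> List[Votes]:
    """Each round, every role votes after seeing the previous round's votes."""
    history: List[Votes] = []
    previous: Optional[Votes] = None
    for _ in range(rounds):
        current = {name: role(problem, previous) for name, role in roles.items()}
        history.append(current)
        previous = current
    return history


problem = {"statement": "Enter a new market next year?", "cost": 120, "budget": 100}
roles = {"CFO": cfo, "CMO": cmo, "CTO": cto}

history = run_bench(problem, roles)
final_tally = Counter(history[-1].values())
for number, votes in enumerate(history, start=1):
    print(f"Round {number}: {votes}")
print("Outcome:", final_tally.most_common(1)[0][0])
```

Keeping every round's votes in `history`, rather than only the final tally, is what lets a facilitator see who changed position and why, which is closer to the stated aim of surfacing what a real team leaves unsaid.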

1 May 2026 · 21 min

Episode 18: Drones, Defence and Misinformation

This is a different kind of episode. Holly Joint joins from a region now living under daily missile alerts, and the conversation with Ewan MacLeod turns from the usual workplace-and-AI territory to something far more immediate: how technology shapes life, safety and truth in a conflict zone. It is, by both hosts' admission, a difficult subject, but one they feel matters too much to skip.

Holly describes a striking asymmetry in modern warfare. On one side, cheap, low-tech drones crossing overhead several times a day; on the other, a sophisticated, AI-enabled defence system that calculates trajectories, identifies interception points and responds in extraordinarily short windows, always, she stresses, with a human in the loop. Living beneath it, she explains how that technology translates into a genuine sense of safety, and how the household adapts: honest but calm conversations with the children, reframing the frightening boom of an intercept as the sound of a missile stopped and everyone kept safe.

A recurring theme is information itself. In wartime, Holly notes, misinformation and propaganda flood WhatsApp groups and social feeds, and one of the smartest uses of technology she's seen is a simple web app that aggregates only official sources (the government media office, ministry of defence, crisis management) into a single trusted place to check rumours against. Alongside this, she points to the quiet rise of low-cost AI therapy tools helping people cope, because living under missiles is not normal, however well one carries on.

The episode has its lighter human moments too: Holly's 3am backup-battery purchases during sleepless nights, which turned out more useful against thunderstorms than the war, and Ewan's enthusiasm for Starlink, both as a home backup and, more seriously, as genuinely transformative infrastructure. They touch on its life-or-death role in conflicts like Ukraine and Iran, and how connectivity can boost economies that lack reliable infrastructure.

The conversation closes on the hardest question of all: the ethics of AI in warfare. Holly raises Anthropic's decision to restrict how its tools may be used, and the consequence of being removed from a US Department of War supplier list, a move both hosts find genuinely significant. They circle back to a book referenced in an earlier episode, "If Anyone Builds It, Everyone Dies," and to autonomous weapons, computer vision targeting, and the danger of AI's misplaced certainty in contexts where a wrong answer costs lives. Both land firmly in the same place: humans must stay in the loop, and far more work is needed to understand the consequences.

Key Topics

* The asymmetry of cheap drones versus high-tech AI defence
* How AI-enabled interception systems work, with a human in the loop
* Living and parenting calmly under daily missile alerts
* Combating wartime misinformation by aggregating trusted sources
* Low-cost AI therapy tools for people under stress
* Starlink as resilient, sometimes life-or-death, connectivity
* Anthropic's use restrictions and removal from a US supplier list
* The ethics of autonomous weapons and AI certainty in warfare

Links & References

* Starlink: https://www.starlink.com
* Anthropic: https://www.anthropic.com
* If Anyone Builds It, Everyone Dies (Eliezer Yudkowsky & Nate Soares): https://ifanyonebuildsit.com
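
The official-sources aggregator mentioned in these notes comes down to one idea: only accept items whose link points at a known, trusted host. The sketch below shows that allowlist check using Python's standard `urllib.parse`. It is an assumption-laden illustration, not the app Holly describes: the `.example` hostnames, the item dictionaries and the function names are all invented.

```python
from urllib.parse import urlparse

# Hypothetical allowlist; a real app would list the actual official sites.
OFFICIAL_HOSTS = {
    "media-office.example",
    "defence.example",
    "crisis.example",
}


def is_official(url: str) -> bool:
    """True only if the URL's host is an allowlisted domain or one of its subdomains."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == allowed or host.endswith("." + allowed) for allowed in OFFICIAL_HOSTS)


def trusted_feed(items):
    """Keep only items that link to an official source."""
    return [item for item in items if is_official(item["url"])]


items = [
    {"title": "Official alert", "url": "https://www.defence.example/alerts/1"},
    {"title": "Forwarded rumour", "url": "https://chat-forward.test/post/99"},
    {"title": "Lookalike site", "url": "https://defence.example.evil.test/alerts/1"},
]

for item in trusted_feed(items):
    print(item["title"])
```

Matching on the parsed hostname, rather than searching the whole URL for a trusted name, is what stops the lookalike address in the example from slipping through.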

24 Apr 2026 · 16 min

Episode 17: When Big Tech Pulls the Plug

Some technologies fail not because they don't work, but because the world never quite wants them, or because the bill simply never makes sense. In this episode Holly Joint and Ewan MacLeod use a striking month of tech news as a jumping-off point to ask what it really takes for a technology to survive, and why even brilliant, well-funded ideas end up in the graveyard.

The numbers do a lot of the talking. Holly walks through the staggering economics of OpenAI's Sora video tool, burning enormous sums daily in operating and inference costs while its lifetime revenue came in at a tiny fraction of that. People made memes; almost nobody paid. Set against Meta's reported $70-80 billion poured into the metaverse over five years, the contrast is instructive: Sora was shut down fast, while the metaverse limped on for years under the weight of sunk-cost thinking before anyone was brave enough to call time.

That bravery becomes a quiet theme. Knowing when to stop, both hosts agree, is one of the hardest things a company can do, and there's something admirable in the decision to write off a beloved bet. The conversation broadens into a tour of the technology graveyard: Google Glass, which demanded a behaviour change people never accepted, and Concorde, technically magnificent and much loved but never viable. Ewan's affection for Concorde is genuine; he argues we are poorer as a society without it, even as he shrugs at Sora's passing. Not every dead technology is mourned equally.

Underneath the news sits a sharper observation: the ability to build something is not the same as people wanting it. Holly returns to a smart fridge she saw prototyped back in 1997, a technology that exists today yet still has barely any adoption, because people don't actually want their fridge ordering the milk. The metaverse, she argues, has the same problem. The technology was never the obstacle; societal and user adoption was. Tellingly, the one place the hosts see real uptake is gaming, with Ewan describing his own children's enthusiasm for VR headsets and games like Job Simulator, a long way from Zuckerberg's vision of a virtual social future.

The episode also touches on the wider competitive picture: the perception of Claude as the serious, enterprise-grade choice while OpenAI burns cash chasing consumer attention, and the difficulty of finding hard data to back any of it up. But the closing note is generous rather than cynical. We need the dreamers, the hosts conclude, and the willingness to make big, bold bets that will sometimes fail, because that aspiration is part of what makes us human.

Key Topics

* The brutal economics behind Sora's rapid shutdown
* Sunk-cost thinking and Meta's multi-billion metaverse bet
* The courage required to write off a major project
* A tour of the tech graveyard: Google Glass and Concorde
* Why building a technology doesn't guarantee adoption
* The smart fridge problem: capability versus genuine demand
* VR finding a home in gaming rather than work
* Claude's enterprise perception versus OpenAI's consumer focus

17 Apr 2026 · 22 min