AI Tools for Practicing Lawyers

Episode 016: Can an AI Judge Train You Better Than the Real One?

45 min · Yesterday

Description

What if you could lose your case to an AI judge tonight, so you don't lose it to a real one tomorrow? For generations, lawyers learned advocacy the hard way: draft the brief, argue the motion, get knocked down by the judge, learn why you were wrong. Litigation partner Chris Ryan [https://www.taftlaw.com/people/christopher-j-ryan/] built a different path. BenchSim AI [https://www.benchsimai.com/] lets lawyers upload their brief and opposing counsel's brief, then argue out loud in real time against an AI judge who pushes back, interrupts, and grades the performance. The question this episode keeps circling: is this the future of how lawyers get their reps in, or is the courtroom apprenticeship something AI can never actually replace?

In this episode:

* Why COVID permanently reduced young lawyers' opportunities to argue in front of a real judge, and what that "reps problem" means for the next generation of litigators
* How Chris built BenchSim AI as a litigation partner with no modern coding background, using "vibe coding" to go from idea to working product in about six weeks
* What vibe coding actually is, and why it's the same process as building a custom GPT, Gem, or Claude skill
* How BenchSim works: upload your brief and opposing counsel's brief, choose a judge temperament (quiet, neutral, or hot bench), and argue out loud
* Why BenchSim deliberately skips video rendering of the judge to avoid latency that would kill the realism of rapid-fire argument
* How the AI judge develops counterpoints from the opposing brief and is programmed to interrupt when an advocate is talking in circles
* The SOC 2 certification process BenchSim is going through before marketing to law firms, and why that matters for adoption
* Whether AI will actually save lawyers time, including the "airport test" framework for evaluating whether a tool is worth the overhead
* Using AI as an adversary instead of a cheerleader: prompting it to argue against your own complaint or brief before opposing counsel does
* The Flintstones/Simpsons/Jetsons breakdown of how to stress-test a brief at every level of AI adoption

We also discuss:

* Chris's recent "wow moment" using AI to play defense counsel against his own drafted complaint
* Heather's experience having Claude Code build software overnight while she sleeps
* Whether AI simulation training could expand beyond litigation into bar exam prep and other legal training
* A Practice Signal segment on a deeply inappropriate mentorship moment a young associate experienced, and whether AI could have helped an older partner communicate the underlying (legitimate) concern without the inappropriate framing
* Chris's plans for BenchSim's feature roadmap, including potential expansion into opening statements and direct examination practice

Key Takeaway

Availability is not authority, and a simulation is not a verdict. BenchSim doesn't tell a lawyer whether they'll win or lose; it tells them where their argument is weak before a real judge finds out for them. That distinction matters. The value isn't in the AI replacing judgment; it's in creating reps that don't exist anymore because courtrooms don't generate them the way they used to.

This episode lands differently depending on where you sit on the FSJ spectrum. A Flintstones lawyer can start by asking any AI tool to summarize their argument and flag weaknesses. A Simpsons lawyer can go further, prompting AI to act as opposing counsel and attack the brief. A Jetsons lawyer is already running full bench simulations, treating AI as an adversary that prepares them for the real fight rather than a cheerleader that tells them what they want to hear.

Mentioned in This Episode:

* BenchSim AI (benchsimai.com) [https://www.benchsimai.com/]
* Taft (Taft Stettinius & Hollister)
* Harvey
* Legora
* Anthropic Claude / Claude Code
* ChatGPT
* Reddit (Practice Signal segment source)
* MPRE (Multistate Professional Responsibility Examination)

info@drescherlaw.com

All episodes

29 episodes

Episode 016: Can an AI Judge Train You Better Than the Real One?

Yesterday · 45 min

AI Builds: Ground Zero

Are "build an AI agent in 20 minutes" ads lying to you? Ron spent way more than 12 hours trying to find out. Ron set out to build the simplest possible AI legal workflow, a motion to extend time to file bankruptcy schedules, and discovered that "easy" and "AI-assisted" don't always mean the same thing.

In this episode:

* Why Ron decided to test AI workflow-building himself instead of buying a course
* The 10-step workflow development cycle ChatGPT walked him through, from defining objectives to platform optimization
* Testing the same workflow across Claude, ChatGPT, Gemini, and Copilot
* Why Google Docs + Gemini unexpectedly produced the best output of any platform tested
* Why Copilot's deep integration with Word actually made it harder to use, not easier
* ChatGPT's pushback on regenerating an entire document, and the hospital-acquired-infection analogy Ron uses to explain AI "drift"
* The lesson that mature AI users isolate and fix specific problems instead of regenerating whole documents
* Why Ron avoided using Claude for this project due to a prior usage-limit experience on the Pro tier
* The "airport test" from Ron's prior Field Note, Confessions of an AI Hallucinator, and how it applies to building workflows
* The free "Motion to Extend Lite" workflow Ron is releasing publicly

We also discuss:

* How persistent AI workflows (skills, agents, StrongSuit-style systems) represent a brand-new product category
* The decision points Ron had to map for an "omnibus" extension workflow covering any deadline type
* Quality control checks built into the workflow: missing dates, missing deadlines, missing cause language
* How Ron accidentally discovered this entire approach through his own podcast post-production process
* Why bare-bones bankruptcy petitions create the exact problem this workflow solves
* The risk of inconsistent details (names, dates, captions) when reusing forms across cases, and how AI reduces that risk

Key Takeaway

Building a workflow that actually works (across platforms, with minimal user friction, passing real-world testing) is a fundamentally different job than writing a clever prompt. The platform you choose matters as much as what you ask it to do, and the AI tool that's "supposed" to be best for the job (Copilot in Word, Claude for heavy drafting) isn't always the one that delivers.

For Flintstones lawyers, this episode is proof that a usable AI workflow can exist without them building anything: Ron's free download does the work. Simpsons lawyers will recognize the platform-testing process as the real work of AI adoption. And Jetsons lawyers will appreciate the granular lesson on regeneration risk and isolating fixes rather than reprocessing entire documents.

Mentioned in This Episode:

* ChatGPT
* Claude (Pro tier)
* Google Gemini (Google Docs integration, Enterprise tier)
* Microsoft Copilot
* StrongSuit
* Confessions of an AI Hallucinator [https://lawyeraitoolkit.com/confessions-of-an-ai-hallucinator] (prior Field Note episode; airport test)
* Team Accelerator [https://completebankruptcy.thrivecart.com/team-accelerator-firm/] (Ron's bankruptcy training course)
* Motion to Extend Lite [https://lawyeraitoolkit.com/motion-to-extend-time-workflow] (free workflow download)

June 12, 2026 · 24 min

Field Note: Can Apple Intelligence Pass the File Cabinet Test?

Show notes note: The File Cabinet Test is used to determine whether AI can discern the contents of every file in a connected drive.

The star of Apple's WWDC 2026? Apple Intelligence. Apple announced how Apple Intelligence is being incorporated into Siri. This prompts us to ask: Can Apple Intelligence finally make workflow automation accessible to ordinary lawyers?

In this episode:

* Why Apple's AI strategy isn't about building the smartest model; it's about removing friction
* How natural language Shortcuts could lower the barrier to automation for lawyers who've never written a line of code
* Whether iCloud can finally compete with Google Drive and OneDrive as an AI-connected file system
* Apple's private cloud compute architecture and why privacy is not the same as privilege
* Why the real story from WWDC 2026 may be the democratization of workflow automation

We also discuss:

* Ron's history as an Apple fanboy since the 1980s, and why this keynote actually got his attention
* Steve Jobs' "best camera" line and what it means for AI adoption
* Ron's existing Claude-based podcast workflow as a real-world example of what natural language Shortcuts could replicate
* The fact that none of these features have shipped yet; these are observations, not a review
* Siri's rebranding struggle and Apple's decision to stick with the name despite years of ridicule

Key Takeaway

The biggest AI story from WWDC 2026 isn't that Apple built the smartest AI. It's that Apple appears to be trying to make AI invisible, woven directly into the operating system rather than siloed in a separate app or chatbot (although there will be a stand-alone chat-style Siri app). If natural language Shortcuts works the way it was demonstrated, lawyers may be able to build automation workflows simply by describing what they want in plain English. That's not a Jetsons story. That's a Simpsons story, and maybe even a Flintstones story. Prompting is useful, but workflows are transformational. If Apple pulls this off, the gap between lawyers who automate and lawyers who don't may finally start to close.

Mentioned in This Episode:

* Apple Intelligence
* WWDC 2026
* Shortcuts (Apple)
* Siri
* iCloud
* Finder
* Gemini (Google)
* Claude
* ChatGPT
* Google Drive
* OneDrive / Microsoft
* Apple Private Cloud Compute
* The File Cabinet Test (from Episode 007 FolderMania) [https://lawyeraitoolkit.com/episode-007-folder-mania-ai-comes-to-you]

June 8, 2026 · 9 min

Episode 015: The Last Flintstones Lawyer

Most conversations about AI and legal writing focus on the tools. This one focuses on the lawyers. What does a Flintstones lawyer actually do on Monday morning after they've finally decided to move? What does a Simpsons lawyer do when they discover their favorite tool isn't safe for client data? And what happens when the question isn't whether to adopt AI, but whether you'll survive professionally if you don't? Ron sits down with co-host Heather Gardner and Maryland attorney and legal educator Donna Mandl to work through the questions practicing lawyers are actually asking, the ones that never make it into the marketing decks.

In this episode:

* Why a Flintstones lawyer's first move should be calling their Westlaw or Lexis rep, not downloading a new app
* Why Heather recommends Enterprise ChatGPT as the right entry point for lawyers handling client information
* The "Claude as cleanup hitter" workflow: how to keep confidential work in a secure tool and bring the output to Claude for drafting
* Gemini's hidden advantage, and why most lawyers are using it wrong
* The hallucination problem everyone's talking about, and why fake citations aren't the real crisis
* The subtler risk: cases that exist but don't say what AI claims they say, and standards of review that get quietly swapped
* Donna's paralegal education pivot, from policing AI use to training students to audit what AI produces
* Ron's prediction: by the end of this decade, there will be no more Flintstones-level lawyers

We also discuss:

* What Heather and Donna are presenting at the Maryland State Bar Association's Legal Summit panel, Ethics, Accuracy, and Efficiency: AI in Legal Writing
* Why judges are getting frustrated with both pro se AI filings and inaccurate AI-assisted briefs from lawyers
* Ron's Claude experiment: feeding Claude its own 21 hallucination types and asking how many the new model would fix (14 of 21; 7 remain hard)
* The FSJ-client alignment theory: Flintstones clients are disappearing, and Flintstones lawyers will have to follow
* FSJ-segmented follow recommendations: Dave Block (Flintstones), Ruben Hassid (Simpsons), Rich Rodgers (Jetsons)
* The Practice Signal segment: can AI help a burned-out workers' comp lawyer find a new career?

Key Takeaway

The governance question for legal AI isn't philosophical anymore; it's a billing-line decision. Whether it's a $1,400-a-year Claude Enterprise commitment or a workflow choice about which tool sees client data, the lawyers who figure out the tiers will outpace the ones still treating a free-tier tool as a research platform. Availability is not authority, and neither is a consumer account.

Flintstones lawyers who hear this episode have a clear Monday-morning move. Simpsons lawyers who've fallen for Claude but balked at the enterprise price now have a workaround. And Jetsons lawyers will recognize the gap is widening faster than most of their colleagues realize. Heather said it best: prompting got us from Flintstones to Simpsons. Learning to think and collaborate with AI is what takes you to Jetsons.

Mentioned in This Episode:

* Heather Gardner: co-host, AI Tools for Practicing Lawyers
* Donna Mandl: Maryland attorney; legal educator, Community College of Baltimore County (LinkedIn [https://www.linkedin.com/in/donna-mandl-ab7564120/])
* Shaun Koenig: Maryland attorney, MSBA Legal Summit panelist (LinkedIn [https://www.linkedin.com/in/shaun-koenig-7b44755/])
* Maryland State Bar Association Legal Summit [https://www.msba.org] (Ethics, Accuracy, and Efficiency: AI in Legal Writing panel)
* ChatGPT (OpenAI): free tier and Enterprise tier
* Claude (Anthropic): Pro tier and Enterprise tier
* Claude Code (Anthropic)
* Harvey
* Google Gemini: consumer and Enterprise ($36/month) tiers
* Westlaw (Thomson Reuters)
* LexisNexis
* Fastcase
* Rich Rodgers: prior guest, Episode 014; founder, StartupTechLaw (LinkedIn [https://www.linkedin.com/in/richrodgers360/])
* Dave Block: legal AI commentator; recommended follow for Flintstones lawyers (LinkedIn [https://www.linkedin.com/in/daveblockatty/])
* Ruben Hassid: Claude evangelist; recommended follow for Simpsons lawyers (LinkedIn [https://www.linkedin.com/in/ruben-hassid/])
* Field Note: 21 Ways AI Can Hallucinate in Your Legal Brief [https://lawyeraitoolkit.com/21-ways-ai-can-hallucinate-in-your-legal-brief]
* Field Note: Tiers of the Clown [https://lawyeraitoolkit.com/tiers-of-the-clown]
* Mezu v. Mezu, Maryland Appellate Court No. 361 (2025) [https://www.mdcourts.gov/data/opinions/cosa/2025/0361s25.pdf]

June 5, 2026 · 37 min

Field Note: Tiers Of The Clown

The question everyone is asking is wrong. When lawyers debate which AI tier is "smarter," they're arguing about a label, and labels end analysis. The better question isn't which tier is best. It's what capabilities you are actually buying, and whether those capabilities match the task in front of you. Ron's File Cabinet Test experiments proved this the hard way: Enterprise-tier AI passed tests his Plus account failed. But when it came time to brainstorm a podcast episode, he went right back to Plus. Not because it was smarter. Because it knew him.

In this episode:

* Why "smarter" is a label that ends analysis instead of starting it
* The six capability categories that actually differentiate AI tiers: context window, retrieval, usage limits, connectors, memory, and governance
* How Ron's File Cabinet Test revealed a material performance gap between Plus and Enterprise environments
* Why the best AI tier is the one whose capabilities and accumulated context match the task, not the one with the highest price point
* Why governance and capability are different things, and why you need both
* The framework for evaluating AI tiers that survives even when pricing, features, and model names change

We also discuss:

* How a Reddit thread about AI tiers triggered Ron's thinking on this episode
* Why Ron returns to his Plus account for podcast brainstorming even after seeing Enterprise outperform it
* The companion handout for this episode (and why it may already be partially obsolete)
* How ChatGPT, Claude, and Gemini package the same core capabilities in different ways
* The analogy of the context window as the size of an AI's desk

Key Takeaway

Stop asking which AI is smartest. Start asking which capabilities matter for the task at hand. Retrieval isn't reasoning. Governance isn't performance. Context is accumulated over time, and a tool that knows your practice may outperform a more capable tool that doesn't. The best tier is the one aligned with what you're actually trying to do.

For Flintstones lawyers, this episode removes the paralysis. You don't have to figure out which AI won. For Simpsons lawyers who've already paid for something, this is the framework for deciding whether they bought the right tier, or just the most expensive one. Jetsons lawyers will recognize the capability taxonomy immediately and probably already live by it.

Mentioned in This Episode:

* ChatGPT (OpenAI)
* Claude (Anthropic)
* Gemini (Google)
* Heather Gardner (co-host, Enterprise environment)
* File Cabinet Test (Ron's benchmark framework)
* Folder Mania [https://lawyeraitoolkit.com/episode-007-folder-mania-ai-comes-to-you] experiments
* Three-Legged Stool (compliance framework; see the Field Note "Building the Stool: How to Implement the AI Discovery Standards" [https://lawyeraitoolkit.com/building-the-discovery-stool])
* Flintstones/Simpsons/Jetsons Framework
* Companion handout (available at lawyeraitoolkit.com/deliverables [http://lawyeraitoolkit.com/deliverables])
* Reddit

June 1, 2026 · 9 min