David's Saturday AI Thoughts

Average by default

12 min · 20 Jun 2026

Description

WHAT HAPPENED THIS WEEK

* A Munich court told Google that "may contain errors" is no defence: The Munich Regional Court ruled Google liable for false claims in its AI Overview, which wrongly tied two German publishers to fraud, holding that an AI summary generates fresh substantive statements rather than just curating sources, and that the small-print disclaimer does not transfer liability back to the user. Google must remove the answers and pay 80% of costs. Every major provider leans on the same footer to cover confidently wrong answers; a court has now said it does not work, so any organisation publishing AI output to customers, regulators or staff can no longer assume the disclaimer protects them.
* Commercially available AI flagged early breast cancer six years before clinical diagnosis: A Karolinska Institute study in Radiology found off-the-shelf AI systems flagged early warning signs of breast cancer in roughly 20% of patients a full six years before clinical diagnosis at 90% specificity, rising to around 40% at two years out, across 88,963 mammograms from more than 31,000 patients, with three of the tools tested already commercially available. Pattern recognition at scale is what AI does best; here it buys years of warning on a disease where early detection saves lives, and the bottleneck is no longer capability but deployment, trust and who owns the answer.
* Anthropic studied 400,000 coding sessions: the tool is levelling people and moving the real gap up to firms: Across roughly 235,000 people and the ten largest professions, success rates landed within seven points of professional engineers, and managers came out ahead; an accountant who had never written Python but knew which rule a month-end reconciliation had to enforce was rated an expert, while a senior engineer on an unfamiliar task was not. What carries a session is understanding the problem, not the craft, so the gap between people is closing. The part everyone skips is that it has not vanished but moved up a level, from between people to between firms.

WHAT TO TRY

* Tell the model what's at stake before you ask the question: In a coaching session a senior leader asked Claude about a government-policy question and got a confident, generic, thin answer. Adding one sentence, "The answer is critical. Provide authoritative sources," changed everything: the model switched register, went to the primary regulatory documents, quoted the relevant sections and linked each claim back so the human could check it. Nothing about the question had changed; the model simply hadn't known the stakes.
* Stamp every AI output "raw, not yet checked" until you've taken ownership: Mentoring Ethan, Steadman's placement-year researcher, the rule landed for any AI-created document: it's either raw AI output or something you've checked, edited and will stand behind. Set your tool to auto-stamp every output "raw AI output, not yet checked, edited and owned," and leave it there. Removing the stamp by hand becomes the deliberate act of taking ownership, so anyone who later picks up or is forwarded the file knows exactly which of the two things they're holding.
* Book a meeting with yourself, hit transcribe, ramble for fifteen minutes: A senior advisor said their real value sits as strong points of view in their head, none of it written down. The fix: put a meeting in your own diary, hit transcribe, put your feet on the desk, look out of the window, and talk, without trying to be structured. Change your mind, go down rabbit holes, tell stories. The transcript becomes a thesis document you can feed any model so future answers come back already loaded with your frame, your caveats and your taste.

Read the full edition with all links and sources [https://steadman.ai/newsletters/david/#edition-2026-06-20]
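
The stamping rule under WHAT TO TRY could be sketched roughly as below. This is a minimal illustration, not the tool setting the edition describes: the bracketed header format and the helper names `stamp` and `is_raw` are assumptions made for the example. Removing the stamp is deliberately left as a manual step, since the edition treats that as the act of taking ownership.

```python
# Stamp wording taken from the edition; the bracketed header format is an assumption.
STAMP = "[raw AI output, not yet checked, edited and owned]"


def is_raw(text: str) -> bool:
    """Return True if the document still carries the stamp on its first line."""
    return text.startswith(STAMP)


def stamp(text: str) -> str:
    """Prefix the stamp to a document, without stamping it twice."""
    if is_raw(text):
        return text
    return f"{STAMP}\n{text}"


if __name__ == "__main__":
    draft = stamp("Q3 policy summary ...")
    print(draft)
    print("Still raw:", is_raw(draft))
```

Anyone forwarded the file can check `is_raw` before relying on it; a human deletes the first line only once the content has been checked and edited.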

All episodes

18 episodes

Average by default

20 Jun 2026 · 12 min

Ride the bike

WHAT HAPPENED THIS WEEK

* One activist letter froze a board's AI workspace overnight. The doctors are next: From David's week helping a hundred non-executive directors: one board's AI workspace went dark the morning an activist investor asked for its contents to be made discoverable. The Conference Board's April survey finds two-thirds of directors use AI for board work while barely a quarter of executives call their board highly fluent, and the Medical Protection Society warned that doctors and the NHS could be sued over AI tools' mistakes, with the clinician left as the 'liability sink'; it wants AI reclassified as a product under the Consumer Protection Act 1987 so liability flows to developers. The governance frontier is moving from 'can the data leak' to 'who owns the answer when the model is wrong', and the CEO'd policy (every output checked, edited and owned by a named human) is the only good answer to both the letters and the writs.
* Meta built a 'second brain' that 63,000 staff installed in three months. It started with one person: Meta's analytics team reports that an internal AI tool one of its data scientists started has been installed by 63,000 employees within three months, with no top-down mandate and no transformation programme. One person built something useful and the rest of the company found it. The standing question for readers: what are you doing to encourage and enable this at your firm?
* When cheap models do make sense: Ethan Mollick argues for hierarchies in which smart models supervise cheap ones: the smart one checks the plan, the cheap one does the volume. Right for machine pipelines running thousands of low-stakes calls; for your own judgement work, the essay's time-saved maths says buy the best. The item is the deliberate counterweight to the essay's argument, marking where it does and does not apply.

WHAT TO TRY

* Ask the AI to write the marking scheme before it writes the answer: Ben Yoskovitz, a non-developer who ships production software with AI, has the model write three to five specific pass-or-fail checks before any non-trivial task, approves them, and then has the AI do the job, grade itself against each check, cite the evidence and stop the moment one fails. Ninety seconds of discipline against the long loop of 'looks good' followed by 'wait, no it isn't'.
* Plan in one session, then start a fresh one to build: Tiago Forte treats the first AI session as planning-only when the goal is a concrete document or deck: work out the brief in chat one, never let it start drafting, then open chat two, paste the brief and build. A clean start gives a sharper model, the deliberate stop tests whether the brief holds together, and a crash in session two still leaves you with the brief.
* Keep your AI project lean: Someone at a training David ran asked why their Claude project had slowed over a few weeks; they had been piling reference files into it. The model reads everything in a project on every request, so book fifteen minutes on a Friday to prune, and never leave fat PDFs in there: have the model convert each one to plain text first, since PDFs burn far more reading capacity than the same content as text.

Read the full edition with all links and sources [https://steadman.ai/newsletters/david/#edition-2026-06-13]
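
The marking-scheme habit above (agree pass-or-fail checks first, grade against each one, stop at the first failure) might look like this in outline. It is only a sketch under stated assumptions: in the edition the model writes and grades the checks itself, whereas here each check is a plain Python predicate, and the check names, the sample draft and the `grade` helper are all hypothetical.

```python
from typing import Callable, List, Tuple

# A check is a (description, predicate) pair; the predicate returns True on a pass.
Check = Tuple[str, Callable[[str], bool]]


def grade(output: str, checks: List[Check]) -> Tuple[bool, List[str]]:
    """Run the checks in order and stop the moment one fails."""
    report: List[str] = []
    for name, passes in checks:
        if passes(output):
            report.append(f"PASS: {name}")
        else:
            report.append(f"FAIL: {name}")
            return False, report  # later checks are deliberately not run
    return True, report


# Hypothetical checks, approved before the task starts.
checks: List[Check] = [
    ("mentions the deadline", lambda text: "deadline" in text.lower()),
    ("names an owner", lambda text: "owner:" in text.lower()),
    ("under 50 words", lambda text: len(text.split()) < 50),
]

ok, report = grade("Deadline is Friday.", checks)

if __name__ == "__main__":
    print(ok)
    for line in report:
        print(line)
```

With this sample draft the second check fails, so the third is never evaluated, which mirrors the "stop the moment one fails" rule.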

13 Jun 2026 · 13 min

The open door

WHAT HAPPENED THIS WEEK

* The CEO of a 350,000-person IT services firm says AI is hollowing out the middle, not the bottom: Ravi Kumar, chief executive of Cognizant, told Fortune's COO Summit on 1st June that his company hired 20,000 entry-level graduates last year and expects to hire more in 2026, with new 'Frontier Business Operator' and 'Frontier Certified Engineer' roles defining AI-era work. He called job-extinction talk 'fearmongering' and argued AI thins middle management while entry-level and leadership roles persist. It's a direct counter to the consensus that entry-level work vanishes first (including the US Bureau of Labor Statistics data Edition 14 leaned on) and a real-world data point for the essay's bet on graduates.
* The machine is writing the code now, and the gains are pooling at the top: Tobi Lütke says one in eight pull requests merged at Shopify are now written by River, its in-house agent, not an engineer. Anthropic's own engineers ship roughly eight times the code per person they did before 2025. Cursor's developer report shows the output gap widening, with top developers pulling far ahead of the median. And OpenAI's Codex has passed five million weekly users, with non-developer adoption growing three times faster than developer adoption. The grunt of writing code is moving to the machine, the output is multiplying, and the reward is concentrating in the people who know what to ask of it.
* Capability is outrunning even the best forecasters: The Forecasting Research Institute asked expert forecasters and superforecasters how long a task a model would reliably finish by the end of 2026, measured on METR's time-horizon benchmark (about an hour and a half when the survey launched). All three groups put the end-of-2026 figure between three and four hours. Then, while the survey was still running, a frontier model in preview hit three hours and six minutes, already inside the range they'd picked for year-end. The forecast was overtaken before they'd finished making it. Cloudflare's Matthew Prince made the same public miss: bots passed humans in web traffic for the first time, years ahead of his own late-2027 estimate, though much of that is scraping rather than agents, so the figure is softer than it sounds.

WHAT TO TRY

* Ask the AI to orient itself in your folder before you ask it anything else: Setting up a Claude Cowork project for a documentary filmmaker new to the tool this week, David's first prompt wasn't about the work. It was: 'Read the files and sub-folders, write yourself a little set of instructions for future chats. Write yourself a navigation guide.' Claude explored, then saved itself three memory files: a project overview, a folder guide, and a profile of the user. Every later chat in that project started smarter because it could re-read its own notes.
* Don't say 'always allow' on the verbs you can't undo: In a coaching conversation about agent permissions this week, the familiar pattern: three prompts in, most people click 'always allow' on everything just to stop the interruption. The discipline is to sort the verbs first. Reading, listing, searching: leave on auto. Deleting, sending, posting, spending: keep asking every time. Sorting the verbs first is what keeps one careless click from emptying a folder, sending an email you can't unsend, or running up a charge you didn't mean.
* Talk longer on the call, so the AI can work for hours after: On a client call this week the brief changed mid-conversation. David deliberately used more words than he otherwise would, narrating the bridge between the old direction and the new one. The over-explaining was really for the AI: context he was laying down for it to pick up later, once the transcript was in the project. The transcript plus 'go' did the work. Next time you're on a call that's being transcribed, elaborate a little more and narrate the why, not just the what.

Read the full edition with all links and sources [https://steadman.ai/newsletters/david/#edition-2026-06-06]
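
The "sort the verbs first" rule for agent permissions can be written down as a small policy table. The sketch below is illustrative only and does not correspond to any particular agent tool's settings: the verb lists come from the edition, while the function name `permission_for` and the choice to prompt on unrecognised verbs are assumptions.

```python
# Verbs the edition says can be left on auto: nothing is changed or sent.
AUTO_ALLOW = {"read", "list", "search"}

# Verbs the edition says should prompt every time: they cannot be undone.
ALWAYS_ASK = {"delete", "send", "post", "spend"}


def permission_for(verb: str) -> str:
    """Return "auto" for reversible verbs and "ask" for everything else."""
    verb = verb.strip().lower()
    if verb in AUTO_ALLOW:
        return "auto"
    # Irreversible verbs always prompt. Unrecognised verbs prompt too;
    # that default is an assumption, not something the edition specifies.
    return "ask"


if __name__ == "__main__":
    for verb in sorted(AUTO_ALLOW | ALWAYS_ASK) + ["rename"]:
        print(f"{verb}: {permission_for(verb)}")
```

Defaulting unknown verbs to "ask" keeps a new, unclassified action from slipping through on auto.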

6 Jun 2026 · 12 min

How We Got Here

WHAT HAPPENED THIS WEEK

* AI can now find software vulnerabilities faster than humans can patch them. Discovery is no longer the hard part; verification is: A frontier model handed to fifty cybersecurity partners (Anthropic's Glasswing initiative) surfaced more than ten thousand critical or high-severity vulnerabilities in the systems it was pointed at. Cloudflare has roughly four hundred major bugs to work through; Palo Alto Networks shipped five times more patches than its usual release cadence. Maintainers have asked the developers to throttle the discovery rate because there are not enough security professionals to close the gaps before attackers find them. Software security used to be limited by how fast new vulnerabilities could be found. It is now limited by how fast humans can verify, disclose and patch them.
* A general-purpose AI model has autonomously disproved a 1946 conjecture in geometry. Independent mathematicians have verified the proof: OpenAI handed a general-purpose reasoning model a long-held belief tied to a 1946 planar unit-distance problem of Erdős, the prolific Hungarian mathematician, and the model produced a disproof. Other AI models have since solved further long-standing problems; the wrinkle is that the others were purpose-built for mathematics. OpenAI's was not. Machines now clear the tractable tail of problems fast, pushing the human frontier towards what still resists them. After AlphaGo beat the world's best human Go players in 2016, the skill of human Go players noticeably improved. Noam Brown, an OpenAI researcher who helped build its reasoning models, suspects the same pattern will play out in maths. And then, perhaps, in business.
* Generative AI use among American adults has hit 58 per cent in four years. The personal computer took sixteen years: The Federal Reserve's February 2026 Real-Time Population Survey of working-age adults puts overall adoption at 58 per cent, up from around 45 per cent in October 2024 but recently flat. Work use is 44 per cent. Non-work use is 51 per cent. Daily use sits at 14 per cent and saves an estimated 2.2 per cent of total work hours. Alfred Lin, a partner at Sequoia Capital, notes this is the penetration level the personal computer took sixteen years to reach: a four-fold acceleration on the closest analogue. The caveat is the plateau. The early-adopter phase is over. The hard part starts.

WHAT TO TRY

* Ask the model what else it needs to know: The most productive sessions David sees with senior leaders don't open with a clever prompt. They open with the user pasting context (role, situation, what they're trying to do) and then asking two questions. First: "what else do you need to know about me to help me well?" Second: "what could you do for me right now that I haven't asked for?" The first surfaces gaps the user wouldn't have spotted. The second produces use cases the user didn't bring. The blank-prompt paralysis dissolves.
* Either AI challenges you at the start, or you challenge AI at the end. Don't skip the challenge: A leader David sat with this week had let an AI output stand without pushing back. Two patterns work and one fails. You can challenge before the model starts: share your point of view and ask the model to challenge it, force it to surface holes and the strongest counter-argument. Or you can challenge after the first draft: force a rewrite, name what's wrong, make it earn the second pass. The pattern that fails, and the one David sees most, is read-and-accept.
* Ramble into the microphone, let the machine find the structure: A managing director David coached this week outlined obsessively: 700 words of outline for a 1,500-word article. The cost of structuring before writing was eating his weekend. The fix was inverting the order. Pick up your phone, dictate the mess, paste the transcript into your model, ask it for the through-line. Structure becomes the cheap thing. Particularly powerful for executives who think by talking, which is more common at the top than people admit.

Read the full edition with all links and sources [https://steadman.ai/newsletters/david/#edition-2026-05-30]

30 May 2026 · 13 min

Kids these days

WHAT HAPPENED THIS WEEK

* AI displacement now shows up in the US government data at both ends of the career ladder: A Bloomberg analysis of new BLS figures finds every one of the eighteen occupations the BLS classifies as AI-exposed has lost jobs over the past year, even as US payrolls grew 0.8% overall. Customer service representatives shed 130,180 jobs, 4.8% in a single year; interpreters down 24% over three years; credit authorizers down 26%. The exception that confirms the rule: medical secretaries up 15.8%, the cluster that needs a body in the room. The same picture shows up at the other end of the funnel. The Economist this month plotted US graduate full-time employment against AI exposure: computer science and information sciences graduates are down 10 to 15 percentage points since 2022; philosophy and psychology graduates held steady or gained. The displacement isn't just to the people already doing those jobs. It's to the people trying to start in them, and what they should be studying may not be obvious to anyone yet, a thread the essay returns to via Elliott's homework.
* The UK's data regulator has put AI hiring tools on formal notice. Sixteen organisations have already had a letter: The Information Commissioner's Office issued formal guidance this week saying that AI-driven CV screening, candidate ranking, and video interview analysis without "meaningful human involvement at every consequential stage" may already breach UK data protection law. Sixteen organisations have been written to directly. The consultation closes on 29th May, six days after the edition lands. A concrete Monday-morning task for any leader running a hiring pipeline: get the full list of AI tools in use across the funnel, decide which involvements count as "meaningful" against the ICO's test, and put a response into the consultation. The window is genuinely short.
* Salesforce will spend close to $300 million with Anthropic this year. Marc Benioff says the engineering productivity gains made it the easiest line in the budget: Marc Benioff disclosed that Salesforce is on track to spend close to $300 million with Anthropic over 2026, with most of the spend on coding, justified by engineering productivity gains of more than 30%. Separately, Anthropic announced a $200 million partnership with the Gates Foundation focused on global health. A Fortune 100 chief executive treating the model layer as a procurement line item, not a research expense. The bigger question is who in your firm is allowed to commit that kind of capital, against what kind of evidence, and how quickly.

WHAT TO TRY

* When the output goes wrong, shrink the task: Justin Skycak put it as a principle for skill acquisition this week: shrink the unit of practice until the mistake has nowhere to hide. The same rule applies to working with language models. Sprawling prompts produce sprawling failures you can't diagnose. Break the task into its smallest meaningful unit, run it, inspect the output, then rebuild. If you can't immediately see where it went wrong, your chunk is still too large.
* Ask AI questions it can't possibly know the answer to: A marketing lead at a global firm told David this week she's running a five-minute stress-test on every AI tool she's thinking of trusting. She uploads her own data, asks the model to use only that data, then asks it questions she knows the data can't answer. Some models fabricate regardless ("53% of women in the northeast states feel..."). She's learned what its confident-but-wrong mode looks like before depending on it for an answer she can't independently check. Worth doing once on every tool you rely on.
* Run your day past AI before you start it: A senior leader described her commute habit to David this week. She opens Claude, asks it to review her calendar and her email, then asks it to surface what she needs to read before each meeting, what's carrying over from yesterday, and which emails in her inbox need replies before the day eats her. Five minutes on the train, and the day is scoped from outside her own head. "Just a nice little daily habit," she said. Try it tomorrow.

Read the full edition with all links and sources [https://steadman.ai/newsletters/david/#edition-2026-05-23]

23 May 2026 · 11 min