Duane Forrester Decodes

AI Search Runs on Two Memory Systems. The Platforms Don’t Use Them the Same Way.

14 min · Yesterday

Description

Referenced in this episode:

  • When the Training Data Cutoff Becomes a Ranking Factor (Duane Forrester Decodes): https://duaneforresterdecodes.substack.com/p/when-the-training-data-cutoff-becomes The companion piece this episode builds on, where I first laid out the parametric-versus-retrieval distinction and what it means for timing.
  • How Perplexity finds and chooses its sources (Search Engine Journal): https://www.searchenginejournal.com/perplexity-ai-interview-explains-how-ai-search-works/565395/ Background on why Perplexity runs a live search on essentially every query rather than answering from memory.
  • Google's AI optimization guidance, and why AI Search is still Search (DemandSphere): https://www.demandsphere.com/blog/google-ai-optimization-guide-ai-search-is-still-search/ Support for the point that AI Overviews and AI Mode are served off the core Search index, not from Gemini's parametric memory.
  • Claude web search tool documentation (Anthropic): https://platform.claude.com/docs/en/agents-and-tools/tool-use/web-search-tool Primary source showing Claude's web search runs as a tool the model invokes only when it decides a question needs it.
  • Manage public web access in Microsoft 365 Copilot (Microsoft Learn): https://learn.microsoft.com/en-us/microsoft-365/copilot/manage-public-web-access The admin control behind the point that, on Copilot, whether retrieval happens at all can be a tenant policy setting.
  • Stop Treating AI Visibility as One Problem (Duane Forrester Decodes): https://duaneforresterdecodes.substack.com/p/stop-treating-ai-visibility-as-one The earlier governed-visibility piece this episode zooms into, treating retrieval as one of three layers to manage.
  • ChatGPT search behavior, clickstream insights (Semrush): https://www.semrush.com/blog/chatgpt-search-insights/ The study behind the stat that ChatGPT's share of search-triggering sessions swung between roughly 15 and 66 percent as models updated.
  • Lost in the Middle: How Language Models Use Long Contexts (arXiv): https://arxiv.org/abs/2307.03172 The foundational research on models using long context unevenly, behind the point that being retrieved isn't the same as being used well.
  • How up to date is ChatGPT, and how knowledge cutoffs work (JustDone): https://justdone.com/blog/ai/how-up-to-date-is-chatgpt Context for the training-cadence point that providers now ship frequent point releases, each carrying its own cutoff.
  • The Machine Layer (Amazon): https://www.amazon.com/Machine-Layer-Visible-Trusted-Search/dp/B0G2WZKM59/ref=sr_1_1 My book, for the longer argument on why visibility, trust, and machine-readability are converging into one problem.

Get full access to Duane Forrester Decodes at duaneforresterdecodes.substack.com/subscribe [https://duaneforresterdecodes.substack.com/subscribe?utm_medium=podcast&utm_campaign=CTA_4]

All episodes

8 episodes

AI Search Runs on Two Memory Systems. The Platforms Don’t Use Them the Same Way.

Yesterday · 14 min

You Can Finally Measure Content Alignment. That’s the Dangerous Part.

References from this episode:

  • When I mentioned Gerard Salton's SMART system at Cornell, the foundational vector space model work from the 1960s, here's the background on that: https://en.wikipedia.org/wiki/SMART_Information_Retrieval_System
  • The Netflix study I referenced on cosine similarity producing arbitrary results in embedding models, that's the 2024 paper from Steck, Ekanadham, and Kallus: https://research.netflix.com/publication/is-cosine-similarity-of-embeddings-really-about-similarity
  • The MTEB benchmark leaderboard, where you can see the performance spread across current embedding models: https://huggingface.co/spaces/mteb/leaderboard
  • Goodhart's Law, the "when a measure becomes a target it ceases to be a good measure" concept: https://en.wikipedia.org/wiki/Goodhart%27s_law
  • The vector index hygiene piece I referenced from last year: https://duaneforresterdecodes.substack.com/p/vector-index-hygiene-a-new-layer

The written version of this episode is available as the full article on this same Substack.

May 31, 2026 · 19 min

You’re Using AI at the Execution Layer. The Value Is in the Judgment Layer.

The tools are deployed. The licenses are paid. And if you’re a senior SEO or GEO practitioner right now, you’re probably using AI every day - for drafts, for summaries, for first passes at content that used to take twice as long. That’s real productivity, and it’s not nothing. It’s also not the return the investment is capable of producing. And the gap between what you’re getting and what’s available isn’t a tool problem. It’s a mode problem.

A peer-reviewed study published at the 2025 ASIS&T Annual Meeting [https://asistdl.onlinelibrary.wiley.com/doi/10.1002/pra2.1253] by Tim Gorichanaz at Drexel University gives that problem a name (h/t to Shari Thurow [https://www.linkedin.com/in/shari-thurow/] for pointing me at this paper!). Analyzing 205 real-world ChatGPT use cases, Gorichanaz identified six distinct modes in which people actually use AI: Writing, Deciding, Identifying, Ideating, Talking, and Critiquing. The data came from Reddit and skews Anglophone, which limits its generalizability, but the taxonomy it produced maps uncomfortably well onto how most practitioners are actually working.

Two modes dominate. Four are being left on the table. The four being left are the ones that determine whether AI makes you more strategically valuable or just faster at execution-layer work. That distinction matters more right now than it has at any prior point in this industry’s history.

The Two Modes Everyone Defaults To

Writing was the largest category in Gorichanaz’s data at 47% of observed use cases - drafting, editing, summarizing, translating, generating. McKinsey’s 2025 State of AI survey [https://www.mckinsey.com/capabilities/quantumblack/our-insights/the-state-of-ai] confirms this at the enterprise level: the most commonly reported AI use cases are content drafting and information capture, and 63% of organizations using generative AI apply it primarily to create text.
Identifying - explaining something, answering a factual question, summarizing a document - was another 10% of the study’s data, and represents the other pillar most practitioners have built their AI workflow around. Research a topic, get a synthesis, move to the next task. Together these two modes account for the overwhelming majority of how AI is being used, by practitioners and enterprises alike. Both have real value, yet neither is where the leverage is. And if your AI practice begins and ends there, you’re using an increasingly sophisticated tool to do work that was already being automated - just faster and at higher volume.

The other four modes (Deciding at 21% of Gorichanaz’s sample, Ideating at 9%, Talking at 8%, and Critiquing at 6%) are where the work becomes irreplaceable. They’re also where almost no practitioner has built a deliberate workflow, because nobody handed them one, and the pressure to show immediate output has consistently crowded out the space to develop one.

The Decisions You’re Still Making Alone

In the practitioner’s week, Deciding-mode questions are everywhere: which queries actually have AI visibility exposure worth prioritizing right now, whether a brand’s retrieval problem is a content architecture problem or a sourcing and signal problem, how to allocate effort across a portfolio when both SEO and GEO need attention and the budget doesn’t stretch to cover both fully, when to escalate a visibility concern to leadership versus when to fix it in the work before anyone asks.

Most senior practitioners are currently solving these questions with experience and intuition. That’s not a failure: experience and intuition are genuinely valuable, and no AI replaces them. But AI used deliberately in Deciding mode adds something experience can’t provide on its own: a structured pressure-test of the assumptions underneath the decision, applied before the decision hardens. That requires more than a good question.
Deciding mode requires giving the AI the relevant context (competitive landscape, current visibility posture, historical performance, strategic constraints) and then treating what comes back as a genuine input to the decision rather than a draft to be skimmed and set aside. It requires a workflow that doesn’t yet exist in most practitioners’ practice, not because anyone blocked it, but because no one built the time or structure for it either.

The same McKinsey data makes clear what that gap costs at scale: 88% of organizations use AI, but only 6% qualify as high performers generating meaningful enterprise-wide impact, and high performers are 3.6 times more likely to have fundamentally reworked their workflows rather than simply deployed tools into existing ones. The pattern holds at the practitioner level. Faster output from an unreconstructed workflow is not the same thing as better decisions from a restructured one.

The Gaps Nobody Briefed

For SEO and GEO practitioners, Ideating mode has a specific application that most are not using and most should be: mapping the entity and authority gaps the brand hasn’t recognized yet. What angles of topical authority has the brand failed to establish that AI retrieval systems are currently filling from other sources? What community signals (forum discussions, aggregated reviews, third-party commentary) are shaping how LLMs represent the brand in response to category queries, and what would it take to shift them? What framings of the brand exist in model training data that the brand’s own content has never addressed or countered?

These are genuinely Ideating-mode questions. They’re also questions most practitioners have some version of in the back of their mind without a structured method for surfacing the answers.
AI used in Ideating mode, not “give me five content ideas” but a genuine iterative exploration with deliberate constraints and real willingness to follow the output somewhere the team hasn’t already been, is one of the most direct methods available for finding those gaps before a competitor or a client audit finds them first.

The barrier isn’t capability. It’s the difference between a Writing prompt with a list output and an actual Ideating session. The first takes two minutes. The second takes twenty, requires a different posture toward the tool, and produces something that can’t be replicated by anyone who didn’t do it. That asymmetry is where practitioner value gets built in the current environment, and most practitioners are not claiming it.

The Honest Read Your Team Won’t Give You

This is the mode with the most direct application to daily practice and the most organizational resistance, because it requires using AI to find problems in work the practitioner or their team has already invested in.

Used properly, Critiquing is how a senior practitioner catches what internal review missed. The weak entity claim in a content strategy that sounds authoritative but isn’t backed by the sourcing AI retrieval systems actually trust. The gap between what the brand says about itself across owned properties and what a well-prompted LLM surfaces when asked a category question the brand should own. The assumed premise in a GEO recommendation that made sense six months ago and is now contradicted by how retrieval patterns have shifted.

That last application is not abstract. Running your own brand (or a client’s brand) through a structured AI Critiquing session before the next strategy cycle is exactly the kind of proactive work that separates practitioners operating at the judgment layer from practitioners operating at the production layer.
It’s also the kind of work that changes the conversation with a client or a leadership team, because you’re surfacing problems before they become visible in the data rather than explaining them after the fact.

The reason Critiquing is underused isn’t a governance problem. It’s a disposition problem. Organizations and practitioners have broadly trained themselves to use AI to produce output, not to interrogate it. Reversing that habit is a choice, and it’s one of the more consequential choices available to a senior practitioner right now.

Rehearsal

The Talking mode in Gorichanaz’s taxonomy covers AI as a conversation partner, and for practitioners, the most valuable version of that is rehearsal for the internal and client conversations where the stakes are real.

The client call where you have to explain why organic traffic is down 30% while AI search visibility is also poor, and you need to hold two separate causal explanations simultaneously without letting them collapse into a single narrative that oversimplifies both. The internal briefing where you have to make the case for GEO investment alongside existing SEO budget to a leadership team that still conflates the two disciplines and wants a single number that explains the ROI of both. The agency or vendor review where you need to push back on a recommended approach without losing the relationship.

These conversations are recurring and high-stakes, and most practitioners walk into them with only their own mental rehearsal as preparation. Talking mode (role-playing the pushback, asking the AI to argue the other side, running through the version of the conversation that goes wrong) is not a replacement for experience. It is a preparation method that costs twenty minutes and materially changes the quality of the practitioner who walks into the room. It doesn’t produce an artifact. It doesn’t show up in a utilization report.
EY’s 2025 Work Reimagined Survey [https://www.ey.com/en_us/insights/workforce/work-reimagined-survey], which covered 15,000 employees and 1,500 employers across 29 countries, found that 88% of employees use AI at work, but only 5% use it in ways that fundamentally transform what they produce. The reason that gap is so wide is almost certainly that the advanced modes - Critiquing, Deciding, Talking - don’t produce something measurable in the moment. They produce a better practitioner over time, which is a return that compounds and doesn’t appear in a dashboard.

What Mode You’re In Is What Layer You’re On

The six-mode taxonomy maps almost exactly onto the split between execution-layer work and judgment-layer work. Writing and Identifying are execution-layer modes. They’re valuable, they’re visible, and they’re increasingly the modes that AI handles with less and less human involvement. Deciding, Ideating, Critiquing, and Talking are judgment-layer modes. They’re where the practitioner’s irreplaceability lives.

A senior SEO or GEO practitioner who uses AI only in Writing and Identifying mode is, functionally, positioning themselves as an execution-layer worker at exactly the moment when AI is most aggressively compressing that layer. That’s not a prediction about job displacement. It’s an observation about professional differentiation. The practitioners building durable value in this environment are the ones using AI to make their judgment better, not just their output faster.

Gorichanaz’s study reframes what information need actually means in the AI era, not just question-answering or uncertainty reduction, but what the author calls skillfully coping in the world, meaning the ongoing application of practical intelligence to situations requiring both understanding and action. For a senior practitioner, that framing is a useful diagnostic. The question isn’t what AI can do.
It’s which parts of your work require the kind of practical intelligence that compounds with experience, and whether your current AI practice is making that intelligence sharper or just making everything around it move faster.

McKinsey’s workplace research [https://www.mckinsey.com/capabilities/tech-and-ai/our-insights/superagency-in-the-workplace-empowering-people-to-unlock-ais-full-potential-at-work] finds that only 1% of leaders call their companies mature on AI deployment, meaning AI is fully integrated into workflows and driving substantial business outcomes. The practitioner-level version of that gap is just as wide, and just as fixable.

If you mapped your actual AI usage against the six modes this week (not what you intend to do, what you actually did), how would the distribution look? How much was Writing and Identifying? How much was Deciding, Ideating, Critiquing, Talking?

The practitioners who close that gap deliberately, who build even a minimal workflow around the judgment-layer modes, are not doing something exotic. They’re doing something most of their peers are not. In a discipline where the execution layer is getting compressed by the same tools everyone has access to, that gap is the one worth closing first.

If this framing connects to work you’re navigating, I’d like to hear about what you’re seeing. And if you want to go deeper on the structural layer beneath all of this, The Machine Layer [https://www.amazon.com/Machine-Layer-Visible-Trusted-Search/dp/B0G2WZKM59/ref=sr_1_1] is where that conversation continues.

And before you go, just a heads up that I have a special announcement coming on Tuesday this week, just 2 days from now! You’ll get an extra email and podcast this week. Thanks for your time, everyone, and I’ll be back soon.

May 24, 2026 · 17 min

LLM Guidance Doesn’t Port the Way SEO Guidance Did

For roughly two decades, the SEO discipline operated on a quiet assumption that turned out to be one of its most valuable features. Guidance from one search engine traveled. If Google said sitemaps mattered, Bing said sitemaps mattered. If Bing said structured data deserved real effort, Google said the same. Practitioners optimized for Google with reasonable confidence that the work would carry across the other engines, and most of the time it did.

That portability was not luck. It was the product of a structurally large overlap layer that the major search engines had jointly built, brick by brick, over twenty years.

That world doesn’t exist in LLM-land. The major providers train on different corpora, run different crawlers under different policies, route different queries through different retrieval systems, and apply different alignment processes that shape the final response in ways the upstream signals can’t predict. Guidance from any one provider, including Google’s guidance about its own Gemini products, is one data point. Practitioners carrying the SEO habit forward, the habit of treating one engine’s guidance as roughly the whole map, will optimize confidently for one platform and miss the others.

Sidebar: As I was finalizing this piece, Google published fresh guidance on optimizing for their generative AI features [https://developers.google.com/search/docs/fundamentals/ai-optimization-guide]. Their framing is explicit: from Google Search's perspective, optimizing for AI search is still SEO. That framing is accurate for Google Search. It does not extend to ChatGPT, Claude, Perplexity, or any other LLM, and that is precisely the trap this article is about.

The shared standards that made SEO guidance portable

The era of portable guidance was built on actual collaboration, not coincidence.
The Sitemaps protocol [https://www.sitemaps.org/] became the joint property of Google, Yahoo, and Microsoft in November 2006, when the three engines formally agreed to support a common protocol at version 0.90, building on Google’s earlier Sitemaps 0.84 from June 2005. Five years later, on June 2, 2011, the same three engines launched Schema.org [https://blogs.bing.com/search/June-2011/Introducing-Schema-org-Bing,-Google-and-Yahoo-Uni], with Yandex joining shortly after, to create a common vocabulary for structured data markup. That was the announcement that got made on stage at SMX Advanced. I was on the Bing team at the time, and what struck me then is what still matters now. The engines were competitors, but they had decided that a shared vocabulary served them all. Webmasters got one set of rules. The web got cleaner data. The engines got better signals. Everybody won.

The pattern repeated with robots.txt [https://www.rfc-editor.org/rfc/rfc9309.html], the 1994 convention that became RFC 9309 at the IETF in 2022, formalizing what every serious crawler already honored. And it repeated again, more recently, with IndexNow [https://www.indexnow.org/], the protocol Microsoft Bing and Yandex launched in October 2021. IndexNow is now supported by Bing, Yandex, Naver, Seznam, and Yep. Google has tested the protocol since 2021 but has not adopted it.

That overlap layer is exactly why Google’s guidance felt safe to follow even if you cared about Bing traffic. The signals the engines used were not identical, but the inputs they accepted, the protocols they honored, and the standards they advertised were. Optimization had a shared substrate.

Where the LLM stacks actually diverge

The LLM environment doesn’t have a shared substrate of comparable size. The differences are not cosmetic and they are not temporary. They are baked into how the systems are built.

Start with training data.
OpenAI has signed disclosed licensing deals with News Corp worth up to $250 million over five years [https://everything-pr.com/ai-licensing-tracker/], Axel Springer [https://openai.com/index/axel-springer-partnership/] at roughly $13 million per year, Reddit [https://www.cjr.org/analysis/reddit-winning-ai-licensing-deals-openai-google-gemini-answers-rsl.php] at an estimated $70 million per year, plus the Financial Times, Condé Nast, Hearst, Vox Media, The Atlantic, the Associated Press, Le Monde, and others. Google has its own Reddit deal estimated at $60 million per year granting real-time data API access. Anthropic has not publicly disclosed equivalent publisher licensing deals, and that undisclosed status is itself the practitioner-facing point. The corpora that fed these models, and that continue to refresh them, are not the same documents. Practitioners cannot know what any given provider has paid for and what it hasn’t.

The crawler infrastructure diverges next. OpenAI runs three separate bots [https://platform.openai.com/docs/bots]: GPTBot for training, OAI-SearchBot for search indexing, and ChatGPT-User for user-initiated retrieval. Anthropic runs three of its own [https://support.anthropic.com/en/articles/8896518-does-anthropic-crawl-data-from-the-web-and-how-can-site-owners-block-the-crawler]: ClaudeBot for training, Claude-SearchBot for search, and Claude-User for user-initiated retrieval. Perplexity runs PerplexityBot and Perplexity-User. Google introduced Google-Extended in September 2023 as the user-agent that controls whether Google can use a site’s content to train Gemini, separate entirely from the Googlebot that handles traditional search indexing. There is no single AI user-agent. Every provider requires a separate rule, and the rules don’t translate cleanly across providers because the bots don’t do equivalent jobs in equivalent ways.

The retrieval architectures diverge structurally.
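
To make the crawler point concrete, here is a minimal sketch of what "a separate rule per provider" looks like in practice. The user-agent tokens are the ones named in this piece; the policy itself (block training bots, allow search and user-initiated bots) and the example.com URL are hypothetical, chosen only for illustration. It uses Python's standard-library robots.txt parser, which approximates but does not guarantee how any given crawler interprets the file.

```python
from urllib.robotparser import RobotFileParser

# User-agent tokens as named in the article. The split into "blocked" and
# "allowed" is a hypothetical policy, not a recommendation.
TRAINING_BOTS = ["GPTBot", "ClaudeBot", "Google-Extended"]
SEARCH_AND_USER_BOTS = [
    "OAI-SearchBot", "ChatGPT-User",
    "Claude-SearchBot", "Claude-User",
    "PerplexityBot", "Perplexity-User",
]

def build_robots_txt(blocked, allowed):
    """Emit one robots.txt group per bot, since no shared AI user-agent exists."""
    groups = [f"User-agent: {bot}\nDisallow: /" for bot in blocked]
    groups += [f"User-agent: {bot}\nAllow: /" for bot in allowed]
    return "\n\n".join(groups) + "\n"

def audit(robots_txt, bots, url):
    """Return {bot: True/False} for whether each bot may fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {bot: parser.can_fetch(bot, url) for bot in bots}

robots_txt = build_robots_txt(TRAINING_BOTS, SEARCH_AND_USER_BOTS)
report = audit(
    robots_txt,
    TRAINING_BOTS + SEARCH_AND_USER_BOTS,
    "https://example.com/pricing",  # hypothetical page
)
for bot, ok in report.items():
    print(f"{bot:18} {'allowed' if ok else 'blocked'}")
```

Running the same audit against your live robots.txt is a quick way to confirm that a rule written for one provider's bot has not been assumed to cover another's. A bot with no matching group falls through to the default, which here is allowed.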
ChatGPT has historically used Bing’s index [https://yoast.com/chatgpt-search/] as its primary web search source, and that connection appears to still be primary, though OpenAI continues to build out additional infrastructure alongside it. Perplexity built its retrieval system on a Vespa-based pipeline that treats documents and sub-document chunks as first-class retrievable units. Google’s Gemini uses Google’s own index plus Knowledge Graph grounding. Claude uses Brave Search as a retrieval partner. Same query, four different retrieval systems, four different views of which sources exist and which sources are worth surfacing.

Then comes the alignment layer, which is where SEO had no equivalent at all. After a model is trained on its corpus, providers run post-training to shape how the model actually behaves: tone, refusal patterns, format, safety posture, what counts as a good answer. OpenAI’s primary approach has been RLHF, or Reinforcement Learning from Human Feedback [https://arxiv.org/abs/2203.02155], where human raters score model outputs and the model learns to produce highly rated responses. Anthropic developed Constitutional AI, which trains models to critique and revise their own outputs against a written set of principles. These methodologies produce demonstrably different behavior in the final products. The same retrieved content, fed into two models aligned by two methodologies, can yield two materially different responses about the same brand.

When one provider’s guidance demonstrably fails to port

The clearest single example of guidance that doesn’t port is llms.txt [https://llmstxt.org/]. Jeremy Howard of Answer.AI proposed the file in September 2024 as a markdown manifest, placed at a site’s root, that would guide LLMs to the most important content. The proposal got picked up across the SEO community. Yoast built a generator. Agencies added llms.txt creation to their service catalogs. Conference speakers declared it essential.
As of mid-2026, no major LLM provider has confirmed they consume the file [https://ahrefs.com/blog/what-is-llms-txt/]. Not OpenAI. Not Anthropic. Not Google. Server-log analyses across hundreds of thousands of domains show major AI crawlers don’t routinely request /llms.txt at all. Google’s John Mueller publicly compared it to the deprecated meta keywords tag [https://www.cshel.com/ai-seo/no-llms-txt-is-not-the-new-meta-keywords/]. Gary Illyes confirmed at Search Central Live in July 2025 that Google does not support llms.txt and is not planning to.

I’ve written about this elsewhere [https://duaneforresterdecodes.substack.com/p/llmstxt-the-webs-next-great-idea], so I won’t repeat the technicalities here. What matters for this argument is the structural lesson. Schema.org succeeded because three engines built it together and then enforced it together. llms.txt was proposed by one researcher, picked up by tooling vendors, and ignored by the platforms it was supposed to serve. The shared-standards model that gave SEO its portable guidance is not available to LLM practitioners at the same scale, because the platforms are not building the standards together. They are building their own pipelines.

The Gemini inversion

The cleanest illustration of how far guidance portability has degraded sits inside one company. Google publishes its own SEO documentation [https://developers.google.com/search/docs] at Search Central, the canonical guidance the industry has followed for two decades. Those documents emphasize traditional ranking signals, E-E-A-T, content quality, technical accessibility, and structured data. That guidance is still useful for Google Search itself.

Google also makes Gemini, the model that powers AI Overviews and Google’s separate AI Mode surface. And the citation behavior of those surfaces does not appear to track the guidance the same company publishes for its own search results.
In late 2024, roughly three-quarters of pages cited in AI Overviews [https://ahrefs.com/blog/ai-overview-citations-top-10/] also ranked in Google’s top 10 for the same query. By early 2026, after Google upgraded AI Overviews to Gemini 3 in January, Ahrefs analyzed 4 million AI Overview URLs and found that only 38% of cited pages also appeared in the top 10 for the same query. A separate BrightEdge analysis [https://www.searchenginejournal.com/google-ai-overview-citations-from-top-ranking-pages-drop-sharply/568637/] put the overlap closer to 17%. SE Ranking’s post-upgrade work found that Gemini 3 replaced approximately 42% of the domains previously cited under earlier model versions and generates 32% more sources per response.

The gap widens further when you look at Google’s AI Mode, which is a separate conversational surface that runs on the same Gemini family. Semrush data [https://whitehat-seo.co.uk/blog/ai-engines-comparison-citations] shows AI Mode and AI Overviews reach semantically similar conclusions 86% of the time, but cite the same URLs only 13.7% of the time. Only 14% of AI Mode citations rank in Google’s traditional top 10.

It appears, so far, that the canonical relationship has shifted. Google’s published SEO guidance is still the cleanest path to ranking in Google Search. But that ranking is no longer a reliable proxy for being cited by Google’s own AI surfaces. The same guidance, the same content, the same domain, can produce three meaningfully different outcomes across Google Search, AI Overviews, and AI Mode, even though all three live inside the same company.

The old playbook of following the search engine’s guidance and trusting that the engine’s other surfaces would behave consistently does not appear to be delivering the same returns it used to.

What still ports, and why it’s smaller than it looks

A universal layer does survive. Crawler accessibility still matters across every provider.
Primary-source factual content still wins more citations than aggregator restatement. Clean retrievable structure still helps every system understand what a page is about. Presence on the high-authority sources that all major LLMs disproportionately cite, Wikipedia, YouTube, Reddit, major news outlets, still functions as a force multiplier across platforms. Earning visibility on those sources gives content a chance to surface in any LLM that draws on them.

But the universal layer is much smaller than it was in the SEO era. Qwairy’s analysis of 118,000 AI responses [https://whitehat-seo.co.uk/blog/ai-engines-comparison-citations] across ChatGPT, Perplexity, Google AI Mode, and Claude found that only 11% of cited domains appeared across multiple platforms. The other 89% were platform-specific. A brand that wins citations on Perplexity may be largely invisible on Claude. A brand that’s a regular reference on ChatGPT may not show up in AI Overviews at all. The same content can be the right answer for one system and the wrong answer for the system next to it.

What this means for the work

The practical implication is not abandoning all hope. It is that practitioners need to stop treating any single LLM provider’s guidance as the universal map and start treating it as one input among several. Read what every major provider publishes about their own systems. Test your visibility across platforms, not just on the platform you happen to use most. Treat divergence as the default and overlap as the exception, not the other way around.

This is not how SEO worked, and the difference matters. The old reflex was to optimize for Google and trust the portability. The new reality is that following one LLM’s guidance, even Google’s guidance about Gemini, will leave you optimized for a slice of the landscape and potentially blind to the rest.
The discipline is being rebuilt on platform-specific work that didn’t exist in the SEO era, and the practitioners who recognize that first are going to spend the next two years setting the standards everyone else follows.

The overlap shrank. You now have more work than ever to accomplish.

If you have thoughts on where the divergence between providers is sharpest in your own work, drop a comment below or reach out directly. I’d genuinely like to hear what’s showing up in the data.

For more on how the AI search environment is reshaping the practitioner discipline, The Machine Layer [https://www.amazon.com/Machine-Layer-Visible-Trusted-Search/dp/B0G2WZKM59/ref=sr_1_1] is on Amazon.

Get full access to Duane Forrester Decodes at duaneforresterdecodes.substack.com/subscribe [https://duaneforresterdecodes.substack.com/subscribe?utm_medium=podcast&utm_campaign=CTA_4]

May 17, 2026 · 17 min