Claude doesn’t browse: if you’re not in its training, to it you don’t exist

Roberto Serra, 25 June 2026

Claude, one of the leading AI assistants used in business, doesn't browse the web by default when it answers: it only talks about what it read during training. If your brand wasn't already present on stable sources before that training, to Claude you simply don't exist, no matter how up to date your site is today. Every query from a prospect using Claude is a conversation in which your name never appears, while competitors who built a historical presence get cited in your place. Knowing where to build that trail completely changes the game.

Open Claude.ai, turn on web search, ask about your industry. Now count: how many sources include you in its report? Zero? We have a problem.

Because in its base version Claude doesn’t browse the web, and that’s the case on the free plans and in many API integrations, the only thing it can draw on to talk about your brand is its training data. If you’re not in the corpus Anthropic trained the model on, to Claude you simply don’t exist. Period.

In my articles on ChatGPT and Perplexity I explained how live retrieval works, the schema, freshness. With Claude the rule changes: what counts is historical presence on the web, not freshness. Let me explain why this flips the strategy for anyone who wants to show up in AI answers.

How Claude builds its answers when it doesn’t browse

In the world of large language models there’s a clear distinction: models with live retrieval (which read the web at the moment of the question) and models that answer from memory alone, that is, from the weights learned during training.

Claude, in its base form, belongs to the second category. Anthropic added web search as an optional feature, but it isn’t on by default across all plans and isn’t always present in the API integrations companies use to build their internal assistants.

When Claude answers without browsing, it’s performing a different operation than Perplexity. It doesn’t search, doesn’t retrieve, doesn’t compare current sources. It’s reconstructing an answer using what it absorbed during training: website texts, Wikipedia, academic publications, books, technical forums, open documentation.

Translated: if your brand wasn’t in the corpus at the time of training, Claude can’t cite you. Not because it doesn’t want to cite you — because it doesn’t know you.

Why this mechanism changes the strategy compared to ChatGPT and Perplexity

In the previous articles I told you how ChatGPT and Perplexity largely work through retrieval-augmented generation: they search the web, grab the pages, read them, build the answer. In that scenario what counts is content freshness, clean schema markup, indexing speed.

With Claude without browsing, the game changes. Your site might have perfect schema markup and an exemplary inverted-pyramid structure: if the training data never saw you, Claude doesn’t cite you. And vice versa: a competitor with a technically mediocre site but present on Wikipedia, with mentions in historical industry publications, will get cited.

From this follows a strong operational consequence: for Claude, historical presence and density of citations on stable sources matter more than recent technical optimizations. What counts is the signal you manage to leave on the web so that it gets picked up in the large training corpora.

The reverse engineering test you can run in 15 minutes

Let me walk you through the test I use to figure out whether a brand is “inside” Claude or not. It’s called reverse engineering Claude’s web search: even when Claude browses, the model tends to suggest and confirm what it already knows from training. If you’re not in the training, even web search struggles to surface you with authority.

Here are the steps.

1. Open Claude.ai, turn on web search.
2. Ask a neutral question about your industry (not your brand). Example for a producer of nduja and spicy cured meats in Vibo Valentia: “Which are the artisan producers of Calabrian nduja using traditional methods?”
3. Read the answer. Note the brands cited in the body of the text and the sources listed at the bottom.
4. Then ask explicitly: “Is there an nduja producer in Vibo Valentia called [your brand name]?”
5. Observe: does Claude confirm with specific details (location, process, products), give you a generic answer, or say it has no information?

If it answers with precise details → you’re in the training or in reliable sources that web search found quickly. If it answers vaguely or negatively → you’re not there, or you’re there too faintly.

This is an indicative test, not a study. The sample is small and Claude’s answer varies between sessions. But it’s a first step: real analysis requires professional tools and a series of queries repeated over time.
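
Since the answer varies between sessions, one way to make the test less anecdotal is to repeat the explicit brand question several times and log which of the three outcomes you got each time. Here is a minimal sketch of that bookkeeping. The labels `detailed`, `generic` and `unknown` are my own shorthand for the three cases above, you assign them by hand after reading each answer, and the sample log is invented for illustration.

```python
from collections import Counter

# Shorthand for the three outcomes of the explicit brand question.
# These labels are illustrative and assigned by hand, not returned by Claude.
OUTCOMES = ("detailed", "generic", "unknown")


def summarize_runs(labels):
    """Return the share of each outcome across repeated sessions."""
    if not labels:
        raise ValueError("need at least one labelled run")
    unexpected = set(labels) - set(OUTCOMES)
    if unexpected:
        raise ValueError(f"unexpected labels: {sorted(unexpected)}")
    counts = Counter(labels)
    return {outcome: counts[outcome] / len(labels) for outcome in OUTCOMES}


# Hypothetical log: the same question asked in five separate sessions.
runs = ["generic", "unknown", "generic", "detailed", "generic"]
shares = summarize_runs(runs)
print(shares)
```

A brand that lands mostly in `generic` or `unknown` across sessions is the "too faint" case. What share counts as worrying is a judgment call, not something this sketch decides for you.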

The test I ran on the Calabrian nduja industry

Let me tell you about a recent reverse engineering exercise. I took the industry of nduja and spicy cured meat producers in Vibo Valentia and Spilinga (which is the geographic heart of nduja DOP) and ran 12 different queries on Claude with web search on. Questions like: “best artisan nduja producers in Vibo Valentia”, “historic Calabrian spicy cured meat companies”, “where to buy authentic Spilinga nduja online”.

Four brands each appeared in at least 8 of the 12 answers, and they were always the same four. Another 5-6 producers showed up occasionally (2-3 times out of 12). The rest of the market, dozens of real companies with active e-commerce, agricultural VAT numbers and presence on Google Maps, never appeared.

I checked the four recurring brands. What did they have in common? A dedicated Wikipedia page or a citation on category Wikipedia pages (nduja, Calabrian cured meats). Mentions on Gambero Rosso, Slow Food, trade magazines with a public archive. Historical citations on Italian food blogs more than 10 years old.

The others, even those with excellent e-commerce sites and clean schema markup, weren’t there. Small sample, clear pattern: for Claude, the longevity of textual presence on the web counts more than current technical optimization.
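
If you save the text of each answer, the tally described above can be done with a few lines of code instead of by eye. This is a sketch, not the tool I used. The function names, the brand names and the sample answers are placeholders, and the "at least 8 out of 12" threshold is simply the one from this exercise.

```python
import re


def count_mentions(answers, brands):
    """Count in how many answers each brand name appears (case-insensitive)."""
    counts = {brand: 0 for brand in brands}
    for text in answers:
        for brand in brands:
            # Match the name only when it is not glued to other word characters.
            pattern = rf"(?<!\w){re.escape(brand)}(?!\w)"
            if re.search(pattern, text, re.IGNORECASE):
                counts[brand] += 1
    return counts


def bucket(count, total, recurring_share=8 / 12):
    """Label a brand by how often it shows up across the query set."""
    if count == 0:
        return "absent"
    return "recurring" if count / total >= recurring_share else "occasional"


# Placeholder data: three saved answers and three brands to look for.
answers = [
    "Producers often cited include Brand A and Brand B.",
    "Brand A is a historic name in Spilinga.",
    "Look for Brand A in specialty shops.",
]
brands = ["Brand A", "Brand B", "Brand C"]

counts = count_mentions(answers, brands)
labels = {brand: bucket(n, len(answers)) for brand, n in counts.items()}
print(labels)
```

Plain name matching misses spelling variants and abbreviations, so treat the counts as a floor rather than an exact figure.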

The mistakes I see most often among those who want to get into Claude’s training

Four patterns I see recur in practically every Italian B2C industry.

Betting only on an up-to-date company website. A freshly rebuilt site, however technically perfect, probably wasn’t in the model’s training. The “existence for Claude” signal is built on stable third-party sources, not on your domain.

Ignoring Wikipedia and Wikidata. These two are in the training of practically all commercial AI models. If your brand or your industry has no entries there (or isn’t linked from entries there), you’re leaving the strongest signal on the table. A large share of the models’ structured knowledge comes from Wikidata, as I explained in the article on the Google Knowledge Graph.

Confusing SEO traffic with presence in the corpora. Getting 50,000 visits a month from Google doesn’t mean being in Claude’s training. Anthropic’s crawlers and aggregators like Common Crawl collect according to their own logic. A niche blog with 2,000 visits a month but cited by Wikipedia weighs more than your e-commerce with 100k visits.

Not tending to “implicit” citations. The principle of implicit reference weight counts double for Claude: every mention of your brand in third-party texts, even without a link, builds your presence, just as backlinks work as a citation proxy for the models.
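
On the Common Crawl point: the project exposes a public URL index, so you can at least check whether a given crawl captured pages from a domain. The sketch below only builds the query URL and parses a response, without making the request itself. The crawl ID is an example you should replace with one from the list Common Crawl publishes, the sample response is invented, and I am assuming the usual newline-delimited JSON output with `timestamp` and `url` fields. A hit means a public crawl saw the page, not that any model was trained on it.

```python
import json
from urllib.parse import urlencode

# Example crawl ID: replace it with a current one from Common Crawl's list.
CRAWL_ID = "CC-MAIN-2024-10"


def build_index_query(domain, crawl_id=CRAWL_ID):
    """Build a URL index query covering every captured page under a domain."""
    params = urlencode({"url": f"{domain}/*", "output": "json"})
    return f"https://index.commoncrawl.org/{crawl_id}-index?{params}"


def parse_captures(body):
    """Turn a newline-delimited JSON response into (timestamp, url) pairs."""
    captures = []
    for line in body.splitlines():
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        # Skip lines that are not capture records (for example status messages).
        if "timestamp" in record and "url" in record:
            captures.append((record["timestamp"], record["url"]))
    return captures


# Invented sample of what a response body might look like.
sample_body = (
    '{"timestamp": "20240225101500", "url": "https://example.com/", "status": "200"}\n'
    '{"message": "not a capture record"}\n'
)

print(build_index_query("example.com"))
print(parse_captures(sample_body))
```

Fetch the query URL with any HTTP client and pass the response text to `parse_captures`. Running it against several crawl IDs from different years gives a rough picture of how far back your public trail goes.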

What to do concretely to be inside the training

It’s not a magic recipe and it isn’t enough on its own. But these actions move the needle.

Build or improve presence on Wikipedia: a brand entry if the brand meets Wikipedia’s notability criteria, otherwise mentions in category entries (for nduja: the entries “Nduja”, “Calabrian cuisine”, “Spilinga”).
Open or complete the entry on Wikidata: the brand entity with properties (location, founding, product category). It’s the entity backbone for Claude.
Publish or get cited on trade publications with a long-lasting public archive (Gambero Rosso, Dissapore, Slow Food, archived local magazines). A public archive means Common Crawl and similar can read them.
Tend to the author’s reputation as an entity: if the founder is a public figure with a history, their archived interviews also speak about the brand.
Take part in industry events with published proceedings or press coverage — the principle of event entity speaking authority multiplies mentions on stable sources.
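
For the Wikidata action, it helps to treat the three fields as a checklist before you open or edit the entry. A small sketch follows. I am mapping "location" to P159 (headquarters location), "founding" to P571 (inception) and "product category" to P452 (industry), which is one reasonable choice among several, and the draft values are invented.

```python
# Wikidata property IDs chosen for the three fields named above.
# The mapping is an assumption: other properties may fit a given brand better.
REQUIRED_PROPERTIES = {
    "P159": "headquarters location",
    "P571": "inception",
    "P452": "industry",
}


def missing_properties(claims):
    """Return the labels of required properties that have no value yet."""
    return [
        label
        for pid, label in REQUIRED_PROPERTIES.items()
        if not claims.get(pid)
    ]


# Invented draft for a hypothetical producer, keyed by property ID.
draft = {"P159": "Spilinga", "P571": "1985"}
print(missing_properties(draft))
```

Every statement you add on Wikidata should carry a reference to a public source, which is one more reason the trade-press work in the list above comes first.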

None of these is fast. Every Claude version, 3.5 or later, has a training cutoff date: what you publish today can only enter the training of future versions of the model, which means a delay on the order of 12-24 months. For this reason, unlike Perplexity where freshness has immediate value, with Claude you’re investing in the medium term.

Where we go from here in the AI platforms series

Showing up in AI answers isn’t a single game: every AI engine has different rules. Claude favors the deep memory of training, ChatGPT mixes memory and retrieval, Perplexity lives on pure retrieval, Gemini weighs Google’s graph. Understanding the difference lets you allocate budget where it’s needed: archived content for Claude, schema and freshness for Perplexity, Knowledge Graph for Gemini.

In the next articles of the AI platforms series we look in detail at how the retrieval strategy changes between ChatGPT and Perplexity, and how Gemini uses Search telemetry. The thread is always the same: understand the mechanism to decide where to invest to show up in AI answers.