How AI engines think

How AI Decides Which Words Matter Most on Your Page

Roberto Serra · 25 June 2026 · ~7 min read

Your company name appears everywhere on the site, but your industry keywords show up far away from it, in different contexts or on separate pages. For the AI, the association between your name and what you do is weak, and when it has to recommend an expert in your field, it picks whoever built that association more explicitly. What you are losing is the proximity between who you are and what you do. Restoring it in the content you already have can substantially change how the AI positions you in your industry.

Your content has 2,000 words. The AI does not weigh them all the same way. Some receive a very high weight, others are practically ignored. The mechanism that decides this weight is called attention, and it is at the heart of how AI models select the relevant information to surface in their answers.

If you understand how it works — and it is not complicated — you can build content where your brand systematically receives more weight. If you ignore it, I am sorry to say, you are leaving the probability of being cited up to chance.

The attention mechanism: how AI weighs words

The Transformer, the architecture that GPT-4, Claude and Gemini run on, has a central component that researchers call self-attention. The concept is surprisingly intuitive.

The survey by Minaee et al. (2025) explains it well:

“By applying self-attention to compute in parallel for every word in a sentence or document an ‘attention score’ to model the influence each word has on another, Transformers allow for much more parallelization than RNNs, which makes it possible to efficiently pre-train very big language models on large amounts of data.”
(Large Language Models: A Survey)

In practice it works like this: for every word in the text (strictly speaking, every token), the model computes a score relative to all the other words. “How relevant is word A to understanding the meaning of word B?” Word pairs with a strong semantic relationship receive high scores and influence each other. Words in irrelevant contexts receive low scores and contribute almost nothing to each other’s meaning.
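
A minimal sketch of that scoring step, assuming tiny hand-made vectors in place of learned embeddings and skipping the learned query and key projections a real Transformer applies. The words and numbers are hypothetical; the only point is that each row of weights is a softmax over scaled dot products, so pairs that point in a similar direction get more weight.

```python
import math

def softmax(values):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def attention_weights(vectors):
    """Scaled dot-product attention weights for every pair of tokens.

    Simplification (assumption): the same vector serves as query and key;
    real models first multiply by learned projection matrices.
    """
    dim = len(vectors[0])
    weights = []
    for query in vectors:
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
                  for key in vectors]
        weights.append(softmax(scores))
    return weights

# Hypothetical toy embeddings: the first two words point in a similar direction.
tokens = ["PastaLine", "extruders", "greetings"]
vectors = [[2.0, 0.1], [1.8, 0.3], [0.1, 2.0]]

weights = attention_weights(vectors)
for token, row in zip(tokens, weights):
    print(token, [round(w, 2) for w in row])
```

In the row for "PastaLine", the weight on "extruders" comes out well above the weight on "greetings", which is the behavior described above in miniature.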

In the same survey, the authors put it even more directly:

“The heart of Transformer is the (self-)attention mechanism, which can capture long-term contextual information much more effectively than the recurrence and convolution mechanisms.”
(Large Language Models: A Survey)

“Long-term contextual information” is the key. The attention mechanism does not only look at nearby words: it also looks at the relationships between words at a distance. If your brand and an industry term appear on the same page, even paragraphs apart, self-attention computes their mutual score, provided both fall within the model’s context window.

Hence the deduction: strong co-occurrences = more weight for your brand

This is an important point and I want to be transparent: what follows is a logical deduction from the documented mechanism, not a fact proven by a specific experiment on brand visibility.

The reasoning is this. If the attention mechanism assigns higher scores to word pairs with a strong semantic relationship, and if your brand systematically appears alongside the key terms of your industry — across different pages, in different contexts, on different sources — then the model builds a dense association. Every time the AI encounters those industry terms in a user’s question, the weight associated with your brand is higher.

If instead your brand appears in generic, isolated contexts or scattered across too many topics, the attention signal is weak. The AI builds no useful association, and when it has to answer a question in your field, it does not put you forward.

From this follows an operational rule: the density of the brand + industry-term co-occurrence matters more than the volume of content.

A test on a real niche

I analyzed two competing companies in the industrial fresh-pasta machinery sector — let’s call them PastaLine and PastaItalia — and put a battery of 40 industry-related queries, reworded in different ways, to the main AI engines (ChatGPT, Perplexity, Gemini).

PastaLine had 25 pages where the brand always appeared alongside “pasta factory machines”, “pasta extruders”, “fresh pasta production lines”, “industrial dough mixer”. Each page created a brand + technical-term co-occurrence. On the IPACK-IMA trade fair site, PastaLine was listed in the “pasta machinery” category. In an industry magazine, an article cited “PastaLine among the leading producers of extruders”.

PastaItalia — which actually had more pages in total — used its site to talk about recipes, trade fairs, sustainability, company news, Christmas greetings too. The brand co-occurred with “pasta” but just as much with “sustainability”, “team building”, “events”. The association was scattered across too many contexts.

Result over 40 queries: PastaLine was cited in 65% of the answers, PastaItalia in 15%. A single query to ChatGPT proves nothing — the models have a stochastic component and every answer can vary. But across a large sample the pattern becomes clear, and in this case it was stark.

The difference was not in the volume of content — it was in the density with which the brand co-occurred with the specific terms of the industry.
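
The tally behind figures like these is easy to reproduce on your own niche. The sketch below assumes you have saved each engine answer as plain text; the `citation_rate` helper and the sample answers are hypothetical and are not the data from the test above. For reference, over 40 answers, 65% corresponds to 26 citations and 15% to 6.

```python
import re

def citation_rate(answers, brand):
    """Share of answers that mention the brand at least once (case-insensitive)."""
    if not answers:
        return 0.0
    pattern = re.compile(r"\b" + re.escape(brand) + r"\b", re.IGNORECASE)
    cited = sum(1 for answer in answers if pattern.search(answer))
    return cited / len(answers)

# Hypothetical answers standing in for saved responses from the AI engines.
answers = [
    "For fresh pasta lines, PastaLine is a frequently mentioned producer.",
    "Several manufacturers build extruders for pasta factories.",
    "PastaLine and PastaItalia both sell industrial dough mixers.",
    "Look for suppliers listed in trade fair catalogues.",
]

for brand in ("PastaLine", "PastaItalia"):
    print(brand, f"{citation_rate(answers, brand):.0%}")
```

Because the models are stochastic, it makes sense to run the same tally over many reworded queries rather than reading anything into a single answer.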

Attention is multi-head: more dimensions, more opportunities

A technical detail with practical implications. Attention in the Transformer is not a single computation: it is multi-head, meaning several attention “heads” run in parallel, each working on its own learned projection of the same text. Each head can capture a different type of relationship: one head might capture syntactic relationships, another semantic relationships, yet another thematic relationships.
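
To make “multi-head” concrete, here is a minimal sketch in which each head only looks at its own slice of the token vectors. This is a simplification I am assuming for illustration: real models give each head its own learned projections rather than a plain slice, and the vectors below are hypothetical.

```python
import math

def softmax(values):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def head_weights(vectors, start, stop):
    """Attention weights computed from one slice (one 'head') of each vector."""
    sliced = [vec[start:stop] for vec in vectors]
    scale = math.sqrt(stop - start)
    return [
        softmax([sum(q * k for q, k in zip(query, key)) / scale for key in sliced])
        for query in sliced
    ]

def multi_head_weights(vectors, num_heads):
    """Split each vector into equal slices and score every slice independently."""
    dim = len(vectors[0])
    if dim % num_heads != 0:
        raise ValueError("vector size must be divisible by the number of heads")
    size = dim // num_heads
    return [head_weights(vectors, h * size, (h + 1) * size) for h in range(num_heads)]

# Hypothetical 4-dimensional vectors: dimensions 0-1 and 2-3 encode different traits.
tokens = ["PastaLine", "extruders", "greetings"]
vectors = [
    [2.0, 0.0, 0.0, 2.0],
    [2.0, 0.0, 2.0, 0.0],
    [0.0, 2.0, 0.0, 2.0],
]

heads = multi_head_weights(vectors, num_heads=2)
for index, head in enumerate(heads):
    # Print how the first token ("PastaLine") distributes its attention in each head.
    print("head", index, [round(w, 2) for w in head[0]])
```

With these toy numbers the first head ties "PastaLine" more closely to "extruders", while the second ties it more closely to "greetings": the same words, two different relationships.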

This means a single dimension of co-occurrence is not enough. If your brand co-occurs with industry terms only in a certain type of context (for example only on commercial pages), the model sees that association in a single form. If it co-occurs in editorial, technical and third-party contexts too, the same association reaches it from different angles, and the overall signal is more robust.

It follows that the co-occurrence strategy must extend beyond your own site: guest posts, press releases, industry directories, citations in technical articles. By the same deduction, each different context gives the attention heads a different kind of relationship to pick up.

The mistakes I see most often

The blog that talks about everything. Every post that associates the brand with an out-of-industry context — company events, greetings, personal reflections — dilutes the attention signal. The blog should reinforce the brand-industry association, not disperse it. A post about team building is useful for employer branding, but for the AI it is noise that weakens the association with your core business.

Directories without context. Being listed on Yellow Pages without an industry description is a mention without context. The brand appears, but alongside what? Nothing specific. For the attention mechanism it is a data point that contributes to no strong association.

The brand only in the logo. The crawlers that feed the AI engines extract mainly text. If your brand appears only in the logo and never in the body text of the paragraphs, for the attention mechanism it does not exist in relation to the content of the page.
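
A quick way to check for this on a page, sketched with Python’s standard `html.parser`. The `brand_in_body_text` helper and the two sample pages are hypothetical, and the check assumes the crawler keeps only text nodes and ignores image attributes such as `alt`; real crawlers may behave differently.

```python
from html.parser import HTMLParser

class VisibleTextParser(HTMLParser):
    """Collects text nodes, skipping script/style content and all tag attributes."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())

def brand_in_body_text(html, brand):
    """True if the brand appears in the page's text, not just in attributes."""
    parser = VisibleTextParser()
    parser.feed(html)
    parser.close()
    text = " ".join(parser.chunks).lower()
    return brand.lower() in text

# Hypothetical pages: in the first, the brand lives only in the logo's alt attribute.
logo_only = '<header><img src="logo.png" alt="PastaLine"></header><p>We build pasta extruders.</p>'
with_text = '<p>PastaLine builds pasta extruders for fresh pasta production lines.</p>'

print(brand_in_body_text(logo_only, "PastaLine"))
print(brand_in_body_text(with_text, "PastaLine"))
```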

What to do concretely

Map the 10-15 key terms of your industry: the ones a customer would use to describe the problem you solve. Not generic keywords — terms specific to your niche.
Check the co-occurrence: on each page where your brand appears, how many of those key terms appear in the same paragraph or the same section? If the brand is in one paragraph and the key terms in another, the co-occurrence is weaker: attention can still link them at a distance, but the association is less explicit than when they sit side by side.
Build co-occurrence clusters: every page of your site should create a dense semantic context. Brand + key term 1 + key term 2 in the same section. Not scattered across the page — close together, in the same block of text.
Extend beyond the site: guest posts, press releases, social media bios, citations in industry directories. They must all associate your brand with the key terms. The AI does not read only your site, and each different context gives the model another angle on the same association.
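
The second and third steps can be turned into a rough audit. The sketch below assumes a page saved as plain text with paragraphs separated by blank lines, and uses simple substring matching; the function names, the sample page and the density measure are my own illustrative choices, not a standard metric.

```python
def cooccurrence_report(page_text, brand, key_terms):
    """For each paragraph that mentions the brand, list the key terms found in it."""
    report = []
    paragraphs = [p.strip() for p in page_text.split("\n\n") if p.strip()]
    for index, paragraph in enumerate(paragraphs):
        lowered = paragraph.lower()
        if brand.lower() not in lowered:
            continue
        found = [term for term in key_terms if term.lower() in lowered]
        report.append({"paragraph": index, "terms": found})
    return report

def cooccurrence_density(report, key_terms):
    """Average share of key terms found per brand-mentioning paragraph (0.0 to 1.0)."""
    if not report or not key_terms:
        return 0.0
    hits = sum(len(entry["terms"]) for entry in report)
    return hits / (len(report) * len(key_terms))

# Hypothetical page and key terms, borrowed from the niche discussed earlier.
page_text = """PastaLine designs pasta extruders and fresh pasta production lines.

Our industrial dough mixer range is built for continuous operation.

PastaLine wishes all its customers happy holidays."""

key_terms = ["pasta extruders", "fresh pasta production lines", "industrial dough mixer"]

report = cooccurrence_report(page_text, "PastaLine", key_terms)
for entry in report:
    print(entry)
print(round(cooccurrence_density(report, key_terms), 2))
```

Here the second paragraph names a key term without the brand and the third names the brand without any key term, so both are missed opportunities: moving the brand next to "industrial dough mixer" would raise the density without adding a single page.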

Attention in the AI visibility chain

Attention is the third link in the chain. Tokenization decides whether your brand is recognized as an entity. Positional encoding decides whether it is “seen” based on where it sits in the text. Attention decides how much weight it receives relative to all the other words in the context. And the context window determines how many words the model can consider in total.

If your brand is well tokenized, well positioned on the page, but does not co-occur with the terms of your industry, the attention mechanism assigns it a low weight. It is like being in the right room but not speaking the language of the others — nobody notices you.

The author

Roberto Serra

SEO consultant for over 15 years, founder of the Serra SEO Agency (RAANK). He helps multinationals and SMEs stay visible where search is moving: ChatGPT, Perplexity, Gemini and Google's AI Overviews.
