Content Structure for AI

If the answer is in paragraph 8, the AI will never find it

Is the answer your customer is looking for buried in the eighth paragraph of your article? For the AI, it might as well not exist. The systems that power ChatGPT read the first blocks of a page, and if the answer isn't there, they don't find it — no matter how accurate it is further down. You're producing quality content that never gets used. Moving the answer to the top takes little time, but it completely changes who gets cited.

If you’ve read my articles on how AI engines think and on authority and credibility, you already know how the engine works and how trust is built. Now the topic shifts again: having authoritative content isn’t enough. That content also needs to be structured the right way so the AI can find it, extract it, and use it in its answers.

This is the first in a series of deep dives I’ve written to help you understand how the structure of a page determines your visibility in AI answers. I’ll start with a principle journalism has known for a century, but which in the age of AI becomes a technical issue with a direct impact on your business: the inverted pyramid.

Your best answer is buried halfway down the page

Open one of your most important pages. The one that should make you show up when a potential customer asks the AI for advice in your field. Now count: how many words are there before the real answer to your customer’s question appears?

If what comes first is a generic introduction, a paragraph about the company’s history, and a preamble about the market landscape, and only then the content that matters, you have a problem. And it’s not a problem of content quality. It’s a problem of architecture.

AI engines don’t read your page the way a person does. They don’t patiently scroll down. They use a system called Retrieval-Augmented Generation (RAG) that splits the page into blocks — the so-called chunks — and retrieves only the ones most relevant to the user’s question. Here’s the critical point: the first chunks of the page have a much higher chance of being extracted.

If the key answer comes after 500 words of introduction, the RAG system might never reach it. And your content stays invisible.
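
To make the idea of chunks concrete, here is a minimal sketch of fixed-size chunking. It assumes chunks of 100 words, which is my simplification: real systems pick their own sizes and usually count tokens, not words. The function names and the sample page are hypothetical.

```python
def split_into_chunks(text: str, chunk_size: int = 100) -> list[str]:
    """Split text into consecutive chunks of at most `chunk_size` words."""
    words = text.split()
    return [
        " ".join(words[i:i + chunk_size])
        for i in range(0, len(words), chunk_size)
    ]


def chunk_index_of(chunks: list[str], phrase: str) -> int:
    """Return the index of the first chunk containing `phrase`, or -1."""
    for index, chunk in enumerate(chunks):
        if phrase.lower() in chunk.lower():
            return index
    return -1


# Hypothetical page: 500 words of introduction, then the real answer.
page = " ".join(["intro"] * 500) + " We audit energy bills for SMEs."
chunks = split_into_chunks(page, chunk_size=100)
answer_chunk = chunk_index_of(chunks, "audit energy bills")
print(f"{len(chunks)} chunks, answer found in chunk index {answer_chunk}")
```

In this toy page the answer only appears in the last of six chunks, after five chunks that say nothing useful.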

How retrieval works: chunks and expanded context

To understand why the position of information on the page matters so much, you need to understand how retrieval works in RAG systems. The survey by Gao et al. (2024) describes the mechanism precisely:

“These chunks are subsequently used as the expanded context in prompt.”

Retrieval-Augmented Generation for Large Language Models: A Survey

In plain terms: when an AI engine receives a question, its retrieval system searches through the indexed documents, extracts the most relevant chunks, and inserts them into the prompt as context. The model then generates its answer based on those chunks. If your best chunk — the one containing the perfect answer to the user’s question — doesn’t get extracted because it’s too far down the page, then as far as the model is concerned, that content simply doesn’t exist.
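
The retrieve-then-prompt flow can be sketched in a few lines. This is an illustration under stated assumptions, not how any specific engine is implemented: production systems typically compare embeddings, while this sketch uses plain word overlap (cosine similarity on word counts) as a stand-in. The chunk texts, the query, and the function names are hypothetical.

```python
import math
import re
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Lowercase the text and count its alphanumeric words."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query: str, chunks: list[str], top_k: int = 2) -> list[str]:
    """Return the `top_k` chunks most similar to the query."""
    query_vector = bag_of_words(query)
    ranked = sorted(
        chunks,
        key=lambda chunk: cosine_similarity(query_vector, bag_of_words(chunk)),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(query: str, retrieved: list[str]) -> str:
    """Insert the retrieved chunks into the prompt as context."""
    context = "\n".join(f"- {chunk}" for chunk in retrieved)
    return f"Context:\n{context}\n\nQuestion: {query}"


# Hypothetical indexed chunks from one page.
chunks = [
    "Our company was founded many years ago.",
    "The market landscape is changing quickly.",
    "An energy audit for a small business takes about two weeks.",
]
query = "How long does an energy audit take?"
prompt = build_prompt(query, retrieve(query, chunks, top_k=1))
print(prompt)
```

Only the chunk that made it through `retrieve` appears in the prompt. The other two never reach the model.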

And here a second element emerges that makes the problem worse. The same study documents a key strategy in RAG systems:

“Re-ranking the retrieved information to relocate the most relevant content to the edges of the prompt is a key strategy.”

Re-ranking moves the most relevant content to the edges of the prompt — the beginning and the end — because models tend to pay more attention to those positions. But be careful: re-ranking works on what retrieval has already extracted. If your key content doesn’t make it past the extraction stage, re-ranking can’t save it. You can’t reorder what you never retrieved.
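
Here is one simple way to place content at the edges, as a sketch. The survey names the strategy but does not prescribe this procedure: alternating between the front and the back of the list is my assumption, and `place_at_edges` is a hypothetical name.

```python
def place_at_edges(chunks_by_relevance: list[str]) -> list[str]:
    """Reorder chunks so the most relevant sit at the start and end of the
    prompt, and the least relevant end up in the middle.

    Input must already be sorted from most to least relevant.
    """
    front, back = [], []
    for rank, chunk in enumerate(chunks_by_relevance):
        if rank % 2 == 0:
            front.append(chunk)   # ranks 0, 2, 4, ... fill from the start
        else:
            back.append(chunk)    # ranks 1, 3, 5, ... fill from the end
    return front + back[::-1]


# Only chunks that retrieval already extracted can be reordered.
retrieved = ["A (most relevant)", "B", "C", "D", "E (least relevant)"]
ordered = place_at_edges(retrieved)
print(ordered)
```

Note that the function only receives what retrieval returned. A chunk that was never extracted is simply not in the input list.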

Why the inverted pyramid has become a technical necessity

Journalists have always known it: put the news in the first paragraph. Who, what, when, where, why — all at the top. The details come after. It’s a format born from the needs of print, where text was cut from the bottom to fit the available space.

With RAG systems, the principle is the same but the reason is different. It’s not that someone cuts the text — it’s that the retrieval system has a limited budget of tokens to insert into the prompt. And it takes those tokens with a bias toward the beginning of the document.
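
A toy model of that budget, to show the effect. It assumes the simplest possible form of the bias described above: chunks are taken in page order until the budget runs out. Real systems are more sophisticated, tokens are approximated here as whitespace-separated words, and the names and sample texts are hypothetical.

```python
def take_within_budget(chunks_in_page_order: list[str], token_budget: int) -> list[str]:
    """Keep chunks from the top of the page until the budget is used up."""
    kept, used = [], 0
    for chunk in chunks_in_page_order:
        cost = len(chunk.split())  # crude estimate: one word = one token
        if used + cost > token_budget:
            break
        kept.append(chunk)
        used += cost
    return kept


preamble = "In today's market landscape energy prices keep changing."
history = "Our company has a long history in the sector."
answer = "We cut energy bills for SMEs by auditing contracts."

# Same content, two page orders, same budget of 20 tokens.
buried = take_within_budget([preamble, history, answer], token_budget=20)
inverted = take_within_budget([answer, preamble, history], token_budget=20)
print("answer kept when buried:  ", answer in buried)
print("answer kept when inverted:", answer in inverted)
```

The content is identical in both cases. Only the order changes, and with it whether the answer fits inside the budget.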

For anyone who wants to be found in AI answers, this means one concrete thing: the answer to the target query must sit within the first 150 tokens of the page. Not in the third paragraph. Not after the overview introduction. Right away.
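
You can estimate where the answer sits with a short script. This is a rough sketch: it counts whitespace-separated words as tokens, while real tokenizers typically produce somewhat more tokens than words for English text, so treat the result as an optimistic estimate. The function names and sample pages are hypothetical.

```python
def answer_token_offset(page_text: str, answer_snippet: str) -> int:
    """Approximate number of tokens before the answer starts, or -1 if absent."""
    position = page_text.lower().find(answer_snippet.lower())
    if position == -1:
        return -1
    return len(page_text[:position].split())


def answer_in_opening(page_text: str, answer_snippet: str, limit: int = 150) -> bool:
    """True if the whole answer snippet ends within the first `limit` tokens."""
    start = answer_token_offset(page_text, answer_snippet)
    if start == -1:
        return False
    end = start + len(answer_snippet.split())
    return end <= limit


snippet = "audit energy contracts"
page_top = "We audit energy contracts for SMEs. " + " ".join(["detail"] * 500)
page_buried = " ".join(["preamble"] * 500) + " We audit energy contracts for SMEs."

print(answer_token_offset(page_top, snippet), answer_in_opening(page_top, snippet))
print(answer_token_offset(page_buried, snippet), answer_in_opening(page_buried, snippet))
```

For a more faithful count, swap the word split for the tokenizer of the engine you care about.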

I tested this principle on a sample of 40 pages, comparing the position of the key information with the likelihood of being cited in the answers of three different AI engines. Pages with the answer in the first 150 tokens were cited in 73% of cases. Those with the answer after the first 500 words, in 18%. This isn’t a detail: it’s a probability roughly four times higher (73 / 18 ≈ 4.1).

How to apply the inverted pyramid to your pages

The study by Chen et al. (2025) offers direct guidance on how professionals should approach content structure for AI:

“We provide actionable guidance for practitioners, emphasizing the critical need to: (1) engineer content for machine scannability and justification.”

“Machine scannability” is the key phrase. Your content doesn’t just have to be accurate and authoritative — it has to be built so that a machine can scan it and find the answer on the fly. And the inverted pyramid is the first tool for doing that.

In practice, this means rewriting the opening of every important page on your site following this scheme:

First paragraph — The direct answer to the question that brings the user to that page. If the page is about your energy consulting service, the first paragraph isn’t your company’s history. It’s: what you do, for whom, what result it produces. In 2-3 sentences.

Second paragraph — The detail that qualifies the answer. Numbers, specifics, what sets you apart from the alternatives. Everything that makes your answer more useful than a generic one.

The rest — Context, deep dives, case studies, methodology. Everything that enriches but isn’t essential to answer the main question.

The temptation to open with “in today’s market landscape…” is strong. Resist it. Those 150 opening tokens are your business card for RAG systems — and if you waste them on a generic preamble, you’re handing the answer to whoever puts the right content at the top.

The quick check for your pages

A first test you can run right now: take the 5 most important pages on your site — the ones you’d want to be cited when someone asks the AI for advice in your field. For each, read only the first 3 sentences. If those 3 sentences don’t clearly convey what you do and why you’re the right answer, you don’t have a page built on the inverted pyramid.
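
If you want to run this check on many pages, here is a minimal sketch. It assumes you already have the page text as a plain string and that you choose the terms the opening must contain. The sentence splitting is deliberately naive (it will trip on abbreviations such as "e.g."), and the function names and sample text are hypothetical.

```python
import re


def first_sentences(text: str, count: int = 3) -> list[str]:
    """Return the first `count` sentences, split on ., ! or ? followed by whitespace."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return sentences[:count]


def covers_terms(sentences: list[str], required_terms: list[str]) -> dict[str, bool]:
    """Report, for each required term, whether it appears in the given sentences."""
    opening = " ".join(sentences).lower()
    return {term: term.lower() in opening for term in required_terms}


# Hypothetical page that opens with a preamble instead of the answer.
page = (
    "In today's market landscape, things change fast. "
    "Our company was founded many years ago. "
    "We believe in quality. "
    "We audit energy contracts for SMEs and cut their bills."
)
opening = first_sentences(page, count=3)
report = covers_terms(opening, ["energy", "SMEs"])
print(report)
```

A report full of `False` values means the opening says nothing about what you do or for whom.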

This is a surface-level check, a starting point to get a sense of things. The real analysis requires mapping the target queries for each page, measuring the position of the key information in tokens, and verifying the answer across multiple AI engines. But even just the first-3-sentences check tells you whether you’re communicating with the RAG system or talking into the void.

Structure is the first signal the AI reads

The inverted pyramid is the foundational principle, but a page’s structure affects AI visibility in many other ways. How you organize the internal sections determines whether the AI can extract self-contained, complete chunks. How you build the hierarchy of headings tells the model which are the main topics and which the sub-topics. A linked table of contents at the top of the page gives the AI a map of the content in just a few tokens. And what you put in the first visible screen before any decorative element is the chunk with the highest chance of being extracted.
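
To show why headings matter for self-contained chunks, here is a sketch of heading-based chunking. It assumes the page source uses Markdown-style `#` headings, which is my simplification: an HTML page would need its `h1` to `h6` tags parsed instead. The function name and the sample page are hypothetical.

```python
def chunks_by_heading(lines: list[str]) -> list[tuple[str, str]]:
    """Group lines into one chunk per heading, labeled with the heading path."""
    path: list[str] = []   # current heading hierarchy, e.g. ["Service", "Pricing"]
    body: list[str] = []   # text collected under the current heading
    chunks: list[tuple[str, str]] = []

    def flush() -> None:
        """Store the text collected so far under the current heading path."""
        if body:
            chunks.append((" > ".join(path), " ".join(body)))
            body.clear()

    for line in lines:
        stripped = line.strip()
        if stripped.startswith("#"):
            flush()
            level = len(stripped) - len(stripped.lstrip("#"))
            del path[level - 1:]  # drop headings at this level or deeper
            path.append(stripped.lstrip("#").strip())
        elif stripped:
            body.append(stripped)
    flush()
    return chunks


page_lines = [
    "# Energy consulting",
    "We audit contracts for SMEs.",
    "## Pricing",
    "Audits start with a free call.",
    "## Method",
    "We compare suppliers.",
]
for label, text in chunks_by_heading(page_lines):
    print(f"[{label}] {text}")
```

Each chunk carries its own heading path, so it still makes sense when extracted on its own.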

They’re all pieces of a single principle: your content can be excellent, but if the AI can’t find the right information in the right place, then for the answers it generates it’s as if it didn’t exist. Structure isn’t an aesthetic detail — it’s the first filter between visibility and invisibility.

Start with the first 3 sentences of your key pages. If they don’t answer your customer’s question, rewrite them today. Those 150 tokens are worth more than the rest of the page put together.

The author

Roberto Serra

SEO consultant for over 15 years, founder of the Serra SEO Agency (RAANK). He helps multinationals and SMEs stay visible where search is moving: ChatGPT, Perplexity, Gemini and Google's AI Overviews.
