Content Structure for AI

Your videos have no chapters? The AI can’t cite the right part

Roberto Serra · 25 June 2026 · ~8 min read

Do you have long videos without chapters? To the AI they are indistinct blocks it can't use: it can't isolate the relevant part and in most cases it ignores them. Every valuable answer you gave in those videos produces zero citations. Adding chapters with descriptive titles takes five minutes per video and turns each section into an independent piece of content the AI can find and use — without redoing anything.

You have a 45-minute video where you explain everything a potential client should know about your service. You published it on YouTube, embedded it on your site, shared it on social media. And when someone asks an AI engine “how does service X work in industry Y”, that video doesn’t show up. Not even a fragment.

The reason isn’t that the content lacks value. The reason is that, to the retrieval system, that video is a monolithic block: a title, a description, maybe an automatic transcript. A monolith is hard to cite, not because the AI ignores it entirely, but because it offers no handholds for extracting the relevant part. If the answer to the user’s query is at minute 23, but the system has no way of knowing that minute 23 covers that specific topic, the entire piece of content gets treated as background noise.

The solution exists and is within everyone’s reach: video chapters with timestamps. Not the decorative ones you occasionally see in YouTube descriptions. Chapters with descriptive titles, paired with a segmented transcript, that turn a long video into a collection of standalone, citable chunks.

Why a video without chapters is content that can’t be broken up

The principle is the same one that applies to any long piece of text content, and RAG research frames it as a question of granularity:

“Choosing the appropriate retrieval granularity during inference can be a simple and effective strategy to improve the retrieval and downstream task performance of dense retrievers.”

Gao et al., 2024

The granularity of the retrieved content directly affects retrieval performance. In plain terms: if the system can choose to retrieve a 300-token block focused on a specific topic instead of a 5,000-token block that covers everything, the precision of the answer tends to improve. And with precision comes a higher probability that your content gets selected and cited.

A video without chapters, from a retrieval standpoint, is like a web page without headings: a wall of text where the system doesn’t know where one topic begins and another ends. YouTube’s automatic transcript produces exactly this — a continuous stream of words with no semantic breakpoints. Even if the crawler indexes it, that monolithic text competes poorly against rival content that is already segmented into precise blocks.

How timestamps become metadata for retrieval

When you add chapters with timestamps to a YouTube video description, you’re doing more than improving navigation for the user. You’re creating structured metadata that indexing systems can read and associate with specific portions of the content.

The survey by Gao et al. states it explicitly:

“Chunks can be enriched with metadata information such as page number, file name, author, category timestamp.”

Gao et al., 2024

Chunks get enriched with metadata — and the timestamp is one of them. It’s not a marginal technical detail. A video chapter with a timestamp and a descriptive title is a chunk with three fundamental properties: a defined beginning and end (the timestamps), a semantic label (the chapter title) and specific content (the corresponding portion of transcript). These are the same properties that make a section with a well-written heading a high-value chunk on a web page.
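
To make the three properties concrete, here is a minimal sketch of a chapter modeled as a chunk. The `ChapterChunk` class and its field names are hypothetical, chosen only for illustration; they are not the format of any specific retrieval system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChapterChunk:
    """One video chapter viewed as a retrievable chunk (hypothetical structure)."""
    start_s: int  # defined beginning, in seconds from the start of the video
    end_s: int    # defined end, in seconds
    title: str    # semantic label: the chapter title
    text: str     # specific content: the matching portion of the transcript

    def label(self) -> str:
        """Render the chunk boundaries and title as one readable line."""
        def fmt(seconds: int) -> str:
            minutes, secs = divmod(seconds, 60)
            return f"{minutes:02d}:{secs:02d}"
        return f"[{fmt(self.start_s)}-{fmt(self.end_s)}] {self.title}"


# Example: the answer that sits "at minute 23" of a long video.
chunk = ChapterChunk(
    start_s=23 * 60,
    end_s=27 * 60 + 30,
    title="How the predictive analytics method works for the retail sector",
    text="(cleaned transcript of this chapter goes here)",
)
print(chunk.label())
```

Without chapters, the same video would be a single record with one generic title and the full transcript as its text.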

The difference is that most websites already have headings on their pages, however much they could be improved. Most videos, on the other hand, have no chapters. This means the gap between those who use them and those who don’t is enormous — and the competitive advantage for whoever acts first is proportional to that gap.

The chapter title is your video heading

Here lies the step many people skip. Adding chapters isn’t enough if the titles are generic. “Introduction”, “Part 2”, “Conclusions” are the video equivalent of those “Learn more” and “Find out more” headings I told you about in the article on title hierarchy in web pages — the AI reads them and finds no information about what that section contains.

The chapter title has to work like a query that the segment answers. Not “Our method” but “How the predictive analytics method works for the retail sector”. Not “Case study” but “How client X cut operating costs by 30% in 6 months”. The principle is identical to that of self-contained sections: the title tells the system what is being discussed, and the system uses that information to decide whether that block is relevant to the user’s query.
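
A quick way to audit existing chapter titles is a small heuristic check like the sketch below. The pattern list and the four-word threshold are my own assumptions for illustration, not a rule used by any AI engine; adjust both to your language and sector.

```python
import re

# Labels that say nothing about the segment's content (illustrative list).
GENERIC_PATTERNS = [
    r"intro(duction)?",
    r"part\s+\d+",
    r"conclusions?",
    r"our method",
    r"case study",
]


def is_generic_title(title: str, min_words: int = 4) -> bool:
    """Return True if a chapter title looks like a generic label."""
    normalized = title.strip().lower()
    if any(re.fullmatch(pattern, normalized) for pattern in GENERIC_PATTERNS):
        return True
    # Assumption: very short titles rarely state what the segment answers.
    return len(normalized.split()) < min_words


for title in ["Part 2", "How client X cut operating costs by 30% in 6 months"]:
    verdict = "generic" if is_generic_title(title) else "descriptive"
    print(f"{title} -> {verdict}")
```

A check like this only catches the obvious cases. Whether a title really works as a query the segment answers still needs a human read.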

I analyzed 25 YouTube channels of Italian B2B companies a few weeks ago, before writing this article. Of those that published videos over 15 minutes long, only 3 used chapters. And of those 3, only one had truly descriptive titles. The other two had variations of “Part 1, Part 2, Part 3”. The field is practically empty.

Segmented transcript: the piece that closes the loop

Chapters alone create the structure. But structure without text content doesn’t generate citable chunks. I covered this in the article on video and podcast transcripts — text is the only currency the retrieval system knows how to spend. The video stays invisible until it gets converted into text.

The difference between a monolithic transcript and a transcript segmented by chapters is the same as the difference between a page without headings and a page with descriptive headings. The monolithic transcript is a single block. The segmented transcript is a series of mini-articles, each tied to a chapter, each with its own topic, each citable independently.

In practice this means taking one step beyond simple automatic transcription. Take the text generated by YouTube or by your transcription service, cut it at the points corresponding to the chapters, clean up each segment by removing filler words and repetitions, and publish it all on the page with headings that mirror the chapter titles. At that point you’ve turned a 45-minute video into 8-10 standalone sections, each with its own descriptive heading, each with 300-500 words of focused content.
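
The cutting step can be sketched in a few lines, assuming your transcription service gives you timed cues as `(start_seconds, text)` pairs. The function name and the input shapes are hypothetical. Removing filler words and repetitions remains a manual editing pass that this sketch does not attempt.

```python
from bisect import bisect_right


def segment_transcript(cues, chapters):
    """Group transcript cues under the chapter that contains their start time.

    cues:     list of (start_seconds, text), in any order
    chapters: list of (start_seconds, title), sorted by time, first at 0
    Returns a list of (title, joined_text) in chapter order.
    """
    starts = [start for start, _ in chapters]
    buckets = [[] for _ in chapters]
    for cue_start, text in sorted(cues):
        # Index of the last chapter that starts at or before this cue.
        index = bisect_right(starts, cue_start) - 1
        buckets[max(index, 0)].append(text.strip())
    return [(title, " ".join(parts)) for (_, title), parts in zip(chapters, buckets)]


chapters = [
    (0, "What service X is and who it is for"),
    (95, "How onboarding works in the first 30 days"),
]
cues = [
    (0, "Welcome."),
    (40, "Service X is for retailers."),
    (95, "Onboarding starts with an audit."),
    (130, "Then we set targets."),
]
for title, text in segment_transcript(cues, chapters):
    print(title)
    print(text)
    print()
```

Each `(title, text)` pair then becomes one section on the page: the title as the heading, the cleaned text as the body.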

Why the timestamp adds an extra signal

There’s an aspect of timestamps that goes beyond simple segmentation. Timestamps carry temporal information that more advanced retrieval systems can exploit:

“Assigning different weights to document timestamps during retrieval can achieve time-aware RAG, ensuring the freshness of knowledge and avoiding outdated information.”

Gao et al., 2024

In the survey, the timestamp is the document’s date: weighting it lets a system favor fresh content and avoid outdated information. Applied to video: a chapter that carries both its position in the video and the video’s publication date tells the system when that content was created. In a sector where information changes rapidly, this signal can make the difference between being cited and being discarded in favor of more recent content.
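
The survey does not prescribe a formula for this weighting. As a purely illustrative sketch, one common way to express the idea is an exponential decay with a half-life: the relevance score is halved every `half_life_days`. The function name, the one-year half-life and the similarity values below are all assumptions, not how any specific AI engine ranks content.

```python
from datetime import date


def time_aware_score(similarity: float, published: date, today: date,
                     half_life_days: float = 365.0) -> float:
    """Down-weight a similarity score by content age (illustrative formula)."""
    age_days = max((today - published).days, 0)
    decay = 0.5 ** (age_days / half_life_days)  # 1.0 when new, 0.5 after one half-life
    return similarity * decay


today = date(2026, 6, 25)
older = time_aware_score(0.80, published=date(2025, 6, 25), today=today)
newer = time_aware_score(0.70, published=today, today=today)
print(f"older chapter: {older:.2f}, newer chapter: {newer:.2f}")
```

Under these assumptions a slightly less similar but fresh chapter outranks a closer match that is a year old, which is the effect the quote describes.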

The beauty is that every time you update a video, or publish a new one with updated chapters on the same themes, you’re telling the system: “this is the most recent version of my answer on this topic”. It’s a mechanism that the static text of a blog post doesn’t have: a video with updated chapters combines content freshness with precise segmentation.

What to do with your next videos

Take the next video you publish, or the most recent one if you don’t have any coming up soon. Look at the content and identify the 5-8 moments where the topic changes. For each one, write a title that is a specific answer or question, not a generic label. Then add the timestamps to the YouTube description in the format that activates the platform’s native chapters (00:00 for the first chapter, then one timestamp for each topic change).
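
If you keep your chapter list in a spreadsheet or script, a small helper can produce the description lines and catch the usual slips. The checks below reflect the conditions YouTube's help pages list for chapters as far as I know (first timestamp at 00:00, at least three timestamps in ascending order, chapters of at least 10 seconds). Treat them as an assumption and verify against the current documentation. The function name and the sample titles are hypothetical.

```python
def format_chapters(chapters):
    """Return YouTube-description lines for (start_seconds, title) pairs."""
    starts = [start for start, _ in chapters]
    if len(chapters) < 3:
        raise ValueError("need at least three chapters")
    if starts[0] != 0:
        raise ValueError("first chapter must start at 00:00")
    # Gaps between consecutive starts; the last chapter cannot be checked
    # here because the total video duration is not known.
    if any(later - earlier < 10 for earlier, later in zip(starts, starts[1:])):
        raise ValueError("chapters must be ascending and at least 10 seconds long")
    lines = []
    for start, title in chapters:
        hours, rest = divmod(start, 3600)
        minutes, seconds = divmod(rest, 60)
        if hours:
            stamp = f"{hours}:{minutes:02d}:{seconds:02d}"
        else:
            stamp = f"{minutes:02d}:{seconds:02d}"
        lines.append(f"{stamp} {title}")
    return lines


description_lines = format_chapters([
    (0, "What service X is and who it is for"),
    (95, "How onboarding works in the first 30 days"),
    (1380, "How the predictive analytics method works for the retail sector"),
])
print("\n".join(description_lines))
```

Paste the printed lines into the video description, one chapter per line.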

Do the same with the transcript: cut it at the chapter boundaries, clean up each block, and publish it on the page of your site where the video is embedded. Give each section a heading that mirrors the chapter title. As I explained when discussing infographics with parallel text and informative captions, every non-textual element needs its anchor in text. For videos, that anchor is the transcript segmented by chapters.

This is a first step you can take on your own. A systematic strategy (optimizing chapter titles for retrieval, VideoObject schema markup with the segments, integration with the overall structure of the site) requires a big-picture view and tools that analyze how your video content is actually processed by AI crawlers. But descriptive chapters and a segmented transcript are already enough to turn invisible content into a collection of chunks that the AI can find, evaluate and cite individually.
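
To give an idea of what VideoObject markup with segments can look like, here is a hedged sketch that builds the JSON-LD from the same chapter list, using the schema.org types `VideoObject` and `Clip` with `hasPart`, `startOffset` and `endOffset`. The helper name, the example URL, the sample values and the `?t=` deep-link pattern are assumptions. A production snippet would also need the other properties search engines ask for, such as a thumbnail and a description.

```python
import json


def video_jsonld(name, upload_date, page_url, chapters, duration_s):
    """Build a VideoObject dict with one Clip per chapter."""
    starts = [start for start, _ in chapters]
    # Each clip ends where the next one starts; the last ends with the video.
    ends = starts[1:] + [duration_s]
    clips = [
        {
            "@type": "Clip",
            "name": title,
            "startOffset": start,
            "endOffset": end,
            # Assumed deep-link pattern; adapt it to how your player seeks.
            "url": f"{page_url}?t={start}",
        }
        for (start, title), end in zip(chapters, ends)
    ]
    return {
        "@context": "https://schema.org",
        "@type": "VideoObject",
        "name": name,
        "uploadDate": upload_date,
        "hasPart": clips,
    }


markup = video_jsonld(
    name="Everything a client should know about service X",  # placeholder
    upload_date="2026-01-15",                                 # placeholder
    page_url="https://example.com/videos/service-x",          # placeholder
    chapters=[
        (0, "What service X is and who it is for"),
        (95, "How onboarding works in the first 30 days"),
    ],
    duration_s=2700,
)
print(json.dumps(markup, indent=2))
```

The printed JSON goes inside a `script` tag of type `application/ld+json` on the page where the video is embedded.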

And since almost no one does it, the advantage goes entirely to whoever starts now.

The author

Roberto Serra at the Senate of the Republic

Senate of the Republic · Palazzo Giustiniani Conference “The power of artificial intelligence”

Roberto Serra

SEO consultant for over 15 years, founder of the Serra SEO Agency (RAANK). He helps multinationals and SMEs stay visible where search is moving: ChatGPT, Perplexity, Gemini and Google's AI Overviews.
