Content Structure for AI

Do your infographics have alt text like ‘sales chart’? To AI, they don’t exist

Roberto Serra · 25 June 2026 · ~8 min read

You've published infographics with market data, comparisons, and original results, but if the image's alternative text only says 'sales chart', that data doesn't exist for AI. The model doesn't see images: it reads only the text you've attached to them. Your best data sits inside the image, and the one field AI does read says nothing about it. Rewriting those alternative texts for an entire page takes twenty minutes, and it turns every image into content that models can finally cite.

There’s an uncomfortable truth about your site’s images: AI doesn’t see them. It doesn’t interpret them, doesn’t analyze them, doesn’t extract data from them. When a RAG system processes your page, images are holes. Empty spaces. The only thing the model reads is the alternative text — that alt attribute that many people fill in with “company logo” or “results chart” and treat as an accessibility obligation to get out of the way quickly.

The point is that this attribute isn’t an obligation to get out of the way. It’s content. It’s the only bridge between what the image shows and what AI manages to understand. If the bridge is empty or generic, to the model the image simply doesn’t exist — and with it, the data, relationships, and evidence you worked hard to produce disappear.

Why alt text is content, not decoration

When I talk about alt text with my clients, the most frequent reaction is “yes, we add it for accessibility.” And that’s true: it was born as a tool for screen readers, to allow those who can’t see the image to understand its content. But the mechanism that makes it useful for a screen reader is the same one that makes it useful for an AI crawler: both read text, not pixels.

A RAG system converts your page into flat text. Images are replaced by their alt text — or by nothing, if the alt text is missing. This means that an infographic with five key data points, if it has “data infographic” as alt text, contains zero information for AI. Those five data points don’t exist in the text the model processes.
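As a rough illustration of that flattening step, here is a minimal sketch using Python's standard html.parser. It assumes a text-only pipeline that keeps visible text and substitutes each image with its alt attribute; real RAG systems differ in the details, and the names FlatTextExtractor and flatten are hypothetical.

```python
from html.parser import HTMLParser


class FlatTextExtractor(HTMLParser):
    """Collects visible text; an <img> contributes only its alt text."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            # A missing or empty alt leaves nothing behind for the image.
            alt = (dict(attrs).get("alt") or "").strip()
            if alt:
                self.parts.append(alt)

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.parts.append(text)


def flatten(html: str) -> str:
    """Return the page as the flat text a text-only pipeline would see (assumed behavior)."""
    parser = FlatTextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


if __name__ == "__main__":
    missing = '<p>Q4 results.</p><img src="chart.png">'
    generic = '<p>Q4 results.</p><img src="chart.png" alt="data infographic">'
    print(flatten(missing))   # Q4 results.
    print(flatten(generic))   # Q4 results. data infographic
```

In both outputs the five data points of the infographic are absent: the flat text contains only what the alt attribute said.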

In the research world, the principle is clear. Volpini et al. (2026) describe what happens when a page’s content is made explicitly readable:

“Enhanced pages transform opaque entity URIs into readable, structured information.”

Volpini et al., 2026

“Readable, structured information” — that’s exactly what well-written alt text does for an image. It transforms something opaque (the image, which AI can’t decode) into readable, structured information. Without this transformation, the visual content of your page stays opaque — and opaque, for a retrieval system, means nonexistent.

The problem of poorly formatted documents

You might think AI compensates somehow, that the model understands from context what’s in the image. It doesn’t work that way. Volpini et al. also report a finding that changes the perspective:

“The agent provides substantial lift on poorly formatted documents.”

Volpini et al., 2026

Translated into practice: the AI agent has to work harder — and produces less reliable results — when the document isn’t well formatted. A missing or generic alt text is poor formatting. The system doesn’t have enough context to understand what the image adds to the discussion, and the result is that it ignores that part of the page or interprets it loosely.

On the other hand, when the document is already optimized — descriptive alt text, clear semantic structure, explicit content — the agent reads without having to compensate. It’s not a marginal advantage: it’s the difference between content that gets extracted accurately and content that gets discarded.

How to write alt text that AI can cite

Here we get to the practical part. Most of the alt text I see on the sites I analyze falls into three types: empty (no alt attribute, or alt=""), generic (“product image”, “chart”, “team photo”), or keyword-stuffed (“best-seo-agency-milan-digital-consulting”). None of these works for AI. The first two contain no useful information. (An empty alt="" is the right choice for a purely decorative image, but not for one that carries data.) The third contains keywords but describes nothing: a model reading “best-seo-agency-milan” learns nothing about what the image shows, only that someone crammed keywords where they didn’t belong.

Alt text that works is a complete sentence describing what the image shows in the context of the article. It’s not a keyword, it’s not a label — it’s an informative description that adds real content to the page.
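The three failing types and the working one can be told apart with rough heuristics. The sketch below is only that: the list of generic labels, the five-word threshold, and the function name classify_alt are assumptions made for illustration, not an established rule.

```python
import re

# Hypothetical list of generic labels; extend it for your own site.
GENERIC_LABELS = {
    "image", "photo", "chart", "logo", "product image", "team photo",
    "company logo", "results chart", "sales chart", "traffic chart",
    "data infographic",
}

# Assumed threshold: fewer words than this rarely form a descriptive sentence.
MIN_WORDS = 5


def classify_alt(alt):
    """Return 'empty', 'generic', 'keyword-stuffed' or 'descriptive'."""
    if alt is None or not alt.strip():
        return "empty"
    text = alt.strip()
    # One token glued together with hyphens or underscores looks like a keyword slug.
    if " " not in text and len(re.split(r"[-_]", text)) >= 3:
        return "keyword-stuffed"
    if text.lower() in GENERIC_LABELS or len(text.split()) < MIN_WORDS:
        return "generic"
    return "descriptive"


if __name__ == "__main__":
    samples = [
        "",
        "team photo",
        "best-seo-agency-milan-digital-consulting",
        "Bar chart showing organic traffic growth from 12,000 to 34,000 "
        "monthly sessions between Q1 and Q4 2025",
    ]
    for sample in samples:
        print(classify_alt(sample), "<-", repr(sample))
```

A word count cannot judge whether a sentence is accurate or relevant to the article, so anything labeled "descriptive" still deserves a human read.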

A few concrete examples. A bar chart showing organic traffic trends quarter by quarter: the alt text isn’t “traffic chart”, it’s “Bar chart showing organic traffic growth from 12,000 to 34,000 monthly sessions between Q1 and Q4 2025”. A screenshot of an AI result citing your brand: it’s not “ChatGPT screenshot”, it’s “Screenshot of an AI model’s response to the query best SEO consultants in Italy, where brand X appears as the top recommendation”.

The difference is enormous. With the descriptive version, AI sees an informative sentence with specific data: numbers, a trend, a time period. It can extract it, cite it, and use it as evidence in an answer. With the generic version, it sees two words that add nothing to the page’s context. And in a system where the selection of content to cite depends on the information density of each chunk, a generic or empty alt text is a wasted opportunity.

Chen et al. in 2025 summed up this principle directly:

“We provide actionable guidance for practitioners, emphasizing the critical need to: (1) engineer content for machine scannability.”

Chen et al., 2025

“Engineer content for machine scannability”: alt text is one of the most immediate tools for doing this. It doesn’t require restructuring the site, and it doesn’t require development skills. It requires rewriting one line of text for each image. But that line of text is the difference between an image that exists for AI and one that doesn’t.

The test you can run in five minutes

Open the five most important pages of your site. Right-click, inspect element, look for the alt attributes on the images. Count how many are empty, how many only say “image” or the file name, how many contain a complete sentence describing the content.
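If you would rather not count by hand, the same tally can be sketched in a few lines of Python. This assumes you have saved each page's HTML source yourself; the bucket names, the five-word cutoff for a "sentence", and the names AltAudit and audit are hypothetical choices for this example.

```python
from collections import Counter
from html.parser import HTMLParser
from pathlib import PurePosixPath
from urllib.parse import urlparse


class AltAudit(HTMLParser):
    """Tallies <img> alt attributes into rough buckets."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        alt = (attributes.get("alt") or "").strip()
        # Take the file name from the src path, ignoring any query string.
        path = PurePosixPath(urlparse(attributes.get("src") or "").path)
        if not alt:
            self.counts["empty"] += 1          # missing or empty alt
        elif alt.lower() in {"image", path.name.lower(), path.stem.lower()}:
            self.counts["placeholder"] += 1    # "image" or the file name
        elif len(alt.split()) >= 5:            # assumed cutoff for a sentence
            self.counts["sentence"] += 1
        else:
            self.counts["short"] += 1          # short label such as "team photo"


def audit(html: str) -> Counter:
    """Return the bucket counts for one page of HTML."""
    parser = AltAudit()
    parser.feed(html)
    parser.close()
    return parser.counts


if __name__ == "__main__":
    page = (
        '<img src="/img/logo.png">'
        '<img src="/img/chart-q4.png" alt="chart-q4.png">'
        '<img src="/img/team.jpg" alt="team photo">'
    )
    print(dict(audit(page)))
```

The "short" bucket is an addition to the three categories in the manual test: it catches labels that are neither empty nor a file name but are still too brief to describe anything.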

If most fall into the first two categories, you’re losing citable content on every page. It’s not only an accessibility problem; it’s a visibility problem. AI is processing your pages and skipping every image because it has no way to understand what they show.

A second check: take the alt text of each image and read it out of context, as if you couldn’t see the image. Does it make sense? Does it describe something specific? If you read only that, would you understand what’s in the image and what information it adds to the article? If the answer is no, that alt text isn’t working for accessibility or for AI. And if it doesn’t work for a screen reader, it won’t work for a crawler either — the reading mechanism is the same.

This is a first step to identify where you’re losing ground. But systematically rewriting alt text on a site with hundreds of images, each with its own editorial context and its own role on the page, is work that requires method and an overall vision — a formula isn’t enough.

The thread with other multimodal content

Alt text is the starting point, but the problem is broader. Every non-textual piece of content on your site (videos, podcasts, infographics, diagrams) has the same fundamental limit: AI doesn’t see it unless you turn it into text.

I cover each of these in the deep dives on multimodal content. Video and podcast transcripts solve the same problem for audio and video content. Infographics with parallel text address the case of complex visuals that need a dedicated text version. Informative captions show you how even a line beneath an image can become a citable micro-chunk. And diagrams as structured text close the loop on flows and org charts that today are probably invisible to retrieval.

The underlying principle is always the same: anything that isn’t text doesn’t exist for AI. And at a time when visibility in AI answers depends on the quantity and quality of textual content the system manages to process, leaving images without a description is like publishing an article with censored paragraphs.

Alt text is the first action — the simplest, the most immediate — to start making visible what is today a black hole in your pages. It doesn’t require a redesign, it doesn’t require technical migrations. It requires looking at each image and asking yourself: if someone couldn’t see it, what would they need to know? The answer to that question is your alt text. And every image you make readable is one more piece of content that can be found, extracted, and cited when someone asks the right question of the AI engine.

The author

Roberto Serra at the Senate of the Republic

Senate of the Republic · Palazzo Giustiniani Conference “The power of artificial intelligence”

Roberto Serra

SEO consultant for over 15 years, founder of the Serra SEO Agency (RAANK). He helps multinationals and SMEs stay visible where search is moving: ChatGPT, Perplexity, Gemini and Google's AI Overviews.
