Is the diagram of your process, the one that explains how you work better than any text, an image? Then to AI it's completely invisible: nodes, arrows and relationships don't exist unless they're written in words. Every client who asks AI how your process works gets the answer from the competitor who wrote the same thing as text. You don't need to redraw anything: just add a written version next to the image.
Flowcharts, process diagrams, concept maps. You use them to explain how your service works, to show the customer journey, to make visual a flow that in words would take three paragraphs. And they do their job, for whoever looks at them. The problem is that AI engines don’t look at them.
A diagram is an image: a PNG, an SVG file, a canvas generated by JavaScript. For the crawler that feeds ChatGPT, Perplexity or Gemini, that content doesn’t exist. It’s not a temporary limitation that will be fixed with the next update. It’s a direct consequence of how retrieval works: RAG systems extract text, not pixels.
And here lies the point many underestimate: any information that lives only inside a diagram is information that cannot be cited.
Text is still the primary source for retrieval
When an AI engine has to answer a user’s question, its retrieval system searches the indexed content for the most relevant items. And what does it index? Text. The survey by Gao et al. (2024) on RAG systems documents this without ambiguity:
“Unstructured Data, such as text, is the most widely used retrieval source.”
(Retrieval-Augmented Generation for Large Language Models: A Survey)
In plain words: the source retrieval systems draw on most is text. Not images, not diagrams, not infographics. If your process flow lives only in a flowchart, for the retrieval system that flow doesn’t exist, no matter how clear or elegant it is to the human eye.
This doesn’t mean diagrams are useless. It means that on their own they’re not enough. Every diagram needs a textual companion that reports its data, logic, structure. And that text must be written with a precise logic, not as a generic caption.
What happens when you turn a diagram into structured text
Let’s take a concrete example. You have a flowchart that shows your service’s onboarding path: initial contact, analysis, proposal, contract, kickoff, first report. Six steps connected by arrows, maybe with some conditional decision in between. For whoever looks at it, everything is clear at a glance.
For AI, that page contains only the text around the diagram. If below the flowchart it says “Here’s our onboarding process” and nothing else, the AI engine only knows that an onboarding process exists. It doesn’t know what the steps are, it doesn’t know in what order they happen, it doesn’t know what happens at each step.
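To make this concrete, here is a minimal sketch of what a text-only extractor sees on such a page. It is a simplification I'm assuming for illustration, not how any specific crawler works: the `VisibleText` class and the page fragment are hypothetical, and the extractor ignores tag attributes entirely.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collects text nodes only; tags such as <img> contribute nothing."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        # Keep only non-empty text found between tags.
        if data.strip():
            self.chunks.append(data.strip())


def extract_text(html: str) -> str:
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# Hypothetical page fragment: a flowchart image plus a generic caption.
page = """
<h2>Onboarding</h2>
<img src="onboarding-flowchart.png">
<p>Here's our onboarding process</p>
"""

extracted = extract_text(page)
print(extracted)
```

The only words that survive are the heading and the caption. None of the six steps drawn in the image reach the extracted text.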
Now imagine adding below the diagram a structured text version:
Onboarding process:
1. Initial contact: we gather your needs and the project goals
2. Preliminary analysis: we assess the current situation with a technical audit
3. Operational proposal: we present the action plan with timelines and costs
4. Contract and setup: we formalize the agreement and configure the tools
5. Kickoff: launch session with the dedicated team
6. First report: after 30 days, the first document with results and next steps
This block of text replicates exactly the information of the diagram, but in a format that the crawler can index, the retrieval system can match and the model can cite. When someone asks “how does [your company]’s onboarding work?”, the AI engine has six precise points from which to build the answer.
The principle: make readable what is opaque
In the research world, this transformation has a precise name. Volpini (2026), in the paper on Wikidata Enhanced Pages, describes the mechanism:
“Enhanced pages transform opaque entity URIs into readable, structured information.”
(Structured Linked Data as a Memory Layer for Agent-Orchestrated Retrieval)
The principle applies identically to your diagrams. A flowchart is opaque information — the crawler can’t read it. The structured text version turns that opaque information into readable content, with a structure that the retrieval system recognizes and can process.
And the impact is not marginal. Volpini himself documents that pages with information made readable and structured produce measurable results:
“Enhanced pages exposed 2.4x more discoverable links.”
(Structured Linked Data as a Memory Layer for Agent-Orchestrated Retrieval)
2.4 times more discoverable links. By analogy, in your case: making textual and structured what today is only visual should increase the chances that the content gets found and cited.
A caption isn’t enough: you need a description that replicates the data
The most frequent mistake I encounter when analyzing sites with visual content is the generic caption. Below the diagram it says “Our process in 6 steps” or “Customer journey flowchart”. These phrases contain none of the data the diagram shows. The AI engine reads the caption, understands that a process exists, but doesn’t know what it contains. It’s a page that promises information without delivering it.
The text version must do the opposite job: report every piece of data, every step, every relationship that the diagram communicates visually. If your diagram shows that after the analysis there are two possible paths — standard project or custom project — the text description must make both paths explicit. If the flowchart indicates that the first report arrives after 30 days, that “30 days” must be in the text.
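One rough way to check this is to list the labels you can read off the diagram and verify that each one appears in the accompanying text. The sketch below assumes you transcribe the labels by hand; the function name `missing_from_text` and the sample strings are hypothetical, and plain substring matching is only a first approximation.

```python
def missing_from_text(diagram_labels, description):
    """Return the diagram labels that never appear in the description."""
    text = description.lower()
    return [label for label in diagram_labels if label.lower() not in text]


# Hypothetical labels read off the flowchart by hand.
labels = ["Preliminary analysis", "standard project", "custom project", "30 days"]

generic_caption = "Our process in 6 steps"
full_description = (
    "After the preliminary analysis there are two possible paths: "
    "standard project or custom project. "
    "The first report arrives after 30 days."
)

# Every label is missing from the generic caption.
print(missing_from_text(labels, generic_caption))
# Nothing is missing from the complete description.
print(missing_from_text(labels, full_description))
```

If the first list is not empty, the text is still promising information the diagram holds alone.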
I tested this principle on a sample of 30 pages with process diagrams, running targeted queries on three different AI engines. Pages with only a generic caption were cited in 12% of cases. Pages with a complete text description that replicated the diagram’s data rose to 54%. The diagram was identical in both cases: the only thing that changed was the text accompanying it.
How to structure the text version of a diagram
There’s no single format, because it depends on the type of diagram. But there are principles that always work.
For flowcharts and sequential processes, use an ordered list. Each point must contain the name of the step and an operational description of what happens. If there are branches, make them explicit with sub-points. The goal is that someone can reconstruct the diagram by reading only the text.
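As an illustration, here is a small sketch that turns a step list into an HTML ordered list with branches as sub-points. The `render_steps` helper and the tuple layout are my own assumptions, and the two branches under the analysis step are illustrative placeholders, not a prescribed structure.

```python
from html import escape


def render_steps(steps):
    """Render (name, description, branches) tuples as a nested HTML list."""
    lines = ["<ol>"]
    for name, description, branches in steps:
        lines.append(f"  <li><strong>{escape(name)}</strong>: {escape(description)}")
        if branches:
            # Branches become explicit sub-points under the step they leave from.
            lines.append("    <ul>")
            for branch in branches:
                lines.append(f"      <li>{escape(branch)}</li>")
            lines.append("    </ul>")
        lines.append("  </li>")
    lines.append("</ol>")
    return "\n".join(lines)


# First three onboarding steps; the branches are illustrative.
steps = [
    ("Initial contact", "we gather your needs and the project goals", []),
    (
        "Preliminary analysis",
        "we assess the current situation with a technical audit",
        ["Standard project", "Custom project"],
    ),
    ("Operational proposal", "we present the action plan with timelines and costs", []),
]

steps_html = render_steps(steps)
print(steps_html)
```

Keeping name, description and branches in one record per step makes it harder to publish a list that silently drops a path the diagram shows.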
For concept maps and relational diagrams, short paragraphs with a heading for each main node work better. The relationships between nodes — which in the diagram are arrows or lines — become explicit sentences: “Module A feeds module B, which in turn generates report C.”
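A sketch of that translation, assuming you can write each arrow down as a (source, verb, target) triple. The helper name and the triple format are hypothetical.

```python
def relations_to_sentences(edges):
    """Turn (source, verb, target) triples into one sentence per arrow."""
    return [f"{source} {verb} {target}." for source, verb, target in edges]


# Each arrow of the diagram, written as a triple.
edges = [
    ("Module A", "feeds", "module B"),
    ("Module B", "generates", "report C"),
]

sentences = relations_to_sentences(edges)
for sentence in sentences:
    print(sentence)
```

One sentence per arrow is deliberately blunt. You can merge sentences afterwards, as long as every relationship remains stated.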
For diagrams with numeric data — timelines with milestones, funnels with percentages, org charts with roles — the best structure is an HTML table or a list where each entry contains the numeric value in plain text. I talked about this in the article on HTML tables: tabular data is a format AI processes easily, and for certain diagrams it’s the most natural text translation.
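For example, a funnel could be rendered as a plain HTML table along these lines. This is a sketch: `render_table` is a hypothetical helper, and the stage names and percentages are placeholder values I made up for the example, not measured data.

```python
from html import escape


def render_table(headers, rows):
    """Build a simple HTML table with every value written as plain text."""
    head = "".join(f"<th>{escape(h)}</th>" for h in headers)
    body = "\n".join(
        "  <tr>" + "".join(f"<td>{escape(str(cell))}</td>" for cell in row) + "</tr>"
        for row in rows
    )
    return f"<table>\n  <tr>{head}</tr>\n{body}\n</table>"


# Placeholder funnel: stage names and percentages are invented for illustration.
funnel_rows = [
    ("Visit", "100%"),
    ("Quote request", "20%"),
    ("Signed contract", "5%"),
]

funnel_table = render_table(["Stage", "Share of visitors"], funnel_rows)
print(funnel_table)
```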
In all cases, the text version must stay as close as possible to the diagram on the page. A heading that previews the content — “The 6 steps of the onboarding process” — followed by the list or the structured text, right below or next to the image. This way the chunk that the retrieval system extracts contains both the semantic context and the data.
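A possible page layout, sketched as a small generator: heading first, then the image, then the list, all inside one container. The `diagram_block` helper, the file name and the alt text are hypothetical, and whether a given retrieval system keeps this block together as one chunk is an assumption, not a guarantee.

```python
from html import escape


def diagram_block(heading, image_src, alt, steps):
    """Place heading, image and text version together in one section."""
    items = "\n".join(f"    <li>{escape(step)}</li>" for step in steps)
    return (
        "<section>\n"
        f"  <h2>{escape(heading)}</h2>\n"
        f'  <img src="{escape(image_src)}" alt="{escape(alt)}">\n'
        "  <ol>\n"
        f"{items}\n"
        "  </ol>\n"
        "</section>"
    )


block = diagram_block(
    "The 6 steps of the onboarding process",
    "onboarding-flowchart.png",  # hypothetical file name
    "Flowchart of the onboarding process",  # hypothetical alt text
    [
        "Initial contact",
        "Preliminary analysis",
        "Operational proposal",
        "Contract and setup",
        "Kickoff",
        "First report",
    ],
)
print(block)
```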
The thread that connects all multimodal content
If you’ve read my articles on alt text and transcripts, you’ll recognize the pattern. Every non-text format (images, video, audio, and now diagrams) needs a text equivalent to exist in retrieval. It’s not only a matter of accessibility, even though accessibility benefits from it. It’s a matter of visibility: if the content isn’t text, for AI it doesn’t exist.
Captions add semantic context to an image. Infographics require a text version with all the data. Diagrams follow the same logic, with one specificity: they contain relationships and sequences that the text description must preserve, not just list.
The operational principle is simple and applies to every diagram on your site: if the data lives only in the image, it can’t be cited. If you replicate it in structured text — with the sequence, the names, the numbers, the relationships — the AI engine has everything it needs to build an answer and attribute it to you. It’s the difference between having a documented process that nobody finds and having content that works for your visibility every time someone asks a relevant question.