How AI engines think

Writing Too Complex? AI Struggles More to Use Your Content

Roberto Serra · 25 June 2026 · ~7 min read

You've written technical, in-depth content about your industry, yet a competitor who writes more simply beats you in AI answers even when the information is the same. When an AI engine has to choose between two equivalent sources, it tends to prefer the clearer and more direct one: this is a measurable technical advantage, not a judgment about quality. You're losing visibility not because you know less, but because you write in a way that AI struggles to process. Simplifying your existing texts is a quick fix, and it can substantially change how often you get cited.

There’s a metric that AI uses, often without you knowing anything about it, to gauge how easy it is to work with your text. It’s called perplexity. And if you write in a convoluted way, with long sentences, unnecessary jargon and chaotic structure, you’re unintentionally signaling to the AI that your content is hard to process.

The consequence is concrete: when two sources say the same thing, the one with lower perplexity — that is, the clearer, more direct, more predictable one — is preferred. It’s not a matter of style. It’s a measurable technical advantage.

This is the first of my articles dedicated to AI metrics. Before talking about log-probability, BLEU/ROUGE or Citation Accuracy, it makes sense to start here: with how language models evaluate and measure the text they encounter. Because understanding this mechanism changes the way you write — and therefore how you get cited.

The context: how AI searches for and evaluates sources

The search landscape has changed in a structural way. As a recent study documents, “the rapid adoption of generative AI-powered search engines like ChatGPT, Perplexity, and Gemini is fundamentally reshaping information retrieval” (Mahe Chen et al., 2025).

This is not an interface change. It’s a change in the way information is selected, weighted and returned. AI engines don’t return links: they build answers by synthesizing sources. And to do this they have to evaluate each source across multiple dimensions — including how easy it is to extract it, rephrase it and use it without introducing errors.

Perplexity is one of those dimensions.

The starting point, then, is not “how do I please the algorithm.” It’s: is my content written in a way that a model can use it effortlessly? Because if the answer is no, the model will use someone else’s content. Your brand doesn’t appear, doesn’t get cited, doesn’t exist in the answer the user reads. And this happens regardless of how accurate, complete or authoritative what you’ve written is.

What perplexity is and what it really measures

In language modeling, perplexity measures how “surprised” a model is by a sequence of text. As the technical literature specifies, “from a statistical perspective, Perplexity is inversely proportional to the likelihood function” (Zhou et al., 2024). The higher the probability the model assigns to the text, the lower the perplexity, and vice versa.
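
For readers who want the formula, this is the standard textbook definition, written here as a reference sketch. The notation is my own assumption for illustration: a text of N tokens x_1 … x_N, and p as the probability the model assigns to each token given the ones before it.

```latex
\mathrm{PPL}(x_1,\dots,x_N)
  = \exp\!\left(-\frac{1}{N}\sum_{i=1}^{N}\log p(x_i \mid x_1,\dots,x_{i-1})\right)
  = p(x_1,\dots,x_N)^{-1/N}
```

The second form shows the inverse relationship directly: as the probability of the whole text goes up, perplexity goes down.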

In practice: a text with a predictable structure, clear sentences and a linear logical progression has low perplexity. The model “expects it,” in the sense that each word follows from the previous ones with high probability. A text with logical leaps, obscure jargon, nested subordinate clauses and an incoherent flow of discourse has high perplexity: the model is continually surprised and has to do more work to build a coherent internal representation.

Let me give a direct example. “Tokenization converts text into numerical units” has low perplexity. “The tokenizing process transforms the textual input, through subdivision, into numerically encoded sub-lexical representations” says the same thing with high perplexity. For the model, the first text is easier to extract, cite and rephrase. The second requires more work and has more room for error.

A necessary caveat: the most recent research notes that “perplexity can be unreliable, especially when input length varies” (Wang et al., 2026). This doesn’t invalidate the metric — it means it shouldn’t be used as an absolute number, but as a relative signal. At equal length, between two texts that say the same thing, the one with lower perplexity is processed better.
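
To make the idea of a relative signal concrete, here is a minimal sketch of how perplexity is computed from per-token probabilities. The function name `perplexity` and the probability values are assumptions invented for illustration; real values would come from a specific language model and are not measured here.

```python
import math
from typing import Sequence


def perplexity(token_probs: Sequence[float]) -> float:
    """Perplexity of a text, given the probability a model assigned to each token.

    Computed as exp of the average negative log-probability, so the
    result is normalized by the number of tokens.
    """
    if not token_probs:
        raise ValueError("need at least one token probability")
    if any(p <= 0.0 or p > 1.0 for p in token_probs):
        raise ValueError("probabilities must be in (0, 1]")
    avg_neg_log_prob = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_neg_log_prob)


# Hypothetical per-token probabilities, invented for illustration only.
plain_sentence = [0.5, 0.5, 0.5, 0.5]          # every token is fairly expected
convoluted_sentence = [0.5, 0.05, 0.5, 0.05]   # two tokens surprise the model

print(perplexity(plain_sentence))
print(perplexity(convoluted_sentence))
```

Both lists have the same number of tokens on purpose: in line with the caveat above, the comparison only makes sense between texts of similar length that say the same thing.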

Why this translates into AI visibility

The link between perplexity and citation is not as direct as a classic SEO ranking factor. It’s more subtle, and it works on three levels.

More precise extraction. Models retrieve information by splitting documents into chunks. A chunk with clear language provides precise, isolatable information. A chunk with convoluted language offers information embedded in syntactic noise — the model can extract it, but with less certainty.

Rephrasing with less distortion. When the model paraphrases a source, it transforms the original text into something suited to the answer it’s building. A clear text gets rephrased with less risk of losing the original meaning. A high-perplexity text is more likely to be poorly paraphrased, resulting in a loss of attribution or — worse — the introduction of an inaccuracy.

Implicit quality signal. During reinforcement learning from human feedback (RLHF), evaluators tend to prefer clear and direct answers. The model learns to associate clarity with quality. It follows that, when it chooses between comparable sources, it tends to favor those with linguistic characteristics similar to the answers it has learned to produce.
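
The chunking step behind the first level can be sketched as follows. This is a simplified illustration, assuming a fixed word budget per chunk and a naive sentence splitter. The function name `chunk_by_sentences` and the `max_words` parameter are hypothetical, and real retrieval systems use their own strategies, usually based on tokens rather than words.

```python
import re
from typing import List


def chunk_by_sentences(text: str, max_words: int = 40) -> List[str]:
    """Group whole sentences into chunks of at most `max_words` words.

    A sentence longer than the budget becomes its own oversized chunk,
    because this sketch never splits inside a sentence.
    """
    # Naive split: break after ., ! or ? when followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks: List[str] = []
    current: List[str] = []
    current_words = 0
    for sentence in sentences:
        n_words = len(sentence.split())
        # Close the current chunk if this sentence would exceed the budget.
        if current and current_words + n_words > max_words:
            chunks.append(" ".join(current))
            current, current_words = [], 0
        current.append(sentence)
        current_words += n_words
    if current:
        chunks.append(" ".join(current))
    return chunks


sample = (
    "Tokenization converts text into numerical units. "
    "Models read those units, not words. "
    "Short sentences keep each chunk self-contained."
)
for chunk in chunk_by_sentences(sample, max_words=12):
    print(chunk)
```

Short, self-contained sentences pack cleanly into chunks. A long sentence that mixes several concepts cannot be separated by a splitter like this one, so all of its concepts land in the same chunk.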

The paradox is that many professionals write in a complex way, convinced that it makes them look more authoritative. For the AI it’s the opposite: unnecessary linguistic complexity is a signal of low quality, not of competence.

There’s a second paradox, more subtle. The companies that invest the most in producing content — long guides, whitepapers, industry reports — tend to write in a more elaborate way. The more formal the document, the more the prose gets stuffed with nominalizations, passive forms and bureaucratic structures. The result is that the most expensive content to produce is often the content with the highest perplexity — and therefore the content that AI struggles most to use.

How to apply this to your content

You don’t need technical tools to measure perplexity. You need practical criteria to reduce it systematically.

One sentence, one concept. If a sentence contains two distinct concepts, break it into two. AI handles short sentences with a subject-verb-object structure better.
Jargon only when it adds precision. “Process” is better than “systemic operational process.” The technical term is justified when the generic term would lose precision — not to look more expert.
A predictable structure for each section. Introduce the concept → develop it → draw the practical implication. The model is trained on texts with this recurring structure: it recognizes it, processes it, extracts it better.
Active voice instead of passive voice. “AI selects the sources” has lower perplexity than “The sources are selected by the AI through a selection process.” The active voice is more predictable for the model.
The read-aloud test. Any sentence that needs to be re-read to be understood has perplexity that’s too high. If you have to go back, expect the model to struggle too, and with less tolerance than you.
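
If you like a quick automated first pass before reading aloud, some of these criteria can be approximated with a rough script. The sketch below flags long sentences and likely passive constructions. It does not compute perplexity. The function name `flag_sentences`, the 25-word threshold and the passive-voice pattern are assumptions chosen for illustration, and the pattern will miss irregular participles and produce some false positives.

```python
import re
from typing import Dict, List

# Crude passive-voice pattern: a form of "to be" followed by a word ending in -ed.
# Irregular participles ("written", "shown") are not detected.
PASSIVE_PATTERN = re.compile(
    r"\b(?:is|are|was|were|be|been|being)\s+\w+ed\b", re.IGNORECASE
)


def flag_sentences(text: str, max_words: int = 25) -> List[Dict[str, object]]:
    """Return one entry per sentence that is long or looks passive."""
    # Naive split: break after ., ! or ? when followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    flagged: List[Dict[str, object]] = []
    for sentence in sentences:
        reasons: List[str] = []
        if len(sentence.split()) > max_words:
            reasons.append("long")
        if PASSIVE_PATTERN.search(sentence):
            reasons.append("passive")
        if reasons:
            flagged.append({"sentence": sentence, "reasons": reasons})
    return flagged


draft = (
    "AI selects the sources. "
    "The sources are selected by the AI through a selection process."
)
for item in flag_sentences(draft):
    print(item["reasons"], item["sentence"])
```

Treat the output as a list of sentences worth a second look, not as a verdict: the read-aloud test remains the final check.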

A practical audit you can do today

Take the three pieces of content on your site that you want to surface in AI answers. For each one, apply this four-step process:

Read every section aloud, sentence by sentence. Mark every sentence that doesn’t flow on the first listen.
Simplify every sentence you marked until you can read it without hesitation.
Break every paragraph with more than three distinct concepts into separate paragraphs, one per concept.
Add an explanation to every technical term that lacks one, in the same sentence or in the next one.

After this audit, your content will have a structurally lower perplexity. Not because you dumbed it down, but because you made it more processable. There’s a difference: clear content on a complex topic is still expert content. Convoluted content on a simple topic doesn’t become expert because of it.

Clarity is the minimum technical requirement for existing in AI answers. The other metrics — like the log-probability we’ll dig into in the next article — work on a text the model can already process. If perplexity is high, those metrics become irrelevant.

Start here. If you already work with a consultant on content strategy or AI visibility, ask them to add a check on syntactic complexity to the review process — not as a stylistic element, but as a technical requirement. It’s not a marginal variable: it’s the entry filter everything else has to pass through.

Want to understand why your content doesn’t appear in AI answers even when you’re the recognized expert in your industry?

Follow me in my upcoming deep dives!

The author

Roberto Serra

SEO consultant for over 15 years, founder of the Serra SEO Agency (RAANK). He helps multinationals and SMEs stay visible where search is moving: ChatGPT, Perplexity, Gemini and Google's AI Overviews.
