Try asking ChatGPT about your company: does it answer with 'might be a good option' or with 'is considered a benchmark in the industry'? There's a huge difference between these two answers — and your potential client feels it. That 'might' signals uncertainty and is caused by specific signals missing from your digital profile: identifying and fixing them is the most direct way to get cited again with the confidence your company deserves.
There’s a difference between “X is the benchmark in the industry” and “X might be a good option”. Both are mentions. Only one is an endorsement.
When a model talks about your brand with reservations — “might”, “seems”, “according to some sources”, “I’m not certain but” — it isn’t being cautious out of politeness. It’s signaling that it doesn’t trust you that much. And that trust has a direct impact on your AI visibility: low trust means caveats, caveats mean a perception of uncertainty, and a perception of uncertainty means the reader goes looking elsewhere.
In this article I’ll explain the technical mechanism, the concrete causes, and what to do to move the needle.
How trust calibration works
At every step of generation, the model assigns a probability to each candidate continuation, and those probabilities reflect how strongly its training data and context support what it is about to say. It’s not an abstract black box: it’s a measurable process, and research has documented it with increasing precision.
The problem is that this calibration isn’t always accurate: models are often more confident than the evidence warrants. Zhou and Hu (2024) trace this to how models are trained: “This overconfidence is closely linked to an overreliance on accuracy-oriented training metrics.”
The model isn’t calibrated on the real world: it’s calibrated on the signals it received during training. If those signals were abundant and consistent about you, trust is high. If they were scarce or contradictory, the model compensates with caveats.
From this follows an operational deduction: the language the model uses when talking about your brand is a diagnostic indicator. Not random. Not stylistic. It’s the visible surface of a probabilistic process unfolding underneath.
How trust becomes language
When trust exceeds a certain threshold, the model produces assertive statements. Below that threshold, it inserts caveats or — in extreme cases — avoids the recommendation.
In the context of AI Search, this manifests as follows:
- High trust: “Studio Verdi is one of the leading tax consulting firms in Sardinia” — the model knows, it states, the reader trusts
- Medium trust: “Studio Verdi seems to offer tax consulting services” — the model doubts, the reader feels the reservation
- Low trust: “I don’t have enough information, but Studio Verdi might operate in this sector” — the model retreats, the reader leaves
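The three tiers above can be pictured as a simple threshold rule. This is only a toy sketch: real models have no explicit switch like this, and the function name, the trust score, and the cut-offs (0.8 and 0.5) are assumptions chosen for illustration.

```python
def phrase_recommendation(brand: str, claim: str, trust: float,
                          assertive_at: float = 0.8,
                          hedged_at: float = 0.5) -> str:
    """Toy mapping from a trust score in [0, 1] to a phrasing tier.

    The thresholds are hypothetical; they only illustrate the idea that
    lower trust produces more cautious wording.
    """
    if not 0.0 <= trust <= 1.0:
        raise ValueError("trust must be between 0 and 1")
    if trust >= assertive_at:
        return f"{brand} is {claim}."          # high trust: plain assertion
    if trust >= hedged_at:
        return f"{brand} seems to be {claim}."  # medium trust: hedge
    # low trust: explicit disclaimer plus a weak modal
    return f"I don't have enough information, but {brand} might be {claim}."


print(phrase_recommendation("Studio Verdi", "a tax consulting firm", 0.9))
print(phrase_recommendation("Studio Verdi", "a tax consulting firm", 0.3))
```

The same brand and the same claim come out as an endorsement or as a doubt depending only on the score behind them.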
The mechanism is the same one that governs hallucinations, seen from a different angle. When trust is low and the model decides to answer anyway, the risk of making things up increases. When it’s high, answers become more anchored to real data. I covered this in detail in “Does the AI make things up about your brand? It happens when it can’t find solid data”: trust and hallucination are two sides of the same coin.
The role of retrieval: where trust is born
In RAG (Retrieval-Augmented Generation) systems — the ones used by Perplexity, ChatGPT Search, and increasingly the new generation of models — trust doesn’t come only from the training data. It also comes from what the system retrieves in real time.
Gong et al. (2026) describe how it works internally: “Each retained passage is associated with a consistency confidence used downstream.” Every retrieved fragment of text carries a consistency score with it. If the fragments that talk about you are consistent with each other (same name, same specialization, same positioning), aggregate trust rises. If the information is fragmented or contradictory, trust drops and the model signals it in its language.
This adds a layer of complexity compared to training data alone: it’s not enough to have been mentioned in the past, you need to be mentioned consistently across multiple currently accessible sources.
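To make the idea of a consistency score concrete, here is a minimal sketch that measures how many sources agree on a single fact. It is a stand-in, not the scoring used by Gong et al. or by any real RAG system: the function name and the majority-share formula are assumptions, and the sample values are invented.

```python
from collections import Counter


def consistency_score(values: list[str]) -> float:
    """Share of sources that agree with the most common value.

    1.0 means every source says the same thing; lower values mean
    the sources contradict each other. Empty input scores 0.0.
    """
    normalized = [v.strip().lower() for v in values if v.strip()]
    if not normalized:
        return 0.0
    most_common_count = Counter(normalized).most_common(1)[0][1]
    return most_common_count / len(normalized)


# Hypothetical founding years found on four different sources
founding_years = ["2008", "2008", "2010", "2008"]
print(consistency_score(founding_years))  # 0.75
```

One stray value out of four is enough to pull the score down to 0.75, which is the kind of drop that shows up as a caveat in the answer.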
Active retrieval: when the model goes looking
There’s a further dynamic that few people consider. In certain advanced systems, the model doesn’t passively wait to have the information: it actively evaluates how much it trusts what it’s generating and, if trust drops below a threshold, it goes looking for external confirmations.
Minaee et al. (2025) describe this behavior in the context of FLARE systems: “In FLARE, each sentence or segment generated by the LLM is evaluated for confidence.” If the sentence the model is generating about you doesn’t reach the threshold, the system triggers a retrieval query to look for confirmations.
From this follows a practical consequence: if your online sources don’t provide the answers to those confirmation queries, the model doesn’t find what it’s looking for. It either inserts a caveat or abandons the recommendation. Not because you don’t exist, but because your signal isn’t strong enough to survive this active verification process.
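The check-then-retrieve loop described above can be sketched in a few lines. This is a simplified illustration, assuming that a sentence counts as low-confidence when any of its token probabilities falls below a threshold. The threshold of 0.4, the function names, and the outcome labels are hypothetical, and the `retrieve` callable stands in for a real search step.

```python
from typing import Callable


def low_confidence(token_probs: list[float], threshold: float) -> bool:
    """True if any token in the draft sentence falls below the threshold."""
    return any(p < threshold for p in token_probs)


def verify_sentence(sentence: str,
                    token_probs: list[float],
                    retrieve: Callable[[str], list[str]],
                    threshold: float = 0.4) -> tuple[str, str]:
    """Return the sentence with a label describing what happens to it.

    "assert": the draft was confident, keep it as is.
    "regenerate_with_evidence": confirmations were found online.
    "hedge": nothing was found, so a caveat (or no mention) follows.
    """
    if not low_confidence(token_probs, threshold):
        return sentence, "assert"
    evidence = retrieve(sentence)  # the draft sentence doubles as the query
    if evidence:
        return sentence, "regenerate_with_evidence"
    return sentence, "hedge"


# Hypothetical retriever that finds nothing about the brand
def empty_search(query: str) -> list[str]:
    return []


print(verify_sentence("Studio Verdi specializes in SME taxation.",
                      [0.9, 0.8, 0.3, 0.7], empty_search))
```

With a retriever that returns nothing, the sentence ends up labeled "hedge": the brand exists, but nothing online confirmed the claim at the moment the system asked.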
This is directly connected to how models structure step-by-step reasoning before answering, a mechanism I analyzed in “Step-by-step guides: why the AI loves them (and how to write them)”.
The concrete causes of low trust
In the research world, calibration is treated as a technical problem of the model. For those working on their own AI visibility, the problem is the reverse: you have to build the conditions for the model to be correctly calibrated about you.
Low trust has identifiable causes:
Few sources. The model has seen your name in few contexts. It doesn’t have enough signals to build a stable representation.
Low-authority sources. Ten mentions on unknown blogs weigh less than one mention in a recognized industry publication. The authority of the sources enters directly into the calculation.
Contradictory data. If your website says “founded in 2008” and your LinkedIn page says “founded in 2010”, the model detects the contradiction. When in doubt, it lowers trust or uses a caveat.
Lack of consensus among peers. If other experts in the field never cite you, the model has no cross-validation signals. Trust remains localized to a few sources.
Every caveat the model uses when talking about you is a clue as to which of these problems is being triggered.
The caveat test: how to measure your situation
Before intervening, you need to know where you stand. The test is simple and takes five minutes.
- Open ChatGPT or Perplexity
- Ask five questions about your brand: what it does, what it specializes in, who it’s right for, where it operates, why choose it
- For each answer, count the explicit caveats: “might”, “seems”, “according to some sources”, “probably”, “I’m not certain”
- Do the same thing with a direct competitor
The comparison is revealing. If the competitor gets assertive statements and you get caveats, the difference in trust is measurable — and it impacts the perception of whoever is deciding who to turn to.
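If you prefer not to count by hand, the tally can be automated once you have pasted the answers into a script. The sketch below assumes the caveat list from the test above and simple whole-phrase matching. The sample answers and the competitor name "Studio Rossi" are invented for illustration.

```python
import re

# Caveat phrases from the test above; extend the list as needed
CAVEATS = ["might", "seems", "according to some sources",
           "probably", "i'm not certain"]


def count_caveats(answer: str) -> int:
    """Count caveat phrases in one answer (case-insensitive, whole words)."""
    text = answer.lower().replace("’", "'")  # normalize curly apostrophes
    return sum(len(re.findall(rf"\b{re.escape(c)}\b", text)) for c in CAVEATS)


def caveat_rate(answers: list[str]) -> float:
    """Average number of caveats per answer; 0.0 for an empty list."""
    if not answers:
        return 0.0
    return sum(count_caveats(a) for a in answers) / len(answers)


# Invented answers, for illustration only
yours = ["Studio Verdi seems to offer tax consulting services.",
         "I'm not certain, but it might operate in Sardinia."]
competitor = ["Studio Rossi is a leading tax consulting firm."]

print(caveat_rate(yours))       # 1.5
print(caveat_rate(competitor))  # 0.0
```

Run it on the same five questions for both brands, and repeat it after a few months to see whether the gap is closing.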
If the models struggle to retrieve consistent information about you even with integrated tools, the problem extends beyond the training data. I dug deeper into the dynamic of active retrieval in “AI Agents and APIs: your business can become a service the AI calls”.
What to do to raise trust
The direction is clear: you need to increase the volume, consistency, and authority of the signals the model finds about you.
Increase authoritative sources. Every industry publication, every interview, every citation on a recognized site is a signal that enters the calculation. The goal is for at least three to five independent sources to say the same thing about you with the same keywords.
Align every data point. Same name across all platforms. Same year of founding. Specialization described the same way. Contradictions lower trust disproportionately to their apparent weight.
Build consensus among peers. If other professionals in the field cite you in their content — articles, interviews, recommendations — the model detects a consensus. These mentions carry a different weight than an isolated mention.
Write content that answers confirmation queries. If the model, not finding certainty about you, goes looking for confirmations online, you need to be the answer to those searches. Clear content, with precise claims, on pages with recognizable authority.
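The alignment step is the easiest one to check mechanically. As a minimal sketch, assuming you have copied the key facts from each platform into a dictionary by hand, the function below lists every field where the platforms disagree. The platform names, field names, and values are hypothetical.

```python
def find_mismatches(profiles: dict[str, dict[str, str]]) -> dict[str, dict[str, str]]:
    """Return the fields whose values differ across platforms.

    Values are compared after trimming spaces and lowercasing, so
    "Studio Verdi" and " studio verdi" count as the same.
    """
    fields = {field for data in profiles.values() for field in data}
    mismatches = {}
    for field in sorted(fields):
        values = {platform: data[field]
                  for platform, data in profiles.items() if field in data}
        if len({v.strip().lower() for v in values.values()}) > 1:
            mismatches[field] = values
    return mismatches


# Hypothetical data copied by hand from two profiles
profiles = {
    "website": {"name": "Studio Verdi", "founded": "2008"},
    "linkedin": {"name": "Studio Verdi", "founded": "2010"},
}
print(find_mismatches(profiles))
# {'founded': {'website': '2008', 'linkedin': '2010'}}
```

Every field that appears in the output is a contradiction to fix at the source before working on new mentions.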
All of this connects to the work on editorial planning: models don’t choose who to mention at random, they follow a logic of topical coverage that I described in “Cover the entire workflow or the AI skips you for someone who does”.
The result that counts
Trust calibration isn’t an academic metric. It’s the difference between a recommendation that converts and one that raises a doubt.
“Studio Verdi is the benchmark for tax consulting for SMEs in Sardinia” reassures the reader. “Studio Verdi might be an option in the tax sector” plants a doubt.
The model isn’t judging the quality of your work. It’s calculating how much certainty it has about the data concerning you. That certainty is built, methodically, on the right signals.