Authority and Credibility for AI

Spontaneous user recommendations outweigh any content you create

Is nobody in your industry's forums or on Reddit talking about your brand unprompted? Recommendations between real users carry disproportionate weight with AI systems: they are the one form of validation a company cannot produce for itself, which is why models appear to treat them as a highly reliable signal. A single organic conversation that mentions you can be worth more than dozens of self-promotional pieces. You cannot build those conversations directly, but you can create the conditions for them to happen.

You can write the best content in your industry. You can optimize every page, polish every technical detail, publish consistently. But when a user on Reddit writes “I use X and I’m really happy with it, I recommend it” — that single comment, for a language model, can be worth more than your entire company blog put together.

This is not hyperbole. It’s the result of how the models are built, of what they read during training, and of how they decide whom to trust.

Reddit is in the training data. Literally

The starting point is a documented fact. When it comes to what a language model’s training dataset contains, Reddit is not a marginal presence — it is an explicit component.

Gao et al. (2025) confirm it in their survey.

Reddit is listed alongside Wikipedia and books. It is not a secondary source pulled from a generic Common Crawl — it is a dedicated component of the dataset. This means that millions of threads, comments, recommendations and discussions between real users were processed by the model during training. Every “I recommend this service,” every “I tried three alternatives and the best is this one,” every thread comparing competitors — all of this has entered the model’s statistical memory.

And this is where it gets interesting for anyone who wants to show up in AI answers.

Why the model trusts spontaneous recommendations

To understand the weight these recommendations carry, we need to look at a second mechanism: how models learn to tell good answers from bad ones.

Yang et al. (2025) describe the process:

“RLHF enables the incorporation of human preferences into model training by using a reward model to guide reinforcement learning optimization.” (Yang et al., 2025)

RLHF stands for Reinforcement Learning from Human Feedback. In practice, after the initial training the model is fine-tuned using the preferences of real people: annotators evaluate pairs of answers and indicate which is better. The model learns to produce answers that reflect these human preferences.
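To make the "pairs of answers" step concrete, here is a minimal sketch of how a reward model can be scored against one annotator judgment. It assumes the Bradley-Terry formulation, which is a common choice for this step but is not something the quoted paper is claimed here to prescribe. The function names and the reward values are hypothetical.

```python
import math


def preference_probability(reward_chosen: float, reward_rejected: float) -> float:
    """Bradley-Terry probability that the annotator-preferred answer wins,
    given the scalar rewards the reward model assigns to each answer."""
    return 1.0 / (1.0 + math.exp(-(reward_chosen - reward_rejected)))


def pairwise_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Negative log-likelihood of the annotator's choice.
    The loss is low when the reward model ranks the pair the way the human did."""
    return -math.log(preference_probability(reward_chosen, reward_rejected))


# Hypothetical rewards for one annotated pair of answers.
loss_when_model_agrees = pairwise_loss(reward_chosen=2.0, reward_rejected=0.5)
loss_when_model_disagrees = pairwise_loss(reward_chosen=0.5, reward_rejected=2.0)

print(f"agrees with annotator:    {loss_when_model_agrees:.3f}")
print(f"disagrees with annotator: {loss_when_model_disagrees:.3f}")
```

Training pushes the reward model toward the lower-loss case, and the language model is then optimized against those rewards. That is the sense in which the model "learns" human preferences.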

Now, here is the deduction — and I’m presenting it as such, because no paper states this step explicitly. Spontaneous recommendations in online communities are, in effect, expressions of human preference in their purest form. When hundreds of users on a forum recommend the same tool, that is a signal of aggregate preference. The model processed that pattern during training. And then, during RLHF, it learned that answers aligned with human preferences are the “right” ones.

From this it follows that organic recommendations in communities work on two levels: they are in the training data as information, and they are consistent with the type of answer the model was trained to prefer. A double reinforcement that no branded content can replicate.

The bias toward third-party sources is documented

If you’re wondering whether this deduction holds up, there’s an empirical data point that supports it quite clearly.

Chen et al. (2025) analyzed how AI engines select sources in their answers:

“AI Search exhibit a systematic and overwhelming bias towards Earned media — third-party, authoritative sources — over Brand-owned and Social content.” (Chen et al., 2025)

“Systematic and overwhelming bias.” It’s not a slight preference — it’s a massive skew. AI engines favor earned sources (spontaneous mentions, third-party articles, organic recommendations) over the content a brand produces for itself. Your company blog, your services page, your self-published case study — all of this falls under the “Brand-owned content” that the system tends to underweight.
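If you want to check this skew on your own queries, one rough approach is to collect the URLs an AI engine cites and bucket them by type. The sketch below is an illustration under stated assumptions, not the method used by Chen et al.: the three buckets borrow their names from the quote, the domain lists are hypothetical placeholders you would fill in yourself, and anything that is neither yours nor on your social list is counted as earned.

```python
from collections import Counter
from urllib.parse import urlparse

CATEGORIES = ("earned", "brand", "social")


def classify(url: str, brand_domains: set[str], social_domains: set[str]) -> str:
    """Bucket one cited URL. Assumes absolute URLs (with a scheme);
    anything without a parseable host is returned as 'unknown'."""
    host = (urlparse(url).hostname or "").lower()
    if not host:
        return "unknown"
    if host.startswith("www."):
        host = host[4:]

    def matches(domains: set[str]) -> bool:
        # Match the domain itself or any of its subdomains.
        return any(host == d or host.endswith("." + d) for d in domains)

    if matches(brand_domains):
        return "brand"
    if matches(social_domains):
        return "social"
    return "earned"  # assumption: every other third-party host counts as earned


def source_mix(urls: list[str], brand_domains: set[str], social_domains: set[str]) -> dict[str, float]:
    """Share of each category among the URLs that could be classified."""
    counts = Counter(classify(u, brand_domains, social_domains) for u in urls)
    total = sum(counts[c] for c in CATEGORIES)
    return {c: (counts[c] / total if total else 0.0) for c in CATEGORIES}


# Hypothetical citations collected from one AI answer.
cited = [
    "https://www.brand.example/blog/post",
    "https://news.example.org/review",
    "https://forum.example.net/thread/42",
    "https://social.example/u/someone",
]
print(source_mix(cited, brand_domains={"brand.example"}, social_domains={"social.example"}))
```

Where a given platform belongs (earned or social) is a judgment call the code leaves to whoever fills in the lists.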

Recommendations in communities, on the other hand, are earned media in its purest form. No one commissioned them. No one paid for them. They are real users who, spontaneously, wrote that your product or service works. For the model, that kind of signal has structurally superior credibility.

Where the game is played

Not all communities carry the same weight. Reddit is the most documented case because it is explicitly in the training datasets, but the principle extends to any platform that is indexed and “crawled” regularly.

Vertical industry forums — the ones where the professionals of a specific niche gather — are particularly powerful because they combine two signals: the spontaneous recommendation and topical consistency. If a professional in your field recommends you on a forum dedicated to exactly your area, the semantic context is perfectly aligned. I discussed this when analyzing how peer endorsement works: the weight of a mention grows when it comes from a topically consistent context.

Professional communities on LinkedIn, industry Slack groups, themed Discord communities — anything that produces public, indexable content contributes to the signal. What is not indexable (private groups, closed chats) does not enter the training data and carries no weight.

Mentions on Wikipedia also follow a similar logic but with a different multiplier, as I explored in the article on Wikipedia as an authority hub. The difference is that Wikipedia brings institutional trust, while communities bring social trust — and you need both.

The mistake of those who look for shortcuts

The temptation is obvious: if spontaneous recommendations carry so much weight, just generate them artificially. Fake accounts on Reddit, fabricated reviews in forums, coordinated astroturfing campaigns.

The problem is that it doesn’t work. And not for ethical reasons — for mechanical ones.

The moderation systems of the major communities are sophisticated. Reddit has spam-detection systems that identify patterns of unnatural behavior. Industry forums have moderators who recognize new accounts that show up only to promote something. And even when artificial content gets past the platform’s filters, it has different linguistic characteristics from a genuine recommendation — less personal context, fewer specific details, more generic formulas.

Moreover, the model has processed millions of examples of both kinds during training. From this it follows — and this too is a logical deduction — that the linguistic pattern of an authentic recommendation and that of a fabricated one occupy different regions in the vector space. The first is associated with genuine discussion contexts, the second with spam contexts.

How to create the right conditions

The first step is a reality check. Search for your brand name on Reddit, search in your industry’s forums, search in the communities where your customers gather. Has anyone ever recommended you spontaneously? If the answer is no, the problem isn’t AI visibility — it’s product, service, customer experience. No amount of optimization can make up for the absence of a reason why people should recommend you.

If you find a few mentions, analyze the context. In which situations do people recommend you? For which specific problem? Those answers tell you exactly which aspect of your business generates word of mouth — and that is what you should double down on.
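The reality check and the context analysis can be done by hand, but a small script helps once you have exported a few hundred posts. The following is a minimal sketch that works on text you have already collected (it calls no platform API). The brand name "Acme", the sample posts and the topic keywords are all hypothetical.

```python
import re
from collections import Counter


def find_mentions(posts: list[dict], brand: str) -> list[dict]:
    """Return the posts whose text mentions the brand as a whole word,
    case-insensitively."""
    pattern = re.compile(rf"\b{re.escape(brand)}\b", re.IGNORECASE)
    return [p for p in posts if pattern.search(p["text"])]


def context_counts(mentions: list[dict], topics: dict[str, list[str]]) -> Counter:
    """Count, per topic, how many mentions contain at least one of its keywords."""
    counts: Counter = Counter()
    for post in mentions:
        text = post["text"].lower()
        for topic, keywords in topics.items():
            if any(keyword in text for keyword in keywords):
                counts[topic] += 1
    return counts


# Hypothetical sample of posts gathered from public threads.
posts = [
    {"source": "forum", "text": "I switched to Acme for invoicing and it saved me hours."},
    {"source": "reddit", "text": "Acme support replied in ten minutes, would recommend."},
    {"source": "reddit", "text": "Has anyone tried OtherTool for this?"},
    {"source": "forum", "text": "acmecorp is a different company, ignore it."},
]
topics = {
    "invoicing": ["invoice", "invoicing"],
    "support": ["support", "customer service"],
}

mentions = find_mentions(posts, "Acme")
print(len(mentions), "mentions")
print(context_counts(mentions, topics))
```

Simple keyword matching will miss paraphrases and sarcasm, so treat the counts as a pointer to which threads to read, not as a measurement.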

Then there are concrete actions that lower the barrier. Reply publicly in the threads where your industry is being discussed — not to promote yourself, but to be useful. Make it easy for satisfied customers to share their experience. Be present where the conversation happens, not only where you publish.

And here we return to the point I analyzed when discussing expertise validation: credibility is not declared, it is demonstrated. In communities the same principle applies, amplified. Those who participate, help, answer (and are then recommended by others) build a signal that no advertising campaign can replicate.

The signal you cannot buy

There is a precise hierarchy in the way AI weighs sources, and I discuss it in the article on the source tier hierarchy. Recommendations in communities occupy a particular position in it. They lack the institutional trust of a .gov site or a government citation, but they have something those sources cannot offer: the perceived authenticity of a user speaking from direct experience.

In a context where AI systematically favors third-party sources over self-produced ones, spontaneous recommendations are the hardest signal to build and the hardest to ignore. You cannot buy them, you cannot fabricate them, you cannot accelerate them with a budget. You can only earn them.

And if you earn them, the model has already read them.

The author

Roberto Serra

Photo: Roberto Serra at the Senate of the Republic, Palazzo Giustiniani, conference “The power of artificial intelligence”

SEO consultant for over 15 years, founder of the Serra SEO Agency (RAANK). He helps multinationals and SMEs stay visible where search is moving: ChatGPT, Perplexity, Gemini and Google's AI Overviews.
