Your competitor gets cited by the AI five times with clear-cut statements; you get cited ten times with 'might', 'seems', 'some consider it'. For the customer reading it, the difference is enormous. It isn't a question of how many times you show up: it's a question of how you show up. The tone the AI uses to talk about you reflects the quality of the sources it has on you, and it can be changed, but only if you measure it first.
The question isn’t whether the AI cites you. It’s how it cites you.
You get named, but with “seems to be”, “could be considered”, “probably one of”. For whoever is reading the ChatGPT or Perplexity answer, you’re a questionable brand. A name tossed in with reservations. And that “probably” translates into zero clicks, zero quote requests, zero incoming emails.
To truly measure your visibility in AI answers, counting how many times you show up isn’t enough. You have to read the sentiment with which you show up. Let me explain it through a test I ran on 50 AI answers, and through what academic research says on the subject.
The sentiment of an AI mention is worth more than the mention itself
Think of a producer of PDO burrata in Andria. When a tourist asks ChatGPT “where do I buy real burrata in Puglia”, the AI can answer in three different ways about the same brand:
- “Dairy X is the historic benchmark for PDO Andria burrata”
- “Dairy X produces burrata, it seems appreciated by locals”
- “Among the options there might also be dairy X”
Three mentions. Three completely different conversions. The first generates direct bookings, the second generates curiosity, the third gets ignored.
What the research says about sentiment analysis with LLMs
In the world of NLP research, sentiment analysis has existed for over twenty years. But with the arrival of large language models it has become both a tool for measuring AI answers and an intrinsic behavior of those very answers.
“Sentiment analysis (SA) has been a longstanding research area in natural language processing. With the recent advent of large language models (LLMs), there is great potential for their employment on SA problems.”
Translated: sentiment analysis is a mature field of computational linguistics, with established methods for classifying whether a text expresses a positive, negative, or neutral opinion.
The practical consequence for you is simple: today you can use the same LLMs that cite you (ChatGPT, Claude, Gemini) to classify the sentiment with which they cite you. You don’t need a data scientist, you need a method.
The same paper adds an important point:
“However, the extent to which current LLMs can be leveraged for different sentiment analysis tasks remains unclear.”
In plain terms: large language models are already used to analyze product reviews and social posts, but how reliable they are on every sentiment task is still an open question. Using them to analyze the answers they themselves give about a brand is a natural extension, because these are texts of the same nature: opinions expressed in natural language. Just treat the labels they produce as indicative, not as ground truth.
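If you want to hand the classification to an LLM, the part worth standardizing is the prompt and the check on what comes back. Here is a minimal sketch, assuming the four categories defined later in this article; `build_prompt` and `parse_label` are hypothetical helper names, and the code deliberately does not call any specific vendor API: you paste the prompt into the chat, or send it with whatever client you already use.

```python
# The four sentiment categories used in this article.
LABELS = ("positive", "neutral", "conditional", "negative")

def build_prompt(brand, answer):
    """Build a classification prompt for one AI answer (hypothetical wording)."""
    return (
        f"Below is an AI-generated answer that mentions the brand '{brand}'.\n"
        "Classify how the brand is described using exactly one label: "
        + ", ".join(LABELS) + ".\n"
        "Use 'conditional' when the mention is hedged (seems, might, probably).\n"
        "Reply with the label only.\n\n"
        f"Answer:\n{answer}"
    )

def parse_label(reply):
    """Return a valid label, or None if the model replied with anything else."""
    cleaned = reply.strip().lower().strip(".'\"")
    return cleaned if cleaned in LABELS else None
```

Rejecting anything that is not one of the four labels is the design choice that matters here: a reply you cannot parse should be re-read by hand, not guessed.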
Why sentiment sits downstream of the citation, but upstream of the conversion
If you’ve moved well on the earlier stages — perceived authority, recognition as an entity, weight of citations — the AI starts to mention you. We’ve already covered this when discussing E-E-A-T for AI and backlinks as a citation proxy.
But the citation only serves to get you into the pool of candidates. Sentiment is what decides who gets chosen in the end. A competitor cited 10 times with neutral sentiment loses to a brand cited 5 times with clearly positive sentiment. I’m telling you this because I measured it.
The test you can run on your brand in just 30 minutes
Open ChatGPT or Perplexity. Prepare 10 realistic queries for your industry — the ones a customer would ask, not the ones you would ask. For a dairy in Andria they’d be things like:
- “best PDO burrata in Andria”
- “where to buy buffalo mozzarella in Puglia”
- “historic burrata producers Bari Andria”
- “Andria dairies guided tours”
For every answer that cites you, classify the sentiment into one of these four categories:
- Explicitly positive: strong adjectives (“excellent”, “benchmark”, “leader”)
- Neutral descriptive: no qualifying adjectives, just description
- Conditional: presence of “seems”, “might”, “probably”, “should be”
- Negative or critical: presence of “but”, “however”, “limited”, “expensive without justification”
Conditional sentiment is the real red flag. It tells you the AI found mentions of you but not enough consistent signals to express itself with confidence. It’s often worse than not being cited: it means you have a weak, fragile visibility that the next competitor with stronger signals will sweep away.
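If you'd rather have a first pass done by a script before reading the answers yourself, the four categories can be sketched as a keyword check. This is a rough sketch, not a calibrated classifier: the marker lists come from the categories above (plus "could be", which appears in the examples elsewhere in this article), the function names are hypothetical, and the precedence order (criticism, then hedging, then praise) is my assumption. Markers like "but" are noisy, so treat the output as a suggestion to review by hand.

```python
import re

# Marker lists taken from the four categories above; extend them for your sector.
POSITIVE = ("excellent", "benchmark", "leader")
CONDITIONAL = ("seems", "might", "probably", "should be", "could be")
NEGATIVE = ("but", "however", "limited", "expensive without justification")

def _has_marker(text, markers):
    # Whole-word match, so "but" does not fire inside words like "butter".
    return any(re.search(r"\b" + re.escape(m) + r"\b", text) for m in markers)

def classify_mention(sentence):
    """Assign one of the four categories to a single mention."""
    text = sentence.lower()
    # Precedence is an assumption: criticism first, then hedging, then praise.
    if _has_marker(text, NEGATIVE):
        return "negative"
    if _has_marker(text, CONDITIONAL):
        return "conditional"
    if _has_marker(text, POSITIVE):
        return "positive"
    return "neutral"
```

Run on the three burrata examples from earlier, this labels the "historic benchmark" sentence as positive and the "seems appreciated" and "there might also be" sentences as conditional.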
The test I ran: 50 AI answers across 5 Italian SME sectors
I took 5 sectors (Puglian PDO dairies, Tuscan natural wine shops, Marche shoe factories, Verona notary firms, Neapolitan coffee roasters). For each sector I ran 10 realistic queries on ChatGPT and Perplexity, collecting 50 answers in total in which a specific brand was named.
I manually classified the sentiment of each mention. The pattern that emerged:
- Explicitly positive sentiment: 18 out of 50 (36%)
- Neutral descriptive sentiment: 19 out of 50 (38%)
- Conditional sentiment: 11 out of 50 (22%)
- Negative sentiment: 2 out of 50 (4%)
The figure that surprised me: 22% of mentions were conditional. More than one mention in five came with reservations. And the brands cited conditionally tended to be the same ones across different queries: it wasn’t randomness, it was a structural trait of how the AI perceived their authority.
Honest disclaimer: this is an indicative test, not a peer-reviewed study. Small sample, manual classification done by me, only two AI engines. A rigorous analysis requires professional tools, calibrated automatic classifiers, and samples on the order of thousands of queries. But the pattern emerges even across 50 answers, and it lets you understand where you stand.
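The percentages above are simple arithmetic, and you can reproduce them from your own spreadsheet column of labels. A minimal sketch, where `sentiment_distribution` is a hypothetical helper name and the list of labels is rebuilt from the counts of my test:

```python
from collections import Counter

def sentiment_distribution(labels):
    """Return the share of each label as a percentage of all mentions."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: round(100 * n / total, 1) for label, n in counts.items()}

# Counts from the 50-answer test described above.
labels = (
    ["positive"] * 18
    + ["neutral"] * 19
    + ["conditional"] * 11
    + ["negative"] * 2
)

dist = sentiment_distribution(labels)
print(dist)
# {'positive': 36.0, 'neutral': 38.0, 'conditional': 22.0, 'negative': 4.0}
```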
The mistakes I see most often
Counting mentions without reading them. Tools that say “you were cited 14 times this month” without telling you how. High number, flat conversions. The problem is sentiment, not frequency.
Mistaking neutral for positive. A dry description like “X has been producing burrata for 30 years” is not a recommendation. It’s a record-keeping fact. It doesn’t move a customer.
Ignoring conditional sentiment. “Seems appreciated” sounds positive to the Italian ear, but to a reader in the buying phase it’s a signal of uncertainty. The same goes for “could be a good choice”.
Running the test on self-referential queries. “Who is the best dairy in Andria” is not a real query. The real query is “where do I buy real burrata near Bari”: broader, more competitive, more revealing of your perceived positioning.
How to track sentiment over time
A single measurement says little. A quarterly trend says everything. I suggest an operational audit in three steps:
- Baseline: take 15-20 industry queries, run them on 3 AI engines (ChatGPT, Perplexity, Gemini), classify the sentiment of every mention that concerns you. Save everything in a spreadsheet.
- Competitor comparison: repeat the same exercise on the 3-5 competitors the AI cites most in your industry. Compare the distribution of positive vs conditional sentiment.
- Quarterly repetition: redo the same test every 90 days. What matters is the evolution, not the single data point.
If over the quarter positive sentiment grows and conditional declines, you’re building perceived authority. If conditional stays stable above 20%, you have a structural problem of weak signals that needs to be tackled from the foundations — starting with recognition as an entity and the entry in the Knowledge Graph.
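The quarterly reading above can be written down as a small rule, so that you apply it the same way every 90 days. A sketch under stated assumptions: `assess_trend` is a hypothetical name, the inputs are the percentage distributions of two consecutive quarters, and I interpret "stays stable" as "did not decline", which is my reading and not a formal definition.

```python
CONDITIONAL_THRESHOLD = 20.0  # percent, the rule of thumb from this article

def assess_trend(previous, current):
    """Compare two quarterly distributions (dicts mapping label -> percentage)."""
    pos_delta = current.get("positive", 0.0) - previous.get("positive", 0.0)
    cond_now = current.get("conditional", 0.0)
    cond_delta = cond_now - previous.get("conditional", 0.0)

    # Positive grows and conditional declines: perceived authority is building.
    if pos_delta > 0 and cond_delta < 0:
        return "building authority"
    # Conditional above the threshold and not declining (assumed meaning of "stable").
    if cond_now > CONDITIONAL_THRESHOLD and cond_delta >= 0:
        return "structural weak signals"
    return "inconclusive"

q1 = {"positive": 36.0, "conditional": 22.0}
q2 = {"positive": 42.0, "conditional": 15.0}
print(assess_trend(q1, q2))  # building authority
```

Running the same function on your competitors' distributions gives you the comparison from the second step with no extra work.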
Conditional sentiment is not a magic factor, and it does not resolve itself. It’s the symptom that the AI found fragmented information about you. The cure isn’t writing better: it’s giving consistent signals across multiple sources, so the AI can express itself with confidence.
Sentiment is the KPI that was missing from your AI dashboard
Measuring visibility in AI answers without reading the sentiment is like looking at revenue without looking at margins. You see the number, you don’t understand the value. In the next articles in the series I’ll explain how to track share of voice in AI answers, how to measure the evolution of citations over time, and how to distinguish between mentions that convert and mentions that vanish into the noise.
To get started, run the test on your brand with 10 queries from your industry. In 30 minutes you’ll understand whether you’re a benchmark brand, a brand cited with reservations, or a ghost brand. It’s the first real measure of your AI visibility.