Measuring AI visibility

AI platform visibility: why an aggregate average leads your investment astray

Roberto Serra · 25 June 2026 · ~7 min read

Say you look at the aggregate figure for your AI visibility, see 26% and conclude you're doing fine. That number can hide the fact that on ChatGPT you're at 8% while on Perplexity you're at 45%, and your customers mostly use ChatGPT. Averaging across different platforms is like averaging January revenue with July revenue: you get a number that tells you nothing useful. Breaking the data down by platform takes less than an hour and turns your strategy from generic into surgical.

I tracked an Italian B2B brand across five AI engines for four consecutive months. The citation rates at the end of the period were: ChatGPT 45%, Perplexity 28%, Gemini 12%, Claude 8%, Copilot 7%. These rates shift every month, sometimes by as much as 10 points. If you stop at the aggregate number (“the brand shows up in 20% of AI queries”), you’re looking at a blurry snapshot that hides the very information you need to decide your budget.
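To make the arithmetic behind that 20% explicit, here is a minimal Python sketch. It uses only the five end-of-period rates quoted above; the variable names are mine, and the simple mean is a fair aggregate only on the assumption that the same query set was run on every platform.

```python
from statistics import mean

# End-of-period citation rates quoted above, in percent.
rates = {"ChatGPT": 45, "Perplexity": 28, "Gemini": 12, "Claude": 8, "Copilot": 7}

# The aggregate figure: a plain mean across the five platforms.
aggregate = mean(rates.values())

# How far each platform sits from that aggregate, in percentage points.
deviation = {platform: rate - aggregate for platform, rate in rates.items()}

print(f"aggregate: {aggregate}%")
for platform, points in deviation.items():
    print(f"{platform}: {points:+} points vs aggregate")
```

No platform actually sits at the aggregate: the deviations run from +25 points (ChatGPT) to -13 (Copilot).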

Let me explain why per-platform visibility is the metric that sits upstream of every serious investment in GEO, and how to measure it without thousand-euro-a-month tools.

What it means to be visible “per platform”

Every AI engine is a different system. ChatGPT uses a set of sources and a way of synthesizing them that isn’t the same as Perplexity’s, which in turn isn’t Gemini’s. The crawlers differ, the weight given to explicit citations differs, the underlying indexes differ (some pull from Bing, others from Google, others from a mix).

In research on retrieval and grounding systems, the principle is by now well documented: two retrievers trained on different objectives, or two models with different system prompts, return source sets that overlap only partially, often by less than 50%. It follows that your visibility is not a single quantity. It’s five quantities, one per platform, and they must be measured separately.

Translated into practice: you can be cited in 70% of the relevant queries on Perplexity and 10% on ChatGPT. The average says 40%. That 40% doesn’t exist in any operational reality — it’s a number that doesn’t tell you where to act.

Why it sits upstream of every other metric

In my previous articles I told you about metrics like AI Share of Voice and Query Coverage Rate. They’re useful KPIs, but if you calculate them on an aggregate pool of AI engines you lose the most important signal: the variance between platforms.

Variance is the figure that tells you where your strategy works and where it doesn’t. If you appear well on Perplexity but poorly on ChatGPT, you’re probably strong on backlinks from citation-worthy sites but weak on author entity recognition — ChatGPT weights certain signals more heavily, Perplexity others.

Without this breakdown, you spend your GEO budget at random.

Common mistake

The classic one: “I monitor myself on Perplexity because that’s the one I use.”

The test you can run in 45 minutes

You need three things: a spreadsheet, a list of 10-15 queries that one of your potential customers would type, and 45 minutes of your time.

Take the queries one by one and run them on all five platforms: ChatGPT, Perplexity, Gemini, Claude, Copilot. For each platform, note in one row:

Cited by name? yes/no
Cited with an active link? yes/no
Position in the answer: high / medium / low / not present

At the end you have a matrix of up to 15 queries × 5 platforms. Compute the citation percentage for each column, that is, for each platform. That’s your real baseline.
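If you export the spreadsheet as rows, the per-platform count can be sketched in a few lines of Python. The row layout (query, platform, cited by name) and the function name `citation_rates` are assumptions of mine, and the sample rows are invented purely to show the calculation.

```python
from collections import defaultdict

PLATFORMS = ["ChatGPT", "Perplexity", "Gemini", "Claude", "Copilot"]

def citation_rates(rows):
    """rows: iterable of (query, platform, cited_by_name) -> {platform: percent cited}."""
    cited = defaultdict(int)
    total = defaultdict(int)
    for _query, platform, cited_by_name in rows:
        total[platform] += 1
        cited[platform] += bool(cited_by_name)
    # Only report platforms that were actually tested.
    return {p: 100 * cited[p] / total[p] for p in PLATFORMS if total[p]}

# Invented sample: four queries run on two of the five platforms.
rows = [
    ("q1", "ChatGPT", False), ("q1", "Perplexity", True),
    ("q2", "ChatGPT", False), ("q2", "Perplexity", True),
    ("q3", "ChatGPT", True),  ("q3", "Perplexity", False),
    ("q4", "ChatGPT", False), ("q4", "Perplexity", True),
]
rates = citation_rates(rows)
print(rates)
```

The same function works for the "active link" column: pass that yes/no value in place of the name citation.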

A simple pass/fail threshold: if the difference between the best platform and the worst exceeds 20 percentage points, you have a consistency problem. It means one channel is carrying the load and the others aren’t, and the day that channel changes its algorithm, you lose visibility all at once.
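The 20-point rule is easy to automate once you have per-platform rates. This is a sketch only: the function name `consistency_check` and its return format are hypothetical, and the sample values are the 70% and 10% from the earlier Perplexity/ChatGPT example.

```python
def consistency_check(rates, threshold=20):
    """Flag a consistency problem when best minus worst exceeds the threshold (in points)."""
    best = max(rates, key=rates.get)
    worst = min(rates, key=rates.get)
    gap = rates[best] - rates[worst]
    return {"best": best, "worst": worst, "gap": gap, "problem": gap > threshold}

result = consistency_check({"Perplexity": 70, "ChatGPT": 10})
print(result)
```

A gap of exactly 20 points does not trigger the flag, because the rule says "exceeds".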

To track queries over time you may find Google Search Console useful (to see whether Gemini/AI Overview is pulling you in) and, for an external check of the brand signals that AI engines use, Google Trends. The real analysis, done consistently, requires professional tools dedicated to AI monitoring — these tests are an honest first step, not a substitute.

Pro tip

Set your GEO budget based on your weakest platform, not the average.

The test I ran: four months, one brand, five platforms

I opened the article with the numbers; now let me explain the method so you can judge them.

The brand is a manufacturer from the Mirandola biomedical district, in the province of Modena: one of those SMEs that export medical technology to half the world but that, on ChatGPT, barely showed up at the start of tracking. I built a set of 22 queries consistent with their typical customer (hospital purchasing managers, European distributors, biomedical engineers). I ran the same 22 queries every four weeks for four months, across the five platforms, always on the same day and time slot, in non-logged-in sessions.

The result, at the end of the period, is what you read above: ChatGPT 45%, Perplexity 28%, Gemini 12%, Claude 8%, Copilot 7%. But the interesting figure isn’t the final number — it’s the month-on-month variation. Claude went from 2% to 14% and then dropped to 8%. Gemini stayed flat. Perplexity swung between 22% and 34%.
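Variation over time on one platform can be read with two small helpers: the change between successive readings, and the overall swing. The sketch below uses only the figures quoted in the paragraph above. The three Claude readings are treated as successive only for illustration, since the full monthly series is not given here, and the function names are mine.

```python
def deltas(readings):
    """Change in percentage points between each reading and the next."""
    return [later - earlier for earlier, later in zip(readings, readings[1:])]

def swing(readings):
    """Distance in percentage points between the highest and lowest reading."""
    return max(readings) - min(readings)

claude = [2, 14, 8]          # the three readings quoted above
perplexity_bounds = [22, 34] # the low and high quoted above

print(deltas(claude))            # change between quoted readings
print(swing(claude))             # overall swing for Claude
print(swing(perplexity_bounds))  # overall swing for Perplexity
```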

Stated limits: a single brand, 22 queries, four months. It’s an indicative test, not a study. But the pattern of “large variance between platforms, large variance over time on the same platform” I’ve seen replicated with other clients in very different sectors (automotive component manufacturers in the Brescia area, wineries in Montefalco in Umbria, boutique hotels in Ogliastra). It’s not an isolated case.

The most common mistakes

When a company starts measuring AI visibility, the wrong patterns are always the same.

Using a single platform as a proxy for all of them. The classic: “I monitor myself on Perplexity because that’s the one I use.” Perplexity cites sources explicitly and is easy to read, but its real market share is a fraction of ChatGPT’s. You’re looking at the wrong platform for your audience.

Measuring once and stopping. Temporal variance on a single platform is high. A one-off test gives you a snapshot that in six weeks could be contradicted. You need a monthly cadence, at minimum.

Ignoring Copilot because “hardly anyone uses it”. In B2B, Copilot is actually far more present than you’d expect, because it’s built into Microsoft 365. If your potential customer works at a large company that uses Teams and Outlook, Copilot is the first AI they encounter.

Confusing “cited” with “cited well”. Being mentioned in the fifth paragraph of a long answer is worth far less than being in the first. A matrix without the “position” field loses half the signal.

Using queries that are too generic. “Best biomedical manufacturers” will always give you big names and very little signal for an SME. The queries that matter are the mid-to-bottom funnel ones: “suppliers of single-use heart valves Mirandola district,” “who makes ECMO catheters in Italy,” “alternatives to [leading competitor] for dialysis.” That’s where the real traffic that converts is, and that’s where the variance between platforms becomes visible and actionable.

Treating Claude as negligible. Claude has a small share but a high-value user base: technicians, lawyers, doctors, researchers looking for reliable answers. In specialized B2B it’s worth far more than the 8% that appears in the raw numbers. If you sell to professional clients, leaving it out of the matrix is a reading error, not a math one.
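On the “cited” versus “cited well” point, one way to keep the position field from getting lost is to weight each citation by where it appears in the answer. The weights below are hypothetical placeholders, not values from my tracking; choose your own and keep them fixed from month to month so the series stays comparable.

```python
# Hypothetical weights for the "position in the answer" field.
POSITION_WEIGHT = {"high": 1.0, "medium": 0.5, "low": 0.25, "not present": 0.0}

def raw_presence(positions):
    """Percent of queries where the brand appears at all."""
    return 100 * sum(p != "not present" for p in positions) / len(positions)

def weighted_presence(positions):
    """Percent score where each appearance counts by its position weight."""
    return 100 * sum(POSITION_WEIGHT[p] for p in positions) / len(positions)

# Invented sample: one platform, four queries.
positions = ["high", "low", "not present", "medium"]
print(raw_presence(positions), weighted_presence(positions))
```

In this invented sample the brand is present in 75% of answers, but the weighted score is 43.75: that difference is the signal a matrix without the position field would miss.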

What can you do right now?

Define 15-22 queries representative of your ideal customer. Not the classic SEO keywords: real questions, long, with comparative or shortlist intent.
Build a query × platform × month matrix. Update it every 30 days, the same day of the month.
For each platform, measure three things: presence, active link, position in the answer.
Compare against the 3-5 competitors the AI cites in your sector. If on ChatGPT you’re at 10% and the top competitor is at 60%, you have a clear direction for where to work.
Set your GEO budget based on your weakest platform, not the average. If Gemini is your gap, work on the signals Gemini weights most (presence in the Knowledge Graph, organization schema, mentions in sources already indexed by Google).
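The steps above can be sketched as one record type per observation plus a gap calculation against the best-cited competitor on each platform. Everything here is illustrative: the `Observation` fields, the helper names, the brand labels and the month string are assumptions of mine, and the sample data simply reproduces the 10% versus 60% ChatGPT case mentioned above.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass(frozen=True)
class Observation:
    month: str      # e.g. "2026-03" (hypothetical label)
    query: str
    platform: str
    brand: str      # your brand or a competitor
    cited: bool
    linked: bool
    position: str   # "high" | "medium" | "low" | "not present"

def rates_by_brand(observations, month):
    """{(platform, brand): percent of queries where the brand was cited} for one month."""
    hits, totals = defaultdict(int), defaultdict(int)
    for o in observations:
        if o.month == month:
            totals[(o.platform, o.brand)] += 1
            hits[(o.platform, o.brand)] += o.cited
    return {key: 100 * hits[key] / totals[key] for key in totals}

def gap_to_leader(rates, me):
    """{platform: points behind the best competitor}; a negative value means you lead."""
    gaps = {}
    for platform in {p for p, _ in rates}:
        rivals = [r for (p, b), r in rates.items() if p == platform and b != me]
        if rivals and (platform, me) in rates:
            gaps[platform] = max(rivals) - rates[(platform, me)]
    return gaps

def sample(platform, brand, n_cited, n_queries=10):
    """Invented data: the brand is cited in the first n_cited of n_queries."""
    return [Observation("2026-03", f"q{i}", platform, brand, i < n_cited, False,
                        "high" if i < n_cited else "not present")
            for i in range(n_queries)]

observations = (sample("ChatGPT", "me", 1) + sample("ChatGPT", "rival", 6)
                + sample("Perplexity", "me", 5) + sample("Perplexity", "rival", 4))
gaps = gap_to_leader(rates_by_brand(observations, "2026-03"), me="me")
weakest = max(gaps, key=gaps.get)
print(gaps, weakest)
```

The platform with the largest gap is where the budget conversation starts; here that is ChatGPT, 50 points behind the leader.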

Where all this connects

Visibility in AI answers is not a single metric. It’s a five-dimensional vector, and each dimension tells you a different story about your digital presence. Measuring it in aggregate is like measuring the fever of five people by taking the average: the number exists, but it doesn’t help you treat anyone.

In the following articles in this series I go into the detail of metrics specific to each platform: how to read AI Overview data in Search Console, how to build an AI tracking dashboard that breaks down by engine, and how to turn the visibility matrix into a monthly scorecard you can share with your team in two minutes.

The operational question to take away is just one: do you really know where the AI cites you, or are you looking at an average?


The author

Roberto Serra at the Senate of the Republic

Senate of the Republic · Palazzo Giustiniani Conference “The power of artificial intelligence”

Roberto Serra

SEO consultant for over 15 years, founder of the Serra SEO Agency (RAANK). He helps multinationals and SMEs stay visible where search is moving: ChatGPT, Perplexity, Gemini and Google's AI Overviews.
