Measuring AI visibility

Monthly AI Visibility Scorecard: one page, six numbers, three months of trend

Roberto Serra · 25 June 2026 · ~8 min read

If your sales manager receives twenty AI visibility metrics every month, they stop reading them after the second report — and you lose your main ally for defending the budget. A scorecard that really works is one page, six numbers and a comparison with the previous three months: enough to make decisions, little enough to read in two minutes. The right format turns an ignored report into a tool that gets used every month.

I remember when the first monthly SEO reports became standard between 2010 and 2012. At first they were improvised Excel sheets with Google rankings and a few Analytics charts; then they became the document the marketing manager brought to the meeting with the owner to justify the budget. Today the monthly AI visibility scorecard is at exactly the same point: necessary, but very few people produce one.

Let me explain why it pays to start now, and I’ll give you the template I use with clients — one page, six numbers, quarterly comparison. No baroque dashboards, no twenty KPIs that nobody looks at.

The problem you solve with a monthly report (and not with a live dashboard)

Without a monthly report with numbers, your visibility in AI answers is an opinion. You tell the owner “we’re improving on ChatGPT,” they ask you “by how much?” and silence falls. With six numbers tracked over time, on the other hand, you have a business metric: comparable month over month, defensible in a meeting, and measurable against competitors.

Research on business performance measurement has held to one principle for decades: what isn’t measured recurrently, with the same methodology, can’t be managed. Applied to your case, this means the monthly frequency is not an aesthetic choice: it’s the minimum frequency for separating noise (one-week fluctuations) from signal (a three-month trend).

It also follows that “real-time” dashboards on AI visibility, today, are almost always counterproductive for an SME: they make you check the thermometer every five minutes instead of treating the patient.

What a monthly AI scorecard is (operational definition)

The monthly scorecard is a one-page document, generated on the same day of every month (e.g. the 5th), that reports six indicators on your presence in the answers of generative AI engines (ChatGPT, Claude, Gemini, Perplexity), with a comparison vs the previous month and a quarterly trend.

One page because the owner reads it in two minutes. Six numbers because beyond seven nobody remembers anything. Same day of the month because only this way do the comparisons make sense.

In the previous articles in this series I explained how to measure the individual dimensions — AI share of voice, citation accuracy, sentiment. The scorecard is the point where those separate measures become a decision-making document.

The six numbers I put in the scorecard (and why only these)

The six KPIs I use are the bare minimum to give the owner an actionable view:

AI Share of Voice: on a fixed set of 30-50 queries in your sector, the percentage of queries in which you appear (at least one citation) across the four main AI engines.
Coverage rate: of those four engines (ChatGPT, Claude, Gemini, Perplexity), the percentage on which you appear at least once in the month (4/4 = 100%).
Citation accuracy: when the AI cites you, how often the facts reported about your brand are correct (name, location, products, year founded), on a 0-100% scale.
Sentiment score: tone of the AI mentions about your brand on a -1/+1 scale, flagging any slip below zero.
Referral traffic from AI: sessions arriving from chat.openai.com (or chatgpt.com, ChatGPT's current domain), perplexity.ai, gemini.google.com, claude.ai (Google Analytics 4 with a referral source filter).
Competitor comparison: the same AI Share of Voice calculated for the three most relevant direct competitors, in a side-by-side table.
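To make the first two definitions concrete, here is a minimal sketch of how they could be computed from a month of manual checks. It assumes you record one yes/no observation per query and per engine; the `hits` dictionary, the function names, and the reading of AI Share of Voice as "cited on at least one engine" are my illustrative assumptions, not a standard.

```python
ENGINES = ("ChatGPT", "Claude", "Gemini", "Perplexity")


def share_of_voice(hits, queries):
    """Percentage of queries where the brand is cited on at least one engine."""
    cited = {query for (query, _engine), found in hits.items() if found}
    return 100.0 * len(cited & set(queries)) / len(queries)


def coverage_rate(hits, engines=ENGINES):
    """Percentage of engines that cite the brand at least once in the month."""
    covering = {engine for (_query, engine), found in hits.items() if found}
    return 100.0 * len(covering & set(engines)) / len(engines)


# Hypothetical month: 25 queries, brand cited on 3 of them, only by Perplexity.
queries = [f"query {i}" for i in range(1, 26)]
hits = {(q, e): False for q in queries for e in ENGINES}
for q in queries[:3]:
    hits[(q, "Perplexity")] = True

print(share_of_voice(hits, queries))  # 12.0
print(coverage_rate(hits))            # 25.0
```

Running the same functions on the observations collected for each competitor gives the side-by-side comparison, as long as the query set is identical.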

Everything else (average citation length, position in the answer, query type) belongs in the internal technical report, not in the scorecard. Whoever signs off on the budget doesn’t look at it, and you risk hiding the signal inside the noise.

Case study: the scorecard I built for a ceramic slab manufacturer in Sassuolo (MO)

At the start of January 2026 I began keeping a monthly scorecard for a manufacturer of porcelain stoneware slabs in the Sassuolo ceramic district, in the province of Modena. Revenue around 28 million, selling to architecture firms, general contractors and building-materials resellers in Italy, Germany and France.

The initial problem was classic for the district: investments in trade fairs (Cersaie, Coverings), a carefully produced print catalog, a site translated into five languages, but nobody knew whether, when an architect asked ChatGPT “best Italian manufacturers of large-format ceramic slabs for ventilated facades”, the company showed up or not.

Month 1 (January) — baseline:

AI Share of Voice: 12% (3 queries out of 25)
Coverage rate: 25% (1 engine out of 4 — only Perplexity)
Citation accuracy: 60% (recurring error: wrong year founded)
Sentiment: +0.2 (neutral leaning positive)
AI referral traffic: 18 sessions
Competitors: the district’s three main players were respectively at 48%, 36%, 28% AI Share of Voice

The number that struck the owner: the first competitor appeared four times as often as they did (48% against 12%). It wasn’t an opinion, it was a row in a table.

Month 3 (March) — after work on the Wikidata entry, correcting the year founded across all professional profiles, rewriting the “about us” page with an inverted pyramid, one product sheet published in FAQ format:

AI Share of Voice: 28%
Coverage rate: 75% (3 engines out of 4)
Citation accuracy: 95%
Sentiment: +0.4
AI referral traffic: 87 sessions
Competitors: 50% / 40% / 30%

The most relevant delta for the owner wasn’t the AI Share of Voice but the referral traffic: going from 18 to 87 monthly sessions from AI engines made the whole thing concrete within three minutes of the meeting.
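The arithmetic behind those deltas is worth writing down once, because percentages and percentage points get confused easily in a meeting. The sketch below uses the January and March figures from the case; the function names and dictionary layout are illustrative assumptions.

```python
january = {"AI Share of Voice": 12, "Coverage rate": 25, "Citation accuracy": 60,
           "Sentiment": 0.2, "AI referral sessions": 18}
march = {"AI Share of Voice": 28, "Coverage rate": 75, "Citation accuracy": 95,
         "Sentiment": 0.4, "AI referral sessions": 87}


def point_deltas(before, after):
    """Absolute change per KPI (percentage points for the percentage KPIs)."""
    return {kpi: round(after[kpi] - before[kpi], 2) for kpi in before}


def relative_change(before, after):
    """Relative change in percent, rounded to one decimal."""
    return round(100.0 * (after - before) / before, 1)


def gap_to_leader(own, competitors):
    """Distance in percentage points from the best competitor."""
    return max(competitors) - own


deltas = point_deltas(january, march)
print(deltas["AI Share of Voice"])     # 16 (percentage points)
print(deltas["AI referral sessions"])  # 69 (sessions)
print(relative_change(18, 87))         # 383.3 (percent)
print(gap_to_leader(12, [48, 36, 28]))  # 36 in January
print(gap_to_leader(28, [50, 40, 30]))  # 22 in March
```

The gap to the leader shrinking from 36 to 22 points is the kind of single line that fits on the page next to the raw numbers.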

Honest limitation of the case: a single client, three months, one vertical district. It’s not a study. It’s an indicative pattern I see repeating across a dozen manufacturing companies I have followed over the last six months: the monthly scorecard shifts the internal conversation from “should we do something with AI?” to “which of the six numbers do we want to move over the next three months?”.

The template you can build in 90 minutes

You don’t need a paid tool to get started. For the first 6-12 weeks you can build the scorecard from four free components:

Google Sheets spreadsheet with six rows (the six KPIs) and twelve columns (the months of the year).
Set of 30 queries in your sector, frozen and never modified (if you change the queries month over month, the comparisons are worthless).
Google Analytics 4 filtered on referral sources chat.openai.com (or chatgpt.com, ChatGPT's current domain), perplexity.ai, gemini.google.com, claude.ai for the traffic figure (Google Search Console, https://search.google.com/search-console, helps you on the organic side, but the AI referral lives in GA4).
Competitor sheet: the same 30 queries re-run searching for the three competitors, with a manual count.
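If you export the referral rows from GA4 to a spreadsheet, the traffic figure can be tallied with a few lines. This is a sketch under assumptions: the `(referrer, sessions)` row format and the function names are hypothetical, and no GA4 API is called. The `chatgpt.com` entry is my addition, since ChatGPT is now served from that domain.

```python
from urllib.parse import urlparse

# Referrer hostnames mapped to the engine they belong to.
AI_REFERRERS = {
    "chat.openai.com": "ChatGPT",
    "chatgpt.com": "ChatGPT",  # assumption: added for ChatGPT's current domain
    "perplexity.ai": "Perplexity",
    "gemini.google.com": "Gemini",
    "claude.ai": "Claude",
}


def engine_for(referrer):
    """Return the AI engine for a referrer URL or bare hostname, else None."""
    target = referrer if "//" in referrer else "//" + referrer
    host = urlparse(target).hostname or ""
    if host.startswith("www."):
        host = host[4:]
    return AI_REFERRERS.get(host)


def count_ai_sessions(rows):
    """Sum sessions per engine from (referrer, sessions) rows; skip other sources."""
    totals = {}
    for referrer, sessions in rows:
        engine = engine_for(referrer)
        if engine is not None:
            totals[engine] = totals.get(engine, 0) + sessions
    return totals


# Hypothetical export rows.
rows = [
    ("https://www.perplexity.ai/", 10),
    ("chat.openai.com", 5),
    ("google.com", 100),
    ("claude.ai", 3),
]
print(count_ai_sessions(rows))
print(sum(count_ai_sessions(rows).values()))  # 18
```

The exact-hostname match is deliberate: it keeps google.com traffic out of the Gemini count.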

Binary thresholds I use to read the scorecard without ambiguity:

AI Share of Voice below 20%: you’re invisible, priority is entity foundation.
Coverage rate below 2 engines out of 4: you have a cross-platform authority problem, not a content one.
Citation accuracy below 80%: the problem is the knowledge graph, not SEO.
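The three thresholds above translate directly into a check you can run on each month's column. A minimal sketch, with a hypothetical function name and with the diagnosis labels shortened from the text:

```python
def read_scorecard(share_of_voice, engines_covered, citation_accuracy):
    """Return the list of priorities triggered by the three binary thresholds."""
    priorities = []
    if share_of_voice < 20:      # percent
        priorities.append("entity foundation")
    if engines_covered < 2:      # engines out of 4
        priorities.append("cross-platform authority")
    if citation_accuracy < 80:   # percent
        priorities.append("knowledge graph")
    return priorities


# January baseline from the case study: every threshold is breached.
print(read_scorecard(12, 1, 60))
# ['entity foundation', 'cross-platform authority', 'knowledge graph']

# March: no threshold is breached.
print(read_scorecard(28, 3, 95))  # []
```

An empty list doesn't mean the work is done; it only means none of the three alarms is firing.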

The real analysis, when you reach the sixth month and want to understand why one KPI moves and another doesn’t, requires professional tools and a wider query set. The free template is the first step, not the destination.

The mistakes I most often spot in homemade scorecards

Changing the queries every month. If this month you add five new queries and remove three, the delta vs last month means nothing. The 30 queries get frozen for a minimum of twelve months.
Measuring a single AI engine. Only ChatGPT, or only Perplexity. The coverage rate exists precisely for this: the spread across four engines is information, not a detail.
Putting in twenty KPIs to “be thorough”. Whoever should read it stops at the second row and the scorecard becomes a PDF nobody opens. Six numbers, one page.
Not comparing with competitors. Your 28% AI Share of Voice says nothing until you know the leader in your sector is at 50% and the third is at 30%.
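The first mistake is the easiest to catch mechanically. One possible approach, sketched here as an assumption rather than a required step, is to store a short fingerprint of the frozen query set next to each month's numbers: if the fingerprint changes, the delta vs last month is not comparable. The function name and the normalization rules are hypothetical.

```python
import hashlib


def query_set_fingerprint(queries):
    """Short hash of the query set, ignoring order, case and extra whitespace."""
    normalized = sorted(" ".join(q.lower().split()) for q in queries)
    digest = hashlib.sha256("\n".join(normalized).encode("utf-8")).hexdigest()
    return digest[:12]


# Hypothetical query sets for two consecutive months.
frozen = [
    "best Italian manufacturers of large-format ceramic slabs",
    "porcelain stoneware slabs for ventilated facades",
]
reordered = [
    "Porcelain stoneware slabs for  ventilated facades",
    "best Italian manufacturers of large-format ceramic slabs",
]
changed = frozen[:1] + ["ceramic slab suppliers in Germany"]

print(query_set_fingerprint(frozen) == query_set_fingerprint(reordered))  # True
print(query_set_fingerprint(frozen) == query_set_fingerprint(changed))    # False
```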

What to do in the next 30 days

Define together with whoever does the selling the 30 real queries a customer would ask ChatGPT to reach a supplier like you.
Open a Google Sheet with six rows (the KPIs) and twelve columns (the months).
Run the 30 queries on the four AI engines today and populate the “month 1” column.
Identify the three direct competitors and repeat the same 30 queries for them.
Set the fixed day of the month on which you’ll redo the exercise (e.g. the 5th).
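If you prefer to start from a file rather than a blank sheet, the six-by-twelve grid can be generated as CSV and imported into Google Sheets. A minimal sketch: the KPI labels follow the article, while the function names and the idea of storing the competitor comparison as a single cell are my assumptions.

```python
import csv
import io

KPIS = [
    "AI Share of Voice",
    "Coverage rate",
    "Citation accuracy",
    "Sentiment score",
    "Referral traffic from AI",
    "Competitor comparison",
]
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def empty_scorecard():
    """Six KPI rows, each with twelve empty month cells."""
    return {kpi: [""] * len(MONTHS) for kpi in KPIS}


def to_csv(scorecard):
    """Serialize the scorecard as CSV text with a header row of months."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["KPI"] + MONTHS)
    for kpi in KPIS:
        writer.writerow([kpi] + scorecard[kpi])
    return buffer.getvalue()


# Populate the "month 1" column with the baseline from the case study.
sheet = empty_scorecard()
sheet["AI Share of Voice"][0] = "12%"
sheet["Coverage rate"][0] = "25%"
sheet["Competitor comparison"][0] = "48% / 36% / 28%"
print(to_csv(sheet))
```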

Three months of discipline and you’ll have your first quarterly trend that’s defensible in a meeting with whoever signs off on the budget.

From the scorecard to the AI reporting system

The monthly scorecard is the single document that translates all the work on visibility in AI answers into a language understandable by whoever signs off on the budget. Without it, every intervention (Wikidata, FAQ, inverted pyramid, author entity recognition, work on the Google Knowledge Graph) remains an act of faith.

In the previous articles in this series I explained how to structure content so the AI understands it — see inverted pyramid and E-E-A-T for AI. In the next articles in this series on measurement we’ll get into the operational detail of the six KPIs: how to calculate AI Share of Voice on a query set, how to automate citation accuracy monitoring, and how to build the competitor dashboard without paid tools.