Measuring AI visibility

Ask the AI to audit your AI: the self-audit nobody does

Roberto Serra · 25 June 2026 · ~7 min read

Most business owners have never directly asked ChatGPT what it knows about their own company, and those who do often discover information that is wrong, incomplete, or actually about a competitor with a similar name. In thirty seconds you can get a list of the gaps that keep the AI from citing you correctly. It's the fastest starting point there is, and almost nobody uses it.

Open ChatGPT and write two prompts in a row. First: “What do you know about brand X?”. Second: “Now assess your degree of certainty (confidence) in that information: where are you certain, where are you guessing, what’s missing for you to recommend it as a credible source?”.

The model answers both. And in the second reply it hands you — for free, in 30 seconds — the list of things your brand is missing to be cited in AI answers. A kind of free audit done by the very system that should be recommending you.

I call it the LLM-based self-audit. It’s a technique I’ve been using with clients for about a year and which, as obvious as it is, I see applied by very few Italian SMEs. In this article I’ll explain how it works, why it works, and where it stops working.

What it means to ask the AI to audit itself

A model like ChatGPT or Claude has no real awareness of what it does or doesn’t know. It does, however, have two things: a probability distribution over the words it generates, and training that pushes it to flag when an answer rests on solid sources and when it is interpolating.

When you ask it to “assess your confidence”, the model isn’t doing magic: it reviews its own output and produces a qualitative estimate of how solid it is. That is a self-assessment, not an instrumental measurement. Imperfect, but useful.

In the earlier articles of this series I talked about objective metrics: share of voice in AI answers, citation accuracy, query coverage rate. All things that measure what the AI does with your brand. The self-audit tells you something different and complementary: what the AI thinks it knows about your brand, and where it feels it has gaps.

Why it works (and why it isn’t magic)

The underlying principle is simple. The AI model is trained on billions of pages. If your brand appears with consistent signals — website, Wikipedia, third-party articles, author profiles, schema markup — the model builds a “dense” representation and recommends you with confidence. If instead it appears in a fragmented way or not at all, the model admits uncertainty.

The link is direct with things I covered in other articles: E-E-A-T for AI, author recognition as an entity, the weight of implicit citations. The AI self-audit is your operational mirror of those signals: it tells you what the model sees and what it doesn’t.

It’s not a peer-reviewed study, it’s a first-person diagnostic test, and it should be treated as such: excellent for getting your bearings, insufficient for certification. Real analysis requires professional tools and a baseline across multiple AI engines over time.

The test you can run in 10 minutes

Open ChatGPT (or Claude, or Perplexity — better to do it on all three for comparison). Use this prompt sequence on your brand. I’ll take as an example an accordion maker from Castelfidardo, because it’s a sector — that of artisanal musical instruments from the Marche region — where the difference between brands “seen” and “not seen” by the AI is enormous.

Prompt 1 (what you know): “What do you know about [Brand Name], an accordion maker in Castelfidardo?”

Prompt 2 (confidence): “On a scale from 1 to 10, how confident are you in the information you gave me? Where did you feel more solid and where were you making plausible but unverified inferences?”

Prompt 3 (what’s missing): “If you had to recommend an Italian accordion maker to a professional musician, would [Brand Name] make your shortlist? If not, what is it missing compared to the brands you would recommend instead?”

Prompt 4 (competitive comparison): “Compare it with [Competitor 1], [Competitor 2], [Competitor 3]. In what way are these three more ‘readable’ to you?”

The answers to prompts 3 and 4 are where everything happens. There the model tells you — literally — what you need to fix in order to enter its answers.
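
The four prompts above can be kept as reusable templates, so that every brand and every engine receives exactly the same wording. What follows is a minimal Python sketch, not a tool I am describing: the function name `build_prompts`, its parameters and the placeholder brand names are illustrative assumptions.

```python
# Reusable templates for the four-prompt sequence.
# All field names (brand, descriptor, category, audience, competitors)
# are hypothetical labels chosen for this sketch.
PROMPT_TEMPLATES = [
    "What do you know about {brand}, {descriptor}?",
    "On a scale from 1 to 10, how confident are you in the information "
    "you gave me? Where did you feel more solid and where were you "
    "making plausible but unverified inferences?",
    "If you had to recommend {category} to {audience}, would {brand} "
    "make your shortlist? If not, what is it missing compared to the "
    "brands you would recommend instead?",
    "Compare it with {competitors}. In what way are these brands more "
    "'readable' to you?",
]


def build_prompts(brand, descriptor, category, audience, competitors):
    """Return the four prompts, in order, filled in for one brand."""
    fields = {
        "brand": brand,
        "descriptor": descriptor,
        "category": category,
        "audience": audience,
        "competitors": ", ".join(competitors),
    }
    return [template.format(**fields) for template in PROMPT_TEMPLATES]


# Example with placeholder names (not real brands).
for prompt in build_prompts(
    brand="[Brand Name]",
    descriptor="an accordion maker in Castelfidardo",
    category="an Italian accordion maker",
    audience="a professional musician",
    competitors=["[Competitor 1]", "[Competitor 2]", "[Competitor 3]"],
):
    print(prompt)
```

Paste the prompts into each engine in the same order and in the same session, since prompts 2 to 4 refer back to the earlier answers.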

The test I ran myself

I applied this sequence to 12 musical instrument makers from the Marche region: 8 accordion makers from the Castelfidardo area and 4 violin-making workshops (violins and other bowed strings). It was an indicative test, not a study: the sample is small, but the pattern was clear.

Result: 9 brands out of 12 received a confidence score below 5/10 from the model. Of these 9, seven received exactly the same list of gaps from the AI: no Wikipedia page, no Wikidata profile, a website without a filled-in Organization schema, content authors not recognizable as entities, zero mentions in English-language industry publications.

The 3 brands above 7/10 had three things in common: a presence on Wikipedia (even a stub), coverage in international music magazines, and a founder or luthier named in indexed interviews. Nothing esoteric — exactly the authority signals I talked about in the articles on Google’s Knowledge Graph and backlinks as a citation proxy.

One thing that struck me: for the 4 violin makers, ChatGPT and Claude returned very different answers. Claude was more cautious, explicitly stating that it had no recent information on 3 of the 4 brands. ChatGPT, by contrast, invented plausible but false details in 2 cases (founding years, awards never won). Perplexity, which draws on the web in real time, was the most reliable but also the most “empty” on small brands: it simply couldn’t find them.

Operational lesson: the self-audit gives you value not when the answers are uniform, but when they diverge. The divergences tell you where each AI engine has different gaps — and where you need to act first.
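
One way to make those divergences visible is to list, for each engine, the gaps that only that engine mentioned. The Python sketch below assumes you have already transcribed each engine's answer by hand into short gap labels. The function name and the sample labels are hypothetical and are not results from my test.

```python
def engine_specific_gaps(gaps_by_engine):
    """Map each engine to the gaps that no other engine cited."""
    result = {}
    for engine, gaps in gaps_by_engine.items():
        # Collect every gap cited by the other engines.
        cited_elsewhere = set()
        for other_engine, other_gaps in gaps_by_engine.items():
            if other_engine != engine:
                cited_elsewhere.update(other_gaps)
        result[engine] = sorted(set(gaps) - cited_elsewhere)
    return result


# Hypothetical, hand-transcribed labels for one brand.
sample = {
    "ChatGPT": ["no Wikipedia page", "no Organization schema"],
    "Claude": ["no Wikipedia page", "no recent information"],
    "Perplexity": ["no Wikipedia page", "no indexed interviews"],
}
print(engine_specific_gaps(sample))
```

Gaps that appear for a single engine point to a source that engine relies on and the others do not, while gaps shared by all engines are the ones to fix first.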

Limit of the test: the model can get its own self-assessment wrong. But when it’s wrong, it’s wrong consistently — and for your operational audit that’s more than fine.

The mistakes I see most often

Running the test only once. AI answers vary from session to session. Repeat the test 3 times on different days and take a qualitative average. A single run isn’t significant.
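
If you want a number next to the qualitative average, a simple option is the median of the scores per engine plus the spread between the lowest and highest run. This is a sketch under my own assumptions: the function name, the output keys and the sample scores are made up for illustration, and using the median rather than the mean is my choice here, so that one odd session weighs less.

```python
from statistics import median


def summarize_runs(scores_by_engine, min_runs=3):
    """Summarize repeated confidence scores (1 to 10) for each engine."""
    summary = {}
    for engine, scores in scores_by_engine.items():
        summary[engine] = {
            "median": median(scores),
            "spread": max(scores) - min(scores),
            # Flag engines that were tested fewer than min_runs times.
            "enough_runs": len(scores) >= min_runs,
        }
    return summary


# Hypothetical scores from three sessions on different days.
print(summarize_runs({"ChatGPT": [4, 6, 5], "Claude": [3, 3]}))
```

A large spread for one engine is itself a finding: it suggests the model's picture of the brand is unstable, not just weak.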

Testing only on ChatGPT. Each AI engine has different training and sources. A brand can be visible to Perplexity (which draws from the web in real time) and invisible to Claude (which has a stricter knowledge cutoff). Test on at least three engines.

Accepting the first answer as gospel. If the model says “you’re missing a presence on Wikipedia”, it doesn’t mean you should rush to create a Wikipedia entry. It means you need to understand why you don’t have equivalent signals that make you readable. The Wikipedia entry is one of the ways, not the only one.

Asking without naming the competitors. The AI reveals far more when it can compare. Prompt 4 in the sequence above is the one that produces the most actionable indications. Skipping it halves the value of the test.

Not keeping the answers. If you don’t save screenshots or copy-paste the answers of each test, in six months you’ll have no way to measure progress. Create a Google sheet with columns: date, AI engine, reliability score (confidence), gaps cited, prompt used. That way the self-audit becomes a time series, not an anecdote.
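
If you prefer a local file to a shared spreadsheet, the same five columns can be logged with a few lines of Python. This sketch swaps the Google sheet for a plain CSV file. The column identifiers, the function name `append_audit_row` and the file name in the usage comment are assumptions of mine.

```python
import csv
from datetime import date
from pathlib import Path

# Same five columns as the sheet described above.
COLUMNS = ["date", "ai_engine", "confidence_score", "gaps_cited", "prompt_used"]


def append_audit_row(path, engine, score, gaps, prompt, run_date=None):
    """Append one self-audit result to a CSV file, creating it if needed."""
    path = Path(path)
    # Write the header only when the file is new or empty.
    is_new = not path.exists() or path.stat().st_size == 0
    with path.open("a", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=COLUMNS)
        if is_new:
            writer.writeheader()
        writer.writerow({
            "date": (run_date or date.today()).isoformat(),
            "ai_engine": engine,
            "confidence_score": score,
            # Gaps are stored in one cell, separated by semicolons.
            "gaps_cited": "; ".join(gaps),
            "prompt_used": prompt,
        })


# Usage (hypothetical file name and values):
# append_audit_row("self_audit_log.csv", "ChatGPT", 4,
#                  ["no Wikipedia page", "no Wikidata profile"], "Prompt 2")
```

The resulting file opens directly in Google Sheets or Excel, so you can still chart the score over time.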

What to do concretely after the self-audit

Extract from the AI answer the 3 gaps cited most frequently across the three engines
Compare yourself with the 3-5 competitors the model cites first in your sector: what do they have that you don’t?
For each gap, decide whether it’s structural (it needs entity-building work, author recognition, schema) or coverage-related (it needs content)
Repeat the test quarterly with the same prompt sequence: keep track of the reliability score over time
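
The first step in the list above, finding the gaps cited most often across engines, can be sketched in Python as follows. It assumes the gaps were transcribed by hand into short labels. The function name and the sample labels are hypothetical, and each gap is counted at most once per engine so that a talkative engine does not dominate.

```python
from collections import Counter


def top_gaps(gaps_by_engine, n=3):
    """Return the n gaps cited by the most engines, as (gap, count) pairs."""
    counts = Counter()
    for gaps in gaps_by_engine.values():
        # Normalize labels and count each gap once per engine.
        counts.update({gap.strip().lower() for gap in gaps})
    # Sort by count (descending), then alphabetically for a stable order.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:n]


# Hypothetical labels for one brand across three engines.
sample = {
    "ChatGPT": ["No Wikipedia page", "No Wikidata profile"],
    "Claude": ["no wikipedia page", "No Organization schema"],
    "Perplexity": ["No Wikipedia page", "No Wikidata profile"],
}
print(top_gaps(sample))
```

Deciding whether each gap is structural or coverage-related remains a manual judgment, since that depends on what the brand already has in place.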

The goal isn’t to get the AI to say “I always recommend you”. The goal is to raise the confidence score from below 5 to above 7 over the next 6-9 months. When you get there, citations in AI answers start to appear on their own.

Where this goes from here

The self-audit is a diagnostic tool, not a continuous metric over time. To truly measure your visibility in AI answers you need objective and repeatable metrics: in the next articles of this series I’ll talk about the monthly AI visibility scorecard, the competitive comparison matrix and how to run an AI audit of competitors to build the baseline to start from.

The self-audit remains the first thing I do when a new client asks me “why do I never show up in AI answers?”. In 10 minutes the AI itself answers the question. It’s an honest and free starting point — then the real work begins.

The author

Senate of the Republic · Palazzo Giustiniani Conference “The power of artificial intelligence”

Roberto Serra

SEO consultant for over 15 years, founder of the Serra SEO Agency (RAANK). He helps multinationals and SMEs stay visible where search is moving: ChatGPT, Perplexity, Gemini and Google's AI Overviews.
