Most business owners have never asked ChatGPT directly what it knows about their own company. Those who do often discover wrong or incomplete information, or details that belong to a competitor with a similar name. In thirty seconds you can get a list of the gaps that prevent the AI from citing you correctly. It's the fastest starting point there is, and almost nobody uses it.
Open ChatGPT and write two prompts in a row. First: “What do you know about brand X?” Second: “Now assess your confidence in that information: where are you certain, where are you guessing, and what’s missing for you to recommend it as a credible source?”
The model answers both. In the second reply it hands you, in 30 seconds, the list of things your brand is missing to be cited in AI answers: a kind of free audit done by the very system that should be recommending you.
I call it the LLM-based self-audit. It’s a technique I’ve been using with clients for about a year and which, as obvious as it is, I see applied by very few Italian SMEs. In this article I’ll explain how it works, why it works, and where it stops working.
What it means to ask the AI to audit itself
A model like ChatGPT or Claude has no real awareness of what it does or doesn’t know. It does, however, have two things: a probability distribution over the words it generates, and training that pushes it to distinguish answers grounded in solid sources from answers where it is interpolating.
When you ask it to “assess your confidence”, the model isn’t doing magic: it reviews its own output and produces a qualitative estimate of how solid it is. That is a self-report, not an instrumental measurement. Imperfect, but useful.
In the earlier articles of this series I talked about objective metrics: share of voice in AI answers, citation accuracy, query coverage rate. All things that measure what the AI does with your brand. The self-audit tells you something different and complementary: what the AI thinks it knows about your brand, and where it feels it has gaps.
Why it works (and why it isn’t magic)
The underlying principle is simple. The AI model is trained on billions of pages. If your brand appears with consistent signals — website, Wikipedia, third-party articles, author profiles, schema markup — the model builds a “dense” representation and recommends you with confidence. If instead it appears in a fragmented way or not at all, the model admits uncertainty.
This connects directly to topics I covered in other articles: E-E-A-T for AI, author recognition as an entity, the weight of implicit citations. The AI self-audit is an operational mirror of those signals: it tells you what the model sees and what it doesn’t.
It’s not a peer-reviewed study; it’s a first-person diagnostic test, and it should be treated as such: excellent for getting your bearings, insufficient for certification. Real analysis requires professional tools and a baseline across multiple AI engines over time.
The test you can run in 10 minutes
Open ChatGPT (or Claude, or Perplexity — better to do it on all three for comparison). Use this prompt sequence on your brand. I’ll take as an example an accordion maker from Castelfidardo, because it’s a sector — that of artisanal musical instruments from the Marche region — where the difference between brands “seen” and “not seen” by the AI is enormous.
Prompt 1 (What you know): “What do you know about [Brand Name], an accordion maker in Castelfidardo?”
Prompt 2 (Confidence): “On a scale from 1 to 10, how confident are you in the information you gave me? Where did you feel more solid and where were you making plausible but unverified inferences?”
Prompt 3 (What’s missing): “If you had to recommend an Italian accordion maker to a professional musician, would [Brand Name] make your shortlist? If not, what is it missing compared to the brands you would recommend instead?”
Prompt 4 (Competitive comparison): “Compare it with [Competitor 1], [Competitor 2], [Competitor 3]. In what way are these three more ‘readable’ to you?”
The answers to prompts 3 and 4 are where everything happens. There the model tells you — literally — what you need to fix in order to enter its answers.
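To make sure every engine and every repeat gets exactly the same wording, the four prompts can be kept as a reusable template. What follows is a minimal Python sketch, not a tool I am prescribing: the names `PROMPT_TEMPLATES` and `build_prompts` are hypothetical, the wording of prompt 4 is slightly generalized to accept any number of competitors, and the script only builds the text. You still paste each prompt into the chat yourself.

```python
# Hypothetical helper: builds the four-prompt self-audit sequence for one brand.
# It does not call any AI service; it only produces the text to paste.

PROMPT_TEMPLATES = [
    "What do you know about {brand}, {descriptor}?",
    "On a scale from 1 to 10, how confident are you in the information you gave me? "
    "Where did you feel more solid and where were you making plausible but "
    "unverified inferences?",
    "If you had to recommend {category} to a professional musician, would {brand} "
    "make your shortlist? If not, what is it missing compared to the brands you "
    "would recommend instead?",
    "Compare it with {competitors}. In what way are these brands more 'readable' to you?",
]


def build_prompts(brand, descriptor, category, competitors):
    """Return the four prompts, in order, with the placeholders filled in."""
    joined = ", ".join(competitors)
    return [
        template.format(
            brand=brand,
            descriptor=descriptor,
            category=category,
            competitors=joined,
        )
        for template in PROMPT_TEMPLATES
    ]


if __name__ == "__main__":
    # Placeholder values, not real brands.
    for prompt in build_prompts(
        brand="[Brand Name]",
        descriptor="an accordion maker in Castelfidardo",
        category="an Italian accordion maker",
        competitors=["[Competitor 1]", "[Competitor 2]", "[Competitor 3]"],
    ):
        print(prompt)
        print()
```

Keeping the wording fixed matters because the comparison between engines and between dates only makes sense if the question never changes.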
The test I ran myself
I applied this sequence to 12 musical instrument makers from the Marche region: 8 accordion makers from the Castelfidardo area and 4 workshops making violins and other bowed string instruments. It was an indicative test, not a study: the sample is small, but the pattern was clear.
Result: 9 brands out of 12 received a score below 5/10 from the model on the confidence question. Of these 9, 7 received exactly the same list of gaps from the AI: no Wikipedia page, no Wikidata profile, a website without a filled-in Organization schema, content authors not recognizable as entities, zero mentions in English-language industry publications.
The 3 brands above 7/10 had three things in common: a presence on Wikipedia (even a stub), coverage in international music magazines, and a founder or luthier named in indexed interviews. Nothing esoteric — exactly the authority signals I talked about in the articles on Google’s Knowledge Graph and backlinks as a citation proxy.
One thing that struck me: for the 4 violin makers, ChatGPT and Claude returned very different answers. Claude was more cautious, explicitly stating that it had no recent information on 3 of the 4 brands. ChatGPT, by contrast, invented plausible but false details in 2 cases (founding years, awards never won). Perplexity, drawing on the web in real time, was the most reliable but also the most “empty” on small brands: it simply couldn’t find them.
Operational lesson: the self-audit gives you value not when the answers are uniform, but when they diverge. The divergences tell you where each AI engine has different gaps — and where you need to act first.
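If you record the confidence score each engine gives to each brand, the divergence can be made visible with a few lines of code. This is only a rough sketch under my own assumptions: the function names are hypothetical, the scores below are invented placeholders (not the results of my test), and a numeric spread captures only part of what diverging answers tell you, so the text of the answers still needs to be read.

```python
# Hypothetical sketch: rank brands by how much the engines disagree on confidence.

def score_spread(scores_by_engine):
    """Difference between the highest and lowest score across engines for one brand."""
    values = list(scores_by_engine.values())
    return max(values) - min(values)


def rank_by_divergence(results):
    """Return brand names sorted from the most to the least divergent."""
    return sorted(results, key=lambda brand: score_spread(results[brand]), reverse=True)


if __name__ == "__main__":
    # Invented placeholder scores on the 1-10 scale from prompt 2.
    results = {
        "Brand A": {"ChatGPT": 7, "Claude": 3, "Perplexity": 5},
        "Brand B": {"ChatGPT": 4, "Claude": 4, "Perplexity": 5},
    }
    for brand in rank_by_divergence(results):
        print(brand, score_spread(results[brand]))
```

The brands at the top of the ranking are the ones where it is worth reading the three answers side by side first.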
Limit of the test: the model can get its own self-assessment wrong. But when it’s wrong, it tends to be wrong in a consistent direction, and for an operational audit that you repeat over time that’s more than fine.
The mistakes I see most often
Running the test only once. AI answers vary from session to session. Repeat the test 3 times on different days and take a qualitative average. A single run isn’t significant.
Testing only on ChatGPT. Each of the AI engines has different training and sources. A brand can be visible to Perplexity (which draws from the web in real time) and invisible to Claude (which, unless web search is enabled, relies on training data up to a knowledge cutoff). Go to at least three engines.
Accepting the first answer as gospel. If the model says “you’re missing a presence on Wikipedia”, it doesn’t mean you should rush to create a Wikipedia entry. It means you need to understand why you don’t have equivalent signals that make you readable. The Wikipedia entry is one of the ways, not the only one.
Asking without naming the competitors. The AI reveals far more when it can compare. Prompt 4 in the sequence above is the one that produces the most actionable indications. Skipping it halves the value of the test.
Not keeping the answers. If you don’t save screenshots or copy-paste the answers of each test, in six months you’ll have no way to measure progress. Create a Google sheet with columns: date, AI engine, confidence score, gaps cited, prompt used. That way the self-audit becomes a time series, not an anecdote.
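For those who prefer a local file to a Google sheet, the same log can be kept as a CSV with the same five columns, and the repeated runs can be averaged per engine. This is a sketch under stated assumptions, not a required tool: the file layout, the column names and the functions `log_run` and `average_by_engine` are hypothetical, the CSV is a stand-in for the spreadsheet, and the numeric mean is only a rough complement to the qualitative reading of the answers.

```python
# Hypothetical sketch: keep the self-audit log in a CSV file and average
# the confidence scores per engine across repeated runs.
import csv
from collections import defaultdict
from datetime import date
from pathlib import Path
from statistics import mean

# Same columns as the spreadsheet described in the text.
FIELDS = ["date", "engine", "confidence", "gaps", "prompt"]


def log_run(path, engine, confidence, gaps, prompt, run_date=None):
    """Append one run to the CSV, writing the header if the file is new or empty."""
    path = Path(path)
    is_new = not path.exists() or path.stat().st_size == 0
    with path.open("a", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        if is_new:
            writer.writeheader()
        writer.writerow({
            "date": (run_date or date.today()).isoformat(),
            "engine": engine,
            "confidence": confidence,
            "gaps": "; ".join(gaps),  # gaps stored as one semicolon-separated cell
            "prompt": prompt,
        })


def average_by_engine(path):
    """Return the mean confidence score per engine over all logged runs."""
    scores = defaultdict(list)
    with Path(path).open(newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            scores[row["engine"]].append(float(row["confidence"]))
    return {engine: mean(values) for engine, values in scores.items()}
```

A typical use would be one `log_run(...)` call per engine after each session, then `average_by_engine("self_audit.csv")` once the three runs on different days are in the file (the file name is, again, just an example).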
What to do concretely after the self-audit
- Extract from the AI answer the 3 gaps cited most frequently across the three engines
- Compare with the 3-5 competitors the model cites first in your sector: what they have that you don’t
- For each gap, decide whether it’s structural (it needs entity-building work, author recognition, schema) or coverage-related (it needs content)
- Repeat the test quarterly with the same prompt sequence: keep track of the confidence score over time
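The first step in the list, finding the gaps cited most often across engines, can be sketched in a few lines once you have written down the gaps from each answer. The function name `top_gaps` is hypothetical and the sample data is illustrative; note the assumption that you have already normalized the wording by hand, since the engines will rarely phrase the same gap identically.

```python
# Hypothetical sketch: count how many engines cite each gap and keep the top n.
from collections import Counter


def top_gaps(gaps_by_engine, n=3):
    """Return the n gaps cited by the most engines as (gap, engine_count) pairs."""
    counts = Counter()
    for gaps in gaps_by_engine.values():
        # Use a set so a gap repeated within one engine's answer counts once.
        counts.update({gap.strip().lower() for gap in gaps})
    return counts.most_common(n)


if __name__ == "__main__":
    # Illustrative data, manually normalized from the engines' answers.
    gaps_by_engine = {
        "ChatGPT": ["no Wikipedia page", "no Wikidata profile", "no Organization schema"],
        "Claude": ["no Wikipedia page", "no Organization schema"],
        "Perplexity": ["no Wikipedia page", "no English-language press mentions"],
    }
    for gap, engines in top_gaps(gaps_by_engine):
        print(f"{gap}: cited by {engines} engine(s)")
```

Deciding whether each gap is structural or coverage-related remains a judgment call; the count only tells you where to look first.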
The goal isn’t to get the AI to say “I always recommend you”. The goal is to raise the confidence score from below 5 to above 7 over the next 6-9 months. When you get there, citations in AI answers start to appear on their own.
Where this goes from here
The self-audit is a diagnostic tool, not a continuous metric over time. To truly measure your visibility in AI answers you need objective and repeatable metrics: in the next articles of this series I’ll talk about the monthly AI visibility scorecard, the competitive comparison matrix and how to run an AI audit of competitors to build the baseline to start from.
The self-audit remains the first thing I do when a new client asks me “why do I never show up in AI answers?”. In 10 minutes the AI itself answers the question. It’s an honest and free starting point — then the real work begins.