Measuring AI visibility

Which Sources the AI Cites You From: The Map That Tells You Where to Invest

Roberto Serra · 25 June 2026 · ~8 min read

If the AI cites you a hundred times a month, it's likely that 80% of those citations come from two or three specific sources. If you don't know which ones, you're spreading your communication budget across everything without knowing what actually works. Mapping the sources that drive your AI citations turns your strategy from scattershot spending into investing where there's already a measurable return.

Say that of the 150 AI citations collected on your brand over the last 6 months, 3 sites generate 60%. Do you know which ones? Mapping the sources is the foundation of everything else.

I’m telling you this because it’s the pattern I see repeating almost boringly every time I open a client’s tracking sheet. The distribution of the sources feeding AI answers about your brand isn’t uniform: a few pages do the bulk of the work, and almost always the brand owner doesn’t even know which ones they are.

In this article I’ll explain how to build this map, why it’s worth every minute you put into it, and what to do with it once you have it in front of you.

What I mean by the source of an AI citation

When Perplexity, ChatGPT with browsing, or Gemini answers a question that concerns you, URLs appear beneath the answer (or linked within it). Those are the grounding sources: the web documents from which the model extracted the information it’s reporting.

The source map is simply the ordered list of those links, collected over an observation period and grouped by domain, author, and content type. It’s a mechanical exercise; it doesn’t require a degree in data science. But it changes everything you do afterward.

Why it’s the prerequisite for any serious measurement

In the previous articles in this series you saw how a citation is counted, how to distinguish a brand mention from a citation with a link, and how to track share of voice in AI answers. All of that is downstream. The source map is upstream: it tells you where the signal comes from.

Without the map, you’re counting the points in the championship without knowing which players scored them. You’re measuring an aggregate result you can neither replicate nor defend. With the map, every citation becomes actionable: you know which page generated the signal, and therefore where to reinforce.

The thread is the same as always in my articles: you care about showing up in AI answers, and to show up more you need to understand which sources you’re already showing up through today.

The underlying principle: the models don’t pick at random

In the research on Retrieval-Augmented Generation (RAG) systems, the documented pattern is that AI models select grounding sources based on signals of authority, freshness, and semantic alignment with the query and, increasingly, on author entity recognition and implicit reference weight. It follows that the sources picked up about your brand are not random: they are the subset of your web assets (and of those that talk about you) that the system considers most reliable in that context.

The operational consequence is sharp. If you map the sources, you see which assets the system has already “validated.” Those are your thoroughbreds: they need to be updated, expanded, internally linked. Everything else is maintenance spending.

The test you can run in 60 minutes

You need a Google Sheet with 5 columns: date, query, AI engine, source URL, source domain.
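If you'd rather keep the same five columns in a script, here is a minimal sketch of one row. The class name `CitationRow`, the lowercase engine labels, and the `example.com` URL are illustrative assumptions, not part of any tool. The fifth column is derived from the URL so you don't have to type it by hand.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass(frozen=True)
class CitationRow:
    """One sheet row: a single source URL shown for one AI answer (hypothetical structure)."""
    date: str        # day of the test, e.g. "2026-06-25"
    query: str       # the query exactly as typed
    engine: str      # assumed labels: "perplexity", "chatgpt", "gemini"
    source_url: str  # the URL the engine listed as a source

    @property
    def source_domain(self) -> str:
        """Fifth column, derived from the URL."""
        host = urlparse(self.source_url).hostname or ""
        # Treat "www.example.com" and "example.com" as the same domain.
        return host[4:] if host.startswith("www.") else host


row = CitationRow(
    date="2026-06-25",
    query="best organic extra virgin olive oil Umbria",
    engine="perplexity",
    source_url="https://www.example.com/guides/umbria-olive-oil",
)
print(row.source_domain)
```

Deriving the domain rather than typing it avoids ending up with "www.example.com" and "example.com" counted as two different sources.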

Procedure:

Open Perplexity (it’s the one that always gives you the links; the others don’t always). Run 15-20 realistic queries about your sector in which you’d expect to be able to appear. Examples for a producer of organic extra virgin olive oil in Umbria: “best organic extra virgin olive oil Umbria”, “cold-press mills Trasimeno”, “EVO oil corporate gift central Italy”.
For every answer that cites you or cites a competitor, copy all the source URLs into the sheet.
Repeat on ChatGPT with browsing enabled and on Gemini, using the same queries.
At the end of the week, group by domain. Count the occurrences.
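The last step, grouping by domain and counting, can be sketched in a few lines if you export the sheet as CSV. The column names below mirror the five columns described above. The rows are made-up placeholders on `example` domains, not real data.

```python
import csv
import io
from collections import Counter

# Placeholder export of the tracking sheet (invented rows, for illustration only).
SHEET_CSV = """\
date,query,engine,source_url,source_domain
2026-06-01,best organic extra virgin olive oil Umbria,perplexity,https://example.com/oil,example.com
2026-06-01,best organic extra virgin olive oil Umbria,perplexity,https://guide.example.org/umbria,guide.example.org
2026-06-01,cold-press mills Trasimeno,chatgpt,https://example.com/mill,example.com
2026-06-01,cold-press mills Trasimeno,gemini,https://guide.example.org/mills,guide.example.org
2026-06-01,EVO oil corporate gift central Italy,perplexity,https://example.com/gifts,example.com
2026-06-01,EVO oil corporate gift central Italy,gemini,https://magazine.example.net/gifts,magazine.example.net
"""


def count_by_domain(csv_text: str) -> list[tuple[str, int]]:
    """Return (domain, occurrences) pairs, most cited first."""
    reader = csv.DictReader(io.StringIO(csv_text))
    counts = Counter(row["source_domain"].strip().lower() for row in reader)
    return counts.most_common()


for domain, n in count_by_domain(SHEET_CSV):
    print(f"{domain}: {n}")
```

A pivot table in the sheet itself gives you the same result. The script is only worth it once you repeat the exercise regularly.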

A three-band reading threshold: if your domain appears in less than 30% of the answers concerning your sector, you have an editorial coverage problem, not a technical SEO one. If it appears in 30-60%, you’re on the right track and need to defend that position. Above 60%, you’re already an authority recognized by the system in that semantic field of queries.
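To make the reading less subjective, the share and its band can be computed rather than eyeballed. This is a sketch under two assumptions of mine: an answer is identified by the triple (date, query, engine), and you pass in the total number of answers you ran, since answers that cited neither you nor a competitor never reach the sheet. The function names are hypothetical.

```python
from typing import Iterable, Tuple

# (date, query, engine, source_domain), one tuple per sheet row.
Row = Tuple[str, str, str, str]


def coverage_rate(rows: Iterable[Row], my_domain: str, total_answers: int) -> float:
    """Share of all answers run in which my_domain appears at least once."""
    if total_answers <= 0:
        raise ValueError("total_answers must be positive")
    answers_citing_me = {
        (date, query, engine)
        for date, query, engine, domain in rows
        if domain == my_domain
    }
    return len(answers_citing_me) / total_answers


def reading_band(rate: float) -> str:
    """Map a coverage rate to the three bands described in the text."""
    if rate < 0.30:
        return "editorial coverage problem"
    if rate <= 0.60:
        return "on the right track, to be defended"
    return "recognized authority"


# Placeholder rows: example.com appears in 4 distinct answers (one of them twice).
rows = [
    ("2026-06-01", "q1", "perplexity", "example.com"),
    ("2026-06-01", "q1", "perplexity", "example.com"),
    ("2026-06-01", "q1", "chatgpt", "example.com"),
    ("2026-06-01", "q2", "gemini", "example.com"),
    ("2026-06-01", "q3", "perplexity", "example.com"),
    ("2026-06-01", "q3", "perplexity", "guide.example.org"),
]
rate = coverage_rate(rows, "example.com", total_answers=10)
print(f"{rate:.0%} -> {reading_band(rate)}")
```

Counting distinct answers rather than rows matters: an answer that lists three of your pages should count once, not three times.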

An honest entry-level caveat: this is a first manual step. The real analysis, on large volumes and with longitudinal tracking, requires professional tools. But the first step is enough to tell you whether you’re investing in the right direction.

The test I ran myself: 6 months on Adriatic seaside hotels

To write this series I kept a longitudinal observation over 6 months on a sample of seaside hotels on the central Adriatic, in particular a 4-star boutique hotel in Fano (PU) and five of its direct competitors between Senigallia, Marotta, and Pesaro.

Every two weeks I ran the same set of 12 queries (such as “best hotel Fano sea view”, “family hotel private beach Marche”, “boutique hotel Adriatic for couples”) on Perplexity, ChatGPT with browsing, and Gemini. I collected all the cited sources and mapped them by domain.

The pattern that emerged, across an overall sample of about 480 answers collected:

3 domains made up 58% of the citations: the Fano hotel’s own site, a regional tourist guide for the Marche area, and a 2022 in-depth article from a national travel magazine.
The sites of tour operators and aggregators appeared often, but were almost never cited as a primary source on the “boutique” or “experience” queries; they came in on the purely transactional queries.
A hotel blog page published in 2023 about excursions in San Bartolo Park was, on its own, the single most cited URL of all.
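A concentration figure like the 58% above is simply the share of all citations covered by the top N domains. Here is a minimal sketch of that calculation. The function name `top_n_share` and the sample list are illustrative placeholders, not the data from this test.

```python
from collections import Counter
from typing import Iterable


def top_n_share(domains: Iterable[str], n: int = 3) -> float:
    """Fraction of all citations that come from the n most cited domains."""
    counts = Counter(domains)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    top = sum(count for _, count in counts.most_common(n))
    return top / total


# Placeholder citations: one domain per cited source, 10 in total.
sample = (
    ["hotel.example.com"] * 4
    + ["guide.example.org"] * 2
    + ["magazine.example.net", "ota.example.com", "blog.example.org", "news.example.net"]
)
print(f"Top 3 domains: {top_n_share(sample, n=3):.0%} of citations")
```

Tracking this one number over time shows whether your citations are spreading across more sources or concentrating on fewer.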

Stated limitations: a small sample, a single geographic area, only one season fully covered (autumn-winter), and AI engines that kept evolving during the test period. It’s an indicative pattern, not a peer-reviewed study. But the signal was clear enough for the client and me to rethink the entire editorial strategy: double down on articles modeled on “San Bartolo excursion” and abandon two thematic threads that hadn’t produced a single citation in 6 months.

The mistakes I see most often when I start from scratch with a client

There are four that recur almost predictably.

The first is not mapping at all: you count the total number of citations month over month and stop there. The number grows, we celebrate; the number drops, we worry. Without knowing where those citations come from, you’re working blind.

The second is mapping only your own domain. The citations that concern you often come through third parties: an industry guide, a Wikipedia entry, an old article from a local newspaper. Those absolutely need to be mapped, because they tell you where your authority has been built by others (and where it’s worth strengthening the relationship).

The third is ignoring dated sources. I regularly find articles from 2018-2020 still heavily cited by AI models in 2026. They’re dormant assets: updating the content (with the publisher’s consent or with a new article that references it) is one of the highest-ROI operations of all.

The fourth is not comparing yourself against the 3-5 competitors the AI cites in your sector. If on your semantic field of queries the AI always cites three competitors and never you, the map of those three tells you exactly where you’re missing editorial presence.
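The competitor comparison can be sketched as a set difference: the domains through which competitors get cited, minus the domains through which you do. This assumes you add an extra column of your own to the sheet recording which brand each answer cited. The `brand` field and the function name are hypothetical, and the rows are placeholders.

```python
from collections import Counter
from typing import Iterable, Tuple

# (brand, source_domain): which brand the answer cited, and through which source.
BrandRow = Tuple[str, str]


def missing_sources(rows: Iterable[BrandRow], my_brand: str) -> list[tuple[str, int]]:
    """Domains that cite competitors but never my_brand.

    Each domain is scored by how many distinct competitors it cites,
    highest first; ties are ordered alphabetically.
    """
    rows = list(rows)
    mine = {domain for brand, domain in rows if brand == my_brand}
    # Deduplicate so each (competitor, domain) pair counts once.
    pairs = {
        (brand, domain)
        for brand, domain in rows
        if brand != my_brand and domain not in mine
    }
    shared = Counter(domain for _, domain in pairs)
    return sorted(shared.items(), key=lambda item: (-item[1], item[0]))


rows = [
    ("my-brand", "example.com"),
    ("rival-a", "guide.example.org"),
    ("rival-a", "guide.example.org"),
    ("rival-b", "guide.example.org"),
    ("rival-a", "magazine.example.net"),
    ("rival-b", "example.com"),
]
for domain, n_competitors in missing_sources(rows, "my-brand"):
    print(f"{domain}: cites {n_competitors} competitor(s), never you")
```

A domain that cites several competitors and never you is usually the first place to look for missing editorial presence.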

What to do with the map once you have it

Three concrete actions, in order of priority:

Double down on your top URLs: the page that generates the most AI citations in your domain is your flagship asset. Update it every 2-3 months, extend the content, add FAQs, reinforce the internal links pointing to it.
Reclaim third-party sources: if a guide or a magazine cites you and is well positioned as an AI source, build a stable editorial relationship. A collaboration that produces 2 articles a year on that outlet is worth more than 20 guest posts on sites the AI doesn’t consider.
Audit the pages that are “dead” for AI: your pages that never appear as a source. Decide: do they deserve an editorial relaunch (rewrite, inverted pyramid, recognizable author)? Or are they service pages that are fine as they are? Deciding explicitly is already half the work.
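The third action, the audit of pages that are “dead” for AI, boils down to comparing the full list of your pages with the URLs that appear in the sheet. This is a minimal sketch, assuming you already have your own page list (for instance from your sitemap). The function names and URLs are placeholders.

```python
from typing import Iterable
from urllib.parse import urlsplit


def normalize(url: str) -> str:
    """Reduce a URL to host + path so trivial variants compare as equal.

    Drops the scheme, a leading "www.", the query string, the fragment,
    and any trailing slash.
    """
    parts = urlsplit(url)
    host = parts.hostname or ""
    if host.startswith("www."):
        host = host[4:]
    path = parts.path.rstrip("/") or "/"
    return f"{host}{path}"


def dead_pages(site_urls: Iterable[str], cited_urls: Iterable[str]) -> list[str]:
    """Pages of your site that never appear among the cited sources."""
    cited = {normalize(url) for url in cited_urls}
    return [url for url in site_urls if normalize(url) not in cited]


site_urls = [
    "https://www.example.com/",
    "https://www.example.com/blog/excursions/",
    "https://www.example.com/rooms",
]
cited_urls = [
    "https://example.com/blog/excursions#map",
    "http://example.com",
]
print(dead_pages(site_urls, cited_urls))
```

The output is only the candidate list. Deciding whether each page deserves a relaunch or is fine as a service page remains an editorial call.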

Where this map fits into the rest of your measurement

The source map is the starting point. Once you have it, you can tackle the rest of the measurement work that feeds your AI visibility dashboard: tracking share of voice over time, attributing conversions to AI citations, benchmarking against competitors. Without the map, everything else remains a number without a cause. With the map, every metric ties back to a concrete editorial action.

In the following articles in the series I’ll take you further with longitudinal tracking of share of voice, the attribution model I use to link AI citations to qualified inquiries, and the setup of a monthly dashboard readable by a business owner in 5 minutes.

The author

Roberto Serra at the Senate of the Republic

Senate of the Republic · Palazzo Giustiniani Conference “The power of artificial intelligence”

Roberto Serra

SEO consultant for over 15 years, founder of the Serra SEO Agency (RAANK). He helps multinationals and SMEs stay visible where search is moving: ChatGPT, Perplexity, Gemini and Google's AI Overviews.
