Ask ChatGPT a question about your industry and Wikipedia is likely to show up three or four times among the cited sources, because it's one of the core sources AI models were trained on. If your brand isn't cited as a source in at least one relevant Wikipedia entry, you're missing what is probably the most heavily weighted channel of all. You don't get there by adding your own name to an entry: that gets deleted within 48 hours. You get there another way, and a tour operator in Matera did it starting from scratch. The path is replicable.
Open ChatGPT and ask “what are the main attractions in Matera”. Look at the notes under the answer. You’ll see Wikipedia cited three times: the “Sassi di Matera” entry, the “Matera” entry, the “Parco della Murgia Materana” entry. Now ask about a specific tour guide or an experiential operator in Basilicata. Does Wikipedia know you? In most cases no, and that explains why the AI doesn’t mention you.
Let me explain why being cited as a source in a Wikipedia entry relevant to your industry is the heaviest digital PR action you can take for visibility in AI answers, and how an operator in the Sassi earned a lasting place among an entry’s sources, in line with editorial policies.
What Wikipedia means to an AI model
In the eyes of an LLM, Wikipedia is not a site like the others. It is literally inside the training set. In the world of LLM research, this fact is explicit:
“For instance, Wikipedia pages are among the five datasets used to train GPT-3”
Wikipedia is one of only five datasets used to train GPT-3, and the same logic appears to hold for most of the models that followed. It’s not a source the model merely “consults”: it’s a source whose structure, facts and entity connections the model absorbed during learning.
The operational consequence for your tourism business: when ChatGPT answers “what to visit in Basilicata”, it is reassembling patterns it learned about Matera from the training corpora, and Wikipedia is the heart of those corpora. If the Wikipedia entry “Sassi di Matera” mentions your guiding service among its sources, you’re inside that core. If it’s not there, you’re out.
Why one Wikipedia citation weighs more than 100 ordinary backlinks
In earlier articles in this series I covered how links work as citation proxies for AI (see “backlinks as citation proxy”) and how the implicit weight of mentions counts more than the bare link (see “implicit reference weight”). Wikipedia sits one level above both, for three reasons that stack up.
First: it’s a training source, not just a retrieval source. A backlink from a blog gets read (maybe) during AI crawling. A Wikipedia note was read by the model during training, meaning it became part of how the model thinks about your industry.
Second: Wikipedia feeds Wikidata, and Wikidata feeds the Google Knowledge Graph, a topic I covered in “Google Knowledge Graph entry”. The chain is Wikipedia → Wikidata → Knowledge Graph → AI citation.
Third: Wikipedia editors are recognizable entities with traceable reputations, so the authority signal is stronger than any bought review or guest-post link.
The case study: a Sassi tour guide makes it into Wikipedia’s notes
Let me tell you about a concrete, anonymized intervention. The client is an experiential tour operator in Matera, a micro-business with 4 licensed guides that specializes in nighttime tours of the Sassi and routes through the Murgia. Its AI visibility was zero: across 20 Basilicata tourism queries tested on ChatGPT and Perplexity, there was not a single mention of the brand, not even as an example of a local operator.
The intervention was not “let’s add the name to Wikipedia”: that doesn’t work, because the edit gets deleted within 48 hours. The intervention was to produce a citable piece of content: a documented mapping of the 155 rock churches of the Parco della Murgia Materana, with GPS coordinates, dating and conservation status, published as a downloadable report on the operator’s website and presented at a local landscape archaeology conference.
Six months later, an independent Wikipedia editor used the report as a source, adding a note to the “Parco della Murgia Materana” entry. That is exactly the dynamic Wikipedia’s editorial policies require: verifiable secondary sources, no direct self-promotion.
The measurement came 8 months after the note, across 30 tourism queries tested on ChatGPT and Perplexity (an indicative test: a small sample, but a clear pattern). The operator is mentioned in 7 answers out of 30, versus 0 out of 30 before the intervention. It’s not an explosion, but it is a concrete signal that entering Wikipedia’s citation graph moves something.
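To make the “small sample” caveat concrete, here is a minimal sketch that puts an approximate 95% range around the two mention rates quoted above (7 of 30 after, 0 of 30 before). The Wilson score interval is my choice for illustration, not part of the original test, and it assumes the queries behave like independent trials, which is only roughly true for AI answers.

```python
import math


def wilson_interval(mentions: int, queries: int, z: float = 1.96) -> tuple[float, float]:
    """Approximate 95% Wilson score interval for a mention rate."""
    if queries <= 0:
        raise ValueError("queries must be positive")
    p = mentions / queries
    denom = 1 + z**2 / queries
    center = (p + z**2 / (2 * queries)) / denom
    margin = z * math.sqrt(p * (1 - p) / queries + z**2 / (4 * queries**2)) / denom
    # Clamp to [0, 1] so rounding noise never produces an impossible rate.
    return max(0.0, center - margin), min(1.0, center + margin)


# Figures as quoted in the case study.
before = wilson_interval(0, 30)
after = wilson_interval(7, 30)
print(f"before: {before[0]:.0%} to {before[1]:.0%}")
print(f"after:  {after[0]:.0%} to {after[1]:.0%}")
```

With only 30 queries both ranges come out wide, which is why I call the test indicative and not conclusive.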
What the research says about the Wikipedia-LLM link
In the research world, the relationship between Wikipedia and AI models is bidirectional — and this has important implications for anyone who wants to do citation insertion in a sustainable way:
“Our findings and simulation results reveal that Wikipedia articles have been affected by LLMs, with an impact of approximately 1% in certain categories.”
The effect is still small, around 1% in some categories, but it is measurable. Wikipedia is becoming more attentive to detecting artificially generated content, so the path to entering it as a source has become narrower, not wider.
The consequence for your digital PR: forget shortcuts. Producing genuinely citable content — original data, primary research, verifiable reports — is the only approach that holds up.
The test you can run in 15 minutes
Before you think about getting cited, check how Wikipedia describes your industry and your area today.
Step 1: open Wikidata and search for the central term of your offering (e.g. “Sassi di Matera”, “Parco della Murgia Materana”, “Lucanian cuisine”). Look at which external sources are linked. If you see only public bodies and major outlets, the room for private operators is narrow but not closed.
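If you prefer to script Step 1, a minimal sketch could look like the one below. It uses Wikidata’s public `wbsearchentities` API module to find the item for a term; the helper names, the User-Agent string and the contact address are placeholders of mine. It only locates the item: inspecting which external sources are linked still means opening the item page and reading its references.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

WIKIDATA_API = "https://www.wikidata.org/w/api.php"


def build_search_url(term: str, language: str = "it", limit: int = 5) -> str:
    """Build a wbsearchentities request URL for the given search term."""
    params = {
        "action": "wbsearchentities",
        "search": term,
        "language": language,
        "uselang": language,
        "format": "json",
        "limit": limit,
    }
    return f"{WIKIDATA_API}?{urlencode(params)}"


def summarize_hits(payload: dict) -> list[tuple[str, str, str]]:
    """Reduce the API response to (item id, label, description) tuples."""
    return [
        (hit.get("id", ""), hit.get("label", ""), hit.get("description", ""))
        for hit in payload.get("search", [])
    ]


def search_wikidata(term: str, language: str = "it") -> list[tuple[str, str, str]]:
    """Run the search over the network (placeholder User-Agent, replace it)."""
    request = Request(
        build_search_url(term, language),
        headers={"User-Agent": "wikidata-check-sketch/0.1 (you@example.com)"},
    )
    with urlopen(request, timeout=10) as response:
        return summarize_hits(json.load(response))


# Usage (needs network access):
# for item_id, label, description in search_wikidata("Sassi di Matera"):
#     print(item_id, label, description)
```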
Step 2: open the Italian Wikipedia entry for your territory/industry and scroll down to the notes. Count how many notes point to private sources (companies, operators, studies). Decision threshold: if fewer than 10% of the notes are private sources, the editorial community is selective and you need to produce academic-quality material to be accepted.
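The count in Step 2 can be sketched in a few lines once you have the note URLs (copied by hand, or pulled from the MediaWiki API’s `action=parse` with `prop=externallinks`). The list of “institutional” domains below is a hypothetical starting point that you would need to adapt to your own territory; everything not on it is treated as a private source, which is a crude assumption.

```python
from urllib.parse import urlparse

# Hypothetical allow-list: domains treated as public bodies or major outlets.
INSTITUTIONAL_DOMAINS = ("gov.it", "edu", "unesco.org", "treccani.it")


def host_of(url: str) -> str:
    """Return the lowercase hostname of a URL, or '' if it has none."""
    return (urlparse(url).hostname or "").lower()


def is_institutional(url: str, domains: tuple[str, ...] = INSTITUTIONAL_DOMAINS) -> bool:
    """True if the URL's host is one of the listed domains or a subdomain of one."""
    host = host_of(url)
    return any(host == d or host.endswith("." + d) for d in domains)


def private_share(note_urls: list[str]) -> float:
    """Fraction of notes pointing to sources outside the institutional list."""
    urls = [u for u in note_urls if host_of(u)]
    if not urls:
        return 0.0  # no usable notes: nothing to measure
    private = [u for u in urls if not is_institutional(u)]
    return len(private) / len(urls)


def community_is_selective(note_urls: list[str], threshold: float = 0.10) -> bool:
    """Apply the article's decision threshold: under 10% private sources."""
    return private_share(note_urls) < threshold
```

A share under the threshold tells you the bar is high, not that the door is closed.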
Step 3: ask ChatGPT and Perplexity “best operators for [experience] in [your city]” and look at what they cite. If they cite Wikipedia or institutional sites, you know the AI-Wikipedia loop is active in your industry and the investment makes sense.
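For Step 3, a small sketch can tally how many answers cite Wikipedia. It assumes you have noted, by hand, the URLs each answer cites; the function names and the input shape (a query mapped to its list of cited URLs) are illustrative choices of mine, not an API of either tool.

```python
from urllib.parse import urlparse


def cites_wikipedia(cited_urls: list[str]) -> bool:
    """True if at least one cited URL is on wikipedia.org or a language subdomain."""
    for url in cited_urls:
        host = (urlparse(url).hostname or "").lower()
        if host == "wikipedia.org" or host.endswith(".wikipedia.org"):
            return True
    return False


def wikipedia_loop_share(answers: dict[str, list[str]]) -> float:
    """Fraction of answers that cite Wikipedia at least once."""
    if not answers:
        return 0.0
    hits = sum(1 for urls in answers.values() if cites_wikipedia(urls))
    return hits / len(answers)


# Hypothetical notes from a manual test session.
session = {
    "best night tours in Matera": ["https://it.wikipedia.org/wiki/Sassi_di_Matera"],
    "best Murgia hikes": ["https://operator.example/routes"],
}
print(f"{wikipedia_loop_share(session):.0%} of answers cite Wikipedia")
```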
The mistakes I see most often
Self-inserting your brand name into the body of the entry. It’s the first thing that gets deleted. Wikipedia is not a directory of operators, it’s an encyclopedia. Your name enters — if it enters at all — in the notes as the source of a fact, never as “among the operators in the sector is Company X”.
Using interns or agencies that “write Wikipedia entries”. Experienced editors spot promotional style within 3 minutes. The backlash is heavy: the entry gets flagged, the editorial history stays traceable, and future attempts start uphill.
Producing a generic report just to “have a source”. If the report doesn’t add verifiable knowledge about an entity Wikipedia covers, it will never be cited. The right question is: “what information gap can I fill with data nobody else has?”.
Expecting results in 3 weeks. The realistic cycle between publishing a citable piece of content and entering Wikipedia is 6-12 months. It’s not a tactical channel, it’s a multi-year strategy.
What to do in practice
- Identify 3-5 Italian Wikipedia entries relevant to your industry (territorial + thematic) and study the structure of the existing notes
- Produce ONE original report with primary data that answers a specific information gap in those entries
- Publish the report with a stable URL, a named author, transparent methodology
- Present the work in a verifiable third-party context: a conference, an industry journal, a public body
- Wait for independent editors to find it — NEVER edit the entry yourself to insert it
- Track AI mentions at 6, 9, 12 months on a fixed set of industry queries
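The last point, tracking on a fixed query set, can be sketched as a simple log plus a counter. This assumes you paste each AI answer into the log yourself at the 6, 9 and 12 month checkpoints; the `Answer` record, the brand name and the matching rule (whole-phrase, case-insensitive) are hypothetical choices for illustration.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Answer:
    month: int  # months since the report was published
    query: str  # one query from the fixed set
    text: str   # the AI answer, pasted in by hand


def mentions_brand(text: str, brand_names: list[str]) -> bool:
    """Case-insensitive whole-phrase match on any of the brand's names."""
    return any(
        re.search(rf"(?<!\w){re.escape(name)}(?!\w)", text, flags=re.IGNORECASE)
        for name in brand_names
    )


def mention_counts(
    answers: list[Answer],
    brand_names: list[str],
    checkpoints: tuple[int, ...] = (6, 9, 12),
) -> dict[int, tuple[int, int]]:
    """Return {month: (answers mentioning the brand, answers logged)}."""
    counts = {month: [0, 0] for month in checkpoints}
    for answer in answers:
        if answer.month in counts:
            counts[answer.month][1] += 1
            if mentions_brand(answer.text, brand_names):
                counts[answer.month][0] += 1
    return {month: (hits, total) for month, (hits, total) in counts.items()}


# Hypothetical log entries.
log = [
    Answer(6, "night tours Matera", "Try Example Tours for night walks."),
    Answer(6, "Murgia hiking", "Visit the rock churches with a licensed guide."),
]
print(mention_counts(log, ["Example Tours"]))
```

Keeping the query set fixed between checkpoints is what makes the counts comparable over time.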
A note on who authors the report: this is the moment to nurture the authorship signal, a theme I explored in depth in “author entity recognition”. A report signed by an author with no digital footprint is less likely to be accepted as a Wikipedia source than one signed by a professional with a verifiable one.
This is a first step: the serious analysis of the Wikipedia editorial space for your industry, the assessment of notability policies and the production of citable material require professional tools and a body of primary research that goes beyond the setup you can do on your own in an afternoon.
Wikipedia as a multiplier of AI visibility
Let me go back to the starting point. Being visible in AI answers means being inside the knowledge graph that the models absorbed in training and consult in retrieval. For Italian-language content, Wikipedia is the single most heavily weighted node of that graph. It’s not magic and it’s not a single factor: it works in combination with E-E-A-T, content structure and recognizable entities. But if you have to allocate a digital PR resource over the medium term, producing citable material for Wikipedia has the best effort-to-durability ratio I know of.
In the next articles in this series I’ll explain how to build citations on vertical industry outlets, how to work on primary research reports that become reusable sources, and how to measure the AI impact of a digital PR campaign with controlled query tracking.