Entities and Knowledge Graph

Wikidata as semantic backbone: the entry that makes your brand exist for AI

Roberto Serra 25 June 2026·~7 min read

The brands that ChatGPT cites with confidence almost all share something: a completed entry on Wikidata, the large archive of structured knowledge that feeds Google, Bing and many AI systems. Brands that don't have one, or have an empty one, are noise the model prefers to ignore rather than risk a mistake. It isn't a complex technical matter: six properties filled in correctly are enough to create an entry that models can treat as a reliable source. It can be done in less than an hour.

The brands that ChatGPT cites with high confidence when answering a vertical question almost all have one thing in common: a Wikidata entry rich in attributes, with the key properties completed: website, headquarters, founding year, industry, people involved. The brands the AI never cites, on the other hand, often don’t have an entry at all. Or they have one but it’s empty, with three lines and no linked properties.

This is no coincidence. Wikidata is one of the structured sources that feed the Knowledge Graphs of Google, Bing and several modern AI systems. If your entry isn’t there, or is skeletal, you’re playing the AI visibility game without having filled in the registry form.

Let me explain what Wikidata really is for an AI model, why it sits upstream of almost everything I’ve described in this series, and how to create an entry that works.

What Wikidata is for an AI model

In research on the relationship between language models and knowledge graphs, Wikidata holds a specific position. The survey by Cedric Möller et al. (2021) on entity linking over Wikidata describes it as a continuously updated, community-maintained, multilingual knowledge graph. The work by Wu et al. (2023) on the integration between LLMs and knowledge graphs places it among the encyclopedic graphs most used as a source of external knowledge for AI systems.

Be careful, though: Wikidata is not Wikipedia. It isn’t a narrative encyclopedia, it’s a structured database. Every entity (you, your brand, a product, a person) has a unique identifier (the Q-number) and a series of properties linked to verifiable values: type of activity, website, headquarters, founder, founding year, industry, awards received, publications.
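To make that structure concrete, here is a minimal Python sketch of an entity as a Q-number plus a list of property statements. It is a simplified illustration, not Wikidata's actual JSON schema: the class names, the placeholder identifiers and the example URL are all assumptions.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Statement:
    property_id: str  # a Wikidata property code such as "P856"
    value: str        # a literal, or the Q-number of another entity
    references: List[str] = field(default_factory=list)  # source URLs


@dataclass
class Entity:
    qid: str          # the entity's Q-number
    label: str
    statements: List[Statement] = field(default_factory=list)

    def values_of(self, property_id: str) -> List[str]:
        """Return every value stated for one property."""
        return [s.value for s in self.statements if s.property_id == property_id]


# Hypothetical entity: the identifiers, label and URL are placeholders.
roaster = Entity(
    qid="Q-PLACEHOLDER-BRAND",
    label="Example Roastery",
    statements=[
        Statement("P856", "https://example.com"),
        Statement("P159", "Q-PLACEHOLDER-CITY"),
    ],
)

print(roaster.values_of("P856"))
```

The point of the sketch is the shape: a stable code for the entity, and values that are either literals or pointers to other entities.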

In practice, when an AI system has to answer a question about a brand, a company or a professional, one of the routes it can take is linking the name to a known entity in a graph like Wikidata. If the entity isn’t there, the model relies only on the scattered mentions it read during training: less reliable, less citable, less likely to appear in answers.
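As a toy illustration of that linking step, the sketch below resolves a mention against a hand-built alias index. Real entity linkers are far more sophisticated; the index, the brand names and the placeholder identifiers here are hypothetical, and the function only shows why a missing or ambiguous name leaves the system without an anchor.

```python
from typing import Dict, List, Optional


def link_mention(mention: str, alias_index: Dict[str, List[str]]) -> Optional[str]:
    """Return the single entity ID matching a mention, or None.

    None covers both cases the system cannot resolve safely:
    the name is unknown, or it matches more than one entity.
    """
    candidates = alias_index.get(mention.strip().lower(), [])
    if len(candidates) == 1:
        return candidates[0]
    return None


# Hypothetical alias index: lowercase alias -> candidate entity IDs.
alias_index = {
    "example roastery": ["Q-PLACEHOLDER-1"],
    "example": ["Q-PLACEHOLDER-1", "Q-PLACEHOLDER-2"],  # ambiguous alias
}

print(link_mention("Example Roastery", alias_index))  # resolved
print(link_mention("Example", alias_index))           # ambiguous
print(link_mention("Unknown Brand", alias_index))     # not in the graph
```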

Why Wikidata sits upstream of everything else

In the previous articles in this series I explained how AI represents concepts in the vector space of embeddings and how it recognizes an author as an entity through author entity recognition. Wikidata is the layer beneath.

It’s the register where your brand stops being “a string of text that shows up here and there” and becomes an entity with a stable identity. With a code, with verifiable properties, with external links to authoritative sites, with multilingual translations.

The reason why all the work you do downstream (schema markup, E-E-A-T, well-structured content) pays off more when you have a completed Wikidata entry is simple: you give the AI system a disambiguated anchor point. “This brand here” instead of “maybe this brand, maybe another one with a similar name”.

The conclusion for you is straightforward: if your entity is well represented in that kind of graph, you’re usable material for the answer. If it isn’t, you’re noise the model prefers not to cite in order not to get it wrong.


The test you can run in ten minutes

Go to wikidata.org and search the exact name of your brand. Three possible outcomes:

No result: you don’t exist in the graph. Zero semantic anchoring for the AI.
Result with a bare entry: you’re there, but you only have a name and one or two properties. Little to cite.
Result with a rich entry: you have P31 (instance of), P856 (official website), P159 (headquarters location), P112 (founded by), P571 (inception, i.e. founding date), P452 (industry) filled in. You’re citable.

A simple pass/fail threshold to start from: if you search for your brand and find nothing, or find an entry with fewer than 5 properties filled in, you’re below the minimum. This isn’t the full analysis, which requires professional tools and checking your aliases across languages, but it’s an honest first step.
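The same check can be scripted. The sketch below is a minimal version, assuming the public Wikidata API endpoint and its wbsearchentities and wbgetentities actions; the function names, the User-Agent string and the choice to trust the first search hit are my assumptions, and the first hit may well be a namesake, so verify the Q-number by hand.

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://www.wikidata.org/w/api.php"
# The six properties named in the article.
CORE_PROPERTIES = {"P31", "P856", "P159", "P112", "P571", "P452"}
MIN_PROPERTIES = 5  # the article's minimum threshold


def _api_get(params):
    """Call the Wikidata API and return the decoded JSON response."""
    query = urllib.parse.urlencode({**params, "format": "json"})
    request = urllib.request.Request(
        f"{API_URL}?{query}",
        headers={"User-Agent": "wikidata-entry-check/0.1 (example script)"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


def fetch_claims(brand_name, language="en"):
    """Return the claims of the first search hit, or None if nothing matches."""
    hits = _api_get({
        "action": "wbsearchentities",
        "search": brand_name,
        "language": language,
        "type": "item",
    }).get("search", [])
    if not hits:
        return None
    qid = hits[0]["id"]
    data = _api_get({"action": "wbgetentities", "ids": qid, "props": "claims"})
    return data["entities"][qid].get("claims", {})


def classify_entry(claims):
    """Map claims to the three outcomes: 'missing', 'bare' or 'rich'."""
    if claims is None:
        return "missing"
    filled = {pid for pid, statements in claims.items() if statements}
    return "rich" if CORE_PROPERTIES <= filled else "bare"


def below_minimum(claims):
    """True when the entry is absent or has fewer than MIN_PROPERTIES filled."""
    if claims is None:
        return True
    return sum(1 for statements in claims.values() if statements) < MIN_PROPERTIES
```

A typical call would be `classify_entry(fetch_claims("Your Brand Name"))`. Note that the count in `below_minimum` includes every property on the entry, external identifiers included, which is one possible reading of the threshold.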

Then run the counter-check: open ChatGPT or Perplexity and ask “what do you know about [your brand name]?”. If the answer is generic, confused or makes things up, that’s consistent with a missing or weak entry. If it accurately names founding year, headquarters and industry, there’s a good anchor upstream.

Pro tip

Fill in at least these six properties: P31 (specific type, not generic), P856 (official website), P159 (headquarters), P112 (founder), P571 (founding date), P452 (vertical industry).

A case I followed: a coffee roaster in Parma

An artisan coffee roaster in Parma who works with me was in a situation typical of Emilia’s specialty food scene: a well-made site, recognized product quality, presence in coffee industry magazines, but zero visibility in AI answers to queries like “artisan coffee roasters Emilia-Romagna” or “specialty coffee micro-roasters Italy”. Perplexity always cited the same three or four names; he never showed up.

Check on Wikidata: entry missing. No Q-number. On Google’s Knowledge Graph, same thing, no panel.

The intervention was precise: creating the Wikidata entry with P31 (enterprise), P856 (website), P159 (headquarters in Parma), P112 (founder), P571 (founding year), P452 (industry: coffee roasting), plus external links to reviews from the specialized press and to articles in food magazines. No magic, just careful compilation and verifiable external references.

After about five months, on a sample of 15 queries we test monthly on ChatGPT and Perplexity, the brand went from zero citations to appearing in 6 answers out of 15. An indicative test, not a study: small sample, no control group, and in the meantime we also worked on other fronts (content structure, schema on the site). But the pattern is consistent with what I see with other food specialty clients: the Wikidata entry isn’t enough on its own, it’s not a magic switch, but it moves the needle noticeably when downstream you have a well-built site.

The mistakes I see most often

Among small food producers in Emilia, and not only there, these are the recurring patterns:

Entry created and abandoned. An entry with three properties and then no further updates. Wikidata is a living graph: if it doesn’t grow, it ages badly.
Wrong or generic P31. Putting “business” instead of a specific type such as “coffee roaster”, “winery” or “artisan pasta factory”. You lose the vertical classification, which is exactly what triggers the citation on industry queries.
No authoritative external link. An entry without references to third-party sources (press, trade associations, industry databases) is fragile. It can even be flagged for deletion by the community.
Self-referential description. Phrases like “the best artisan coffee roaster in northern Italy” get removed. Wikidata wants neutral facts: “specialty coffee roaster founded in [year] in [city]”.

Compare your entry (or its absence) with the 3-5 competitors the AI cites most often in your industry queries: often the difference is exactly here.
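The four mistakes above can be turned into a rough self-audit. This is a heuristic sketch, not a Wikidata rule set: the input format, the list of generic types, the promotional word list and the three-property cutoff are all assumptions chosen for illustration.

```python
import re

# Assumed examples of types too generic to classify a brand vertically.
GENERIC_TYPES = {"business", "company", "organization"}
# Assumed examples of wording that reads as promotion rather than fact.
PROMOTIONAL_WORDS = {"best", "leading", "finest", "top", "premier"}


def audit_entry(entry):
    """Return a list of issue codes for a simplified entry summary.

    `entry` is a hypothetical dict with the keys: property_count,
    p31_labels, reference_count, description.
    """
    issues = []
    if entry["property_count"] <= 3:
        issues.append("abandoned")
    types = {label.lower() for label in entry["p31_labels"]}
    if not types or types <= GENERIC_TYPES:
        issues.append("generic_p31")
    if entry["reference_count"] == 0:
        issues.append("no_references")
    words = set(re.findall(r"[a-z]+", entry["description"].lower()))
    if words & PROMOTIONAL_WORDS:
        issues.append("promotional_description")
    return issues


weak_entry = {
    "property_count": 3,
    "p31_labels": ["business"],
    "reference_count": 0,
    "description": "the best artisan coffee roaster in northern Italy",
}
solid_entry = {
    "property_count": 8,
    "p31_labels": ["coffee roaster"],
    "reference_count": 3,
    "description": "specialty coffee roaster founded in [year] in [city]",
}

print(audit_entry(weak_entry))
print(audit_entry(solid_entry))
```

Running the same audit on the competitors' entries gives a like-for-like comparison.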

What to do concretely

Search for your brand on wikidata.org. If it doesn’t exist, create the entry.
Fill in at least these six properties: P31 (specific type, not generic), P856 (official website), P159 (headquarters), P112 (founder), P571 (founding date), P452 (vertical industry).
Add external identifiers where you have them: any trade registries, authoritative profiles.
Insert at least 2-3 references to third-party sources (industry press, associations, recognized catalogs).
After 2-3 months, retest the industry queries on ChatGPT and Perplexity and compare with the baseline.
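For the retest in the last step, a small tracker is enough. The sketch below assumes you record by hand, for each query, whether the brand was cited; the function names and the placeholder query labels are hypothetical, and the example counts mirror the roaster case (0 of 15 at baseline, 6 of 15 at retest).

```python
def citation_rate(results):
    """Share of queries in which the brand was cited.

    `results` maps each query string to True (cited) or False.
    """
    if not results:
        return 0.0
    return sum(results.values()) / len(results)


def newly_cited(baseline, retest):
    """Queries where the brand is cited now but was not at baseline."""
    return sorted(q for q, cited in retest.items() if cited and not baseline.get(q, False))


# Placeholder query labels; in practice use your real industry queries.
queries = [f"query {i:02d}" for i in range(1, 16)]
baseline = {q: False for q in queries}
retest = {q: (i < 6) for i, q in enumerate(queries)}  # first 6 cited

print(f"baseline: {citation_rate(baseline):.0%}")
print(f"retest:   {citation_rate(retest):.0%}")
print(newly_cited(baseline, retest))
```

Keep the query set fixed between runs, otherwise the two rates are not comparable. With a sample this small the result is indicative only.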

The real analysis, with complete alias mapping, monitoring of missing properties compared to competitors and integration with schema markup on the site, requires professional tools and a more expert hand. This is the first step to start from.


The author

Roberto Serra at the Senate of the Republic

Senate of the Republic · Palazzo Giustiniani Conference “The power of artificial intelligence”

Roberto Serra

SEO consultant for over 15 years, founder of the Serra SEO Agency (RAANK). He helps multinationals and SMEs stay visible where search is moving: ChatGPT, Perplexity, Gemini and Google's AI Overviews.
