Entities and Knowledge Graph

When AI finds three different versions of your company, it stops recommending you

On Wikidata your headquarters is in Milan, on Google Business it's in Monza, on Crunchbase it's in Sesto San Giovanni: for the AI these aren't inaccuracies, they're contradictions that cast doubt on everything it knows about you. When it has to choose between you and a competitor whose data matches everywhere, the model tends to choose the competitor, and it does so silently. A check across five main sources, done once, puts you in the position of the one who gets chosen instead of the one who gets discarded.

An industrial buyer asks ChatGPT: “who manufactures low-voltage electrical panels in Lombardy with ISO certification for the automotive sector?” The model compares the candidate companies, and among them is TecnoImpianti Soluzioni Industriali. But on Wikidata the headquarters is in Milan, on Google Business Profile it’s in Monza, on Crunchbase it’s in Sesto San Giovanni. The founder on LinkedIn is “Marco Bianchi”, in the site’s Organization schema he’s “M. Bianchi Rossi”, in an industry interview he’s “Marco Bianchi-Rossi”.

Result: the model finds contradictions, and when in doubt it prefers to recommend a competitor whose data matches everywhere. This is knowledge graph reconciliation at work: the process by which AI models compare information about the same entity coming from different sources. When they find conflicts, confidence collapses.

In this article I explain why these discrepancies cost you visibility in AI answers and what to check in 20 minutes, then I walk you through a real case where we compared the answers before and after.

What an AI model sees when it looks up your company

In the world of academic research, the Knowledge Graph is defined as the backbone of structured knowledge representation. Lucas Jarnac and colleagues state it explicitly in their 2024 survey on managing uncertainty in KG construction.

According to Lucas Jarnac et al. (2024), the Knowledge Graph is the load-bearing architecture with which AI organizes what it knows about the world.

Your company, if it exists in AI systems, exists as a node within one or more of these graphs.

The operational consequence: if your node has inconsistent attributes from one source to another, the model has to choose who to believe. And when it has no clear criterion, it often chooses not to cite you.

Why conflicts lower the model’s confidence

This is where the second piece of the puzzle comes in. Xu and colleagues, in their survey on knowledge conflicts for LLMs, are very clear about what happens inside the model when it finds contradictory information.

“This inconsistency is primarily attributed to the suboptimal quality of datasets utilized during the pre-training phase.”

Xu et al., 2024

Inconsistency remains an unsolved problem even after fine-tuning. In practice, if three sources say three different things about your headquarters, the model doesn’t work out which one is true: it lowers its confidence in the entity.

Lucas Jarnac et al. detail the type of conflicts that do the most damage.

“On the other hand, Invalidity, Ambiguity, and Timeliness deltas lead to contradictory knowledge, where some parts of the knowledge are necessarily false.”

Lucas Jarnac et al., 2024

Three categories of problems: invalid data (wrong founder), ambiguous data (name with variants), outdated data (an old headquarters still circulating). In all three cases the system concludes that “some part of the knowledge is necessarily false”.

The model doesn’t know which part is false; it only knows that something doesn’t add up. And when in doubt, a B2B AI answer cites the entity with consistent data, not the conflicting one.

The connection with what I’ve already explained

In the earlier articles of the series I told you how AI models represent meaning with embeddings in vector spaces and how they build a brand’s authority with consistent signals. If you want a refresher, read how vector embeddings work and how to apply E-E-A-T for AI.

Knowledge Graph reconciliation sits upstream of everything. If the AI can’t decide who you are — one headquarters? two? three? — no downstream optimization on embeddings or content can make up for it.

The test you can run in 20 minutes

Here’s a concrete procedure, with tools that actually exist.

Step 1: Wikidata. Go to Wikidata, search for the exact name of your company. If an entry exists, open it and note: registered headquarters, sector, founder, founding date, official website. If no entry exists, note that — it will come in handy later.

Step 2: Your site’s Organization schema. Open Google’s Rich Results Test, paste your homepage URL, and look in the result for the “Organization” block. Note: legalName, address, founder, sameAs (the external links you declare).

Step 3: Google Business Profile. Go to Google Business Profile, open your listing. Note: address, primary category, legal name.

Step 4: cross-comparison. Put the three sets of data into a table. Look at each field. If a field doesn’t match exactly across the three sources, it’s a discrepancy to correct at the right source.

Binary decision threshold: either the critical data (headquarters, legal name, primary sector) match 100%, or you have a problem. There’s no middle ground.
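If you prefer to script the comparison in Step 4 instead of eyeballing a table, here is a minimal sketch in Python. The field names (`legal_name`, `headquarters`, `primary_category`) and the sample values are my own assumptions for illustration; you fill them in by hand from what you noted in Steps 1 to 3. The comparison is deliberately an exact string match with no normalization, because that is what the binary threshold above asks for. A missing field counts as a mismatch.

```python
# Hypothetical field names for the three critical attributes named in the article.
CRITICAL_FIELDS = ("legal_name", "headquarters", "primary_category")


def find_discrepancies(sources: dict[str, dict[str, str]]) -> dict[str, dict[str, str]]:
    """Return every critical field whose value is not identical across all sources.

    The result maps field name -> {source name: value found there}.
    A field that a source does not provide is treated as an empty string,
    so it shows up as a discrepancy too.
    """
    discrepancies = {}
    for field in CRITICAL_FIELDS:
        values = {name: record.get(field, "") for name, record in sources.items()}
        if len(set(values.values())) > 1:  # more than one distinct value = conflict
            discrepancies[field] = values
    return discrepancies


# Placeholder data, typed in manually from the three checks.
sources = {
    "wikidata": {
        "legal_name": "Example Components S.r.l.",
        "headquarters": "Treviso",
        "primary_category": "Metal components",
    },
    "site_schema": {
        "legal_name": "Example Components S.r.l.",
        "headquarters": "Padua",
        "primary_category": "Metal components",
    },
    "google_business": {
        "legal_name": "Example Components S.r.l.",
        "headquarters": "Treviso",
        "primary_category": "Metal components",
    },
}

if __name__ == "__main__":
    for field, values in find_discrepancies(sources).items():
        print(f"Discrepancy in {field}:")
        for source, value in values.items():
            print(f"  {source}: {value!r}")
```

An empty result means the critical data match 100%. Anything else is a field to correct at the source that holds the wrong value.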

The case I followed: metal components in Veneto

A manufacturing client in Veneto, in the metal components sector, had its headquarters registered in two contradictory versions: Treviso on the website and on Google Business Profile, Padua on Crunchbase and in some industry press articles. On Wikidata there was no entry. When I asked ChatGPT “where is [company name] headquartered”, the model sometimes answered Treviso, sometimes Padua, sometimes “I don’t have reliable information”. Same pattern on Claude and Perplexity.

We worked for three months on three fronts:

  • Crunchbase correction: opening a ticket with the company account, updating the listing, uploading the chamber of commerce registration as proof.
  • Outreach to industry publications: emails to the editorial teams of the three vertical portals that had published the wrong headquarters, with a request for correction and a link to the “about us” page.
  • Creating a Wikidata entry: an entry with verified headquarters, a link to the chamber of commerce registration as the source, and properties P17 (country), P131 (located in the administrative territorial entity) and P1448 (official name), all consistent.

After another three months, repeating the same queries, both ChatGPT and Claude answered Treviso with high confidence, with no more disclaimers about conflicting sources. Perplexity explicitly cited Wikidata and the official website as concordant sources.

The lesson is that AI models don’t “forgive” contradictions: they wait for the sources to align. As long as they don’t align, your company is treated as a low-reliability entity, and for a B2B buyer that means not being cited.

Honest limitations: it’s a single case, not a controlled study. The pattern, however, is consistent with the academic research cited above. A real analysis of your specific case requires professional tools and a more structured entity audit.

The mistakes I see most often

Four recurring patterns after dozens of cases.

Old headquarters never updated. The company relocated 3 years ago, Google Business is up to date, but the site’s schema still has the old address — often in a footer template that nobody touches.

Wobbly legal name. Automeccanica Brescia, Automeccanica Brescia S.r.l., Auto Meccanica Brescia. To a human it’s the same thing. To the KG they’re different candidate entities to reconcile.
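To make the point about candidate entities concrete, here is a small Python sketch. It is not how any specific knowledge graph reconciles names; the normalization rule and the list of legal suffixes are assumptions I chose for the example. It only shows that the three spellings are three distinct strings until something decides to treat them as one.

```python
import re
from collections import defaultdict

# Assumed, incomplete list of Italian legal-form suffixes (after punctuation is removed).
LEGAL_SUFFIXES = {"srl", "spa", "snc", "sas"}


def reconciliation_key(name: str) -> str:
    """Build a crude matching key: lowercase, no punctuation, no legal suffix, no spaces."""
    cleaned = re.sub(r"[^\w\s]", "", name.lower())  # "S.r.l." becomes "srl"
    tokens = [token for token in cleaned.split() if token not in LEGAL_SUFFIXES]
    return "".join(tokens)


def group_variants(names: list[str]) -> dict[str, list[str]]:
    """Group spellings that collapse to the same key."""
    groups = defaultdict(list)
    for name in names:
        groups[reconciliation_key(name)].append(name)
    return dict(groups)


variants = [
    "Automeccanica Brescia",
    "Automeccanica Brescia S.r.l.",
    "Auto Meccanica Brescia",
]

if __name__ == "__main__":
    print("Distinct raw strings:", len(set(variants)))
    for key, spellings in group_variants(variants).items():
        print(f"{key}: {spellings}")
```

A rule this crude would also merge companies that really are different, which is exactly why a cautious system may refuse to unify the variants on its own. Using one spelling everywhere removes the guesswork.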

Inconsistent category. On Google Business you’re “Industrial supplier”, on LinkedIn “Manufacturing”, on Crunchbase “Industrial Equipment”. The AI can’t figure out which sector to position you in.

Missing sameAs. In your Organization schema you don’t declare the links to LinkedIn, Wikidata, Crunchbase. The model has to guess that the entities are the same, and often chooses not to unify them.
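The sameAs check is easy to automate once you have copied the Organization block out of the Rich Results Test. The sketch below assumes you have that block as a Python dictionary. The three expected domains come from the profiles named above; the function name and the sample data are hypothetical.

```python
from urllib.parse import urlparse

# The three profiles the article suggests declaring in sameAs.
EXPECTED_PROFILES = ("linkedin.com", "wikidata.org", "crunchbase.com")


def missing_same_as(organization: dict) -> list[str]:
    """Return the expected profile domains that no sameAs URL points to."""
    declared = organization.get("sameAs", [])
    if isinstance(declared, str):  # schema.org allows a single URL as well as a list
        declared = [declared]
    hosts = {urlparse(url).netloc.lower().removeprefix("www.") for url in declared}
    return sorted(
        domain
        for domain in EXPECTED_PROFILES
        if not any(host == domain or host.endswith("." + domain) for host in hosts)
    )


# Placeholder schema that declares only the LinkedIn page.
organization = {
    "@type": "Organization",
    "name": "Example Components",
    "sameAs": ["https://www.linkedin.com/company/example-components"],
}

if __name__ == "__main__":
    print("Missing from sameAs:", missing_same_as(organization))
```

It needs Python 3.9 or later because of `str.removeprefix`. An empty list means all three profiles are declared.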

Why the history of the data matters

One last thing that comes from the research. When systems reconcile conflicts, tracking the origin and uncertainty of each piece of data also helps them handle later conflicts.

“If these different levels of uncertainty are used for reconciliation, they need to be preserved and represented in the KG as metadata in order to retain a history and could possibly be useful for resolving future conflicts.”

Lucas Jarnac et al., 2024

Translated: the authors recommend preserving the levels of uncertainty as metadata, so that they can help resolve future conflicts. A plausible consequence is that the first “wrong” piece of data that ends up in a public KG leaves a trace: even after you correct it, a system that keeps this history can remember that the source was inconsistent.

The consequence: correcting early costs less than correcting late. In the Veneto case, the three months of corrections were followed by another three months of waiting before the models registered the consistency. Six months in total to win back reliable answers.

What to do concretely

A minimal, quarterly work plan that you can delegate:

  • Open the Rich Results Test, verify the Organization schema. If it’s missing or incomplete, have legalName, address, founder, sameAs added, with the links to LinkedIn, Wikidata, Crunchbase.
  • Open Google Business Profile, verify that the address, primary category and name match exactly with the site’s schema.
  • Open Wikidata. If you don’t exist, consider creating the entry (with third-party, non-self-referential sources). If you exist, verify that the data is up to date.
  • Compare with the 3-5 competitors the AI cites when you run thematic queries in your sector: see who has consistent data and who doesn’t.
  • Repeat every 3 months. Data degrades: a new headquarters, a new manager, a change of company name all reopen discrepancies.
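If whoever you delegate the first item to asks what a complete Organization schema looks like, here is a sketch that generates one as JSON-LD with Python. Every value is a placeholder, including the Wikidata ID and the profile URLs. The properties (`legalName`, `address`, `founder`, `sameAs`) are the schema.org ones listed in the plan. Take it as a starting template under those assumptions, not as the only valid markup.

```python
import json

# All values are placeholders for illustration, not real company data.
organization = {
    "@context": "https://schema.org",
    "@type": "Organization",
    "name": "Example Components",
    "legalName": "Example Components S.r.l.",
    "url": "https://www.example.com",
    "address": {
        "@type": "PostalAddress",
        "streetAddress": "Via Esempio 1",
        "addressLocality": "Treviso",
        "postalCode": "31100",
        "addressCountry": "IT",
    },
    "founder": {"@type": "Person", "name": "Marco Bianchi"},
    "sameAs": [
        "https://www.linkedin.com/company/example-components",
        "https://www.wikidata.org/wiki/Q0000000",  # placeholder, not a real item
        "https://www.crunchbase.com/organization/example-components",
    ],
}


def to_jsonld_script(data: dict) -> str:
    """Wrap the dictionary in the script tag used to embed JSON-LD in a page."""
    body = json.dumps(data, indent=2, ensure_ascii=False)
    return f'<script type="application/ld+json">\n{body}\n</script>'


if __name__ == "__main__":
    print(to_jsonld_script(organization))
```

Whatever you put in these fields has to be the same string you use on Google Business Profile and Wikidata. Paste the output into the Rich Results Test before publishing it.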

This is an honest first step. It’s not a substitute for a complete entity audit, which requires specialized crawlers and professional tools. It’s the bare minimum to avoid excluding yourself from AI answers over a problem you can solve in an afternoon.

Why this concerns your visibility in AI answers

Knowledge Graph reconciliation isn’t a magic factor. Fixing this data isn’t enough to become ChatGPT’s top citation: you need content, authority, consistent signals. But it’s the foundation. If your company’s base data is inconsistent, all the downstream work — content, backlinks, advanced schema — has far less leverage.

Visibility in AI answers is built on clear identities. When your node is a unique, consistent, well-connected entity, the AI recognizes you and has reasons to cite you. When it’s a confused node, it avoids you.

In the next articles of the series we dig into how to build entity-to-entity relationships with your partners, how to handle disambiguation between same-name brands, and how to leverage Wikidata to consolidate identity. If you want to start from the basics, re-read how Named Entity Recognition works and how AI recognizes authors as entities.

The author

Roberto Serra

SEO consultant for over 15 years, founder of the Serra SEO Agency (RAANK). He helps multinationals and SMEs stay visible where search is moving: ChatGPT, Perplexity, Gemini and Google's AI Overviews.
