You're among the first to comment on news in your industry, yet ChatGPT always cites the same names — not yours. It's not a matter of speed: you can get there first, but if you do it on the wrong channel, the model doesn't even see you. Every brilliant comment published in the wrong place is visibility handed to the competitors who have figured out where AI fishes for its sources. Commenting the right way, in the right windows, can get you into generative answers in real time.
News breaks in your industry. You have the perfect comment, you publish it fast. 48 hours later, ChatGPT cites others. Why your speed isn’t enough — and what actually matters.
This is the scenario that entrepreneurs and communication managers at Italian SMEs often describe to me. The instinctive reaction is to blame the website, the SEO, the press office. In reality the problem is subtler: the RAG systems that power ChatGPT, Perplexity, Gemini and Claude don’t treat all fresh sources the same way. They select. And they select with a specific bias you can exploit, if you understand how it works.
Let me explain what I found by reverse-engineering the trending topics of a specific sector, and how a commercial shipyard in Taranto can build a newsjacking system that actually ends up in live AI answers, not just in the LinkedIn feed.
What newsjacking means for an AI engine
When you think of classic newsjacking, you think of a quick comment on breaking news that earns you a mention in an industry publication. For AI the mechanism is similar, but with one extra layer: the RAG system crawls authoritative outlets in real time, ingests the content, and when a user asks a related question in the following 24-72 hours, it draws on those sources to build the answer.
Nothing new so far. The counterintuitive point is something else. In the research on LLM citation bias, Andres Algaba and colleagues documented a systematic distortion:
“Most prominently, we observe a significant citation bias in the existing generated references, which have a median citation count of 1,326 higher than ground truth references”
Translated: when an LLM chooses which sources to cite, it systematically tends to prefer those already heavily cited elsewhere. The median citation count of the sources the model picks is 1,326 higher than that of “objectively relevant” sources.
The operational consequence for anyone doing newsjacking is heavy. The winner isn’t whoever is fastest in absolute terms. The winner is whoever manages to get cited on sources the AI already recognizes as authoritative. A brilliant comment on a niche technical blog, published 10 minutes after the news, weighs less than a standard comment picked up by Il Sole 24 Ore three hours later.
The “Matthew effect” applied to AI answers
Andres Algaba et al. reference a phenomenon well known in bibliometrics:
“Moreover, a well-documented phenomenon is the “Matthew effect,” where highly cited papers tend to accumulate even more citations”.
In plain terms: those who already have citations accumulate even more. And the bias is even more pronounced in LLMs than in humans.
It follows that your newsjacking only makes sense if it goes through a channel the AI already considers “heavyweight.” It’s the same principle I explained when I wrote about the backlink as a citation proxy and the weight of implicit citations: the system doesn’t judge your content in isolation, it judges it through the network of those who already cite you.
For a commercial shipyard in Taranto that wants to surface in AI answers when the topic is decarbonization of maritime transport, ship refitting or dual-fuel vessels, this changes the priority. You don’t need a perfect blog. You need to be the voice that Lloyd’s List Italia, ShipMag, Informare or Il Secolo XIX call when the news breaks.
The reverse-engineering test you can run in 20 minutes
This is the test I always run before defining a client’s newsjacking strategy. It’s not science, it’s an entry-level check: real analysis requires professional media monitoring tools. But it gives you a map in 20 to 30 minutes.
Take Perplexity and run three searches on the trending topics in your industry over the past two weeks. For commercial shipbuilding in Taranto they might be:
- “Mediterranean freight fleet modernization 2026”
- “dual-fuel LNG ammonia vessels refitting costs”
- “Italian shipyard orders decarbonization”
For each query, don’t look at the text of the answer. Look at the sources cited on the side. For each source, note:
- Outlet: is it general-interest, industry-specific, institutional, a blog?
- Type of content: short news item, analysis, interview, press release picked up?
- Presence of external voices: does the piece cite experts, shipyard CEOs, analysts? With full name?
If you do this work on 3 queries × 5-8 sources per query, you get 15-24 sources. The pattern emerges immediately: the sources Perplexity uses most often are those that cite a human voice with name, role and company. Not anonymous press releases.
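The note-taking above can be kept in a small script rather than a spreadsheet. What follows is a minimal sketch, not a monitoring tool: the `CitedSource` record, its field names and the `summarize` helper are assumptions introduced here for illustration, and the sample rows are placeholders rather than real audit results.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class CitedSource:
    """One source listed next to a Perplexity answer, noted by hand."""
    query: str
    outlet: str
    outlet_type: str    # "general-interest", "industry", "institutional" or "blog"
    content_type: str   # "news", "analysis", "interview" or "press release"
    named_quote: bool   # True if an expert is quoted with full name, role and company


def summarize(sources):
    """Count sources per outlet and content type, plus the share with a named quote."""
    total = len(sources)
    named = sum(1 for s in sources if s.named_quote)
    return {
        "total": total,
        "by_outlet_type": dict(Counter(s.outlet_type for s in sources)),
        "by_content_type": dict(Counter(s.content_type for s in sources)),
        "named_quote_share": named / total if total else 0.0,
    }


# Hypothetical rows for illustration only, not real audit data.
sources = [
    CitedSource("dual-fuel LNG ammonia vessels refitting costs", "Outlet A", "industry", "interview", True),
    CitedSource("dual-fuel LNG ammonia vessels refitting costs", "Outlet B", "institutional", "press release", False),
    CitedSource("Italian shipyard orders decarbonization", "Outlet C", "industry", "analysis", True),
    CitedSource("Italian shipyard orders decarbonization", "Outlet D", "blog", "news", True),
]

summary = summarize(sources)
print(summary["by_outlet_type"])
print(f"{summary['named_quote_share']:.0%} of sources carry a named quote")
```

With 15 or more real rows, the `named_quote_share` value is the number to watch: it tells you how often the cited pieces carry a quote attributed to a named person.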
The test I ran myself
I replicated this reverse engineering on three trending topics in Italian commercial shipbuilding during the first two weeks of April 2026. Small sample, indicative test, not an academic study: 3 queries on Perplexity, 22 sources cited in total.
Of these 22 sources:
- 14 were industry outlets (ShipMag, Informare, Port News, The Medi Telegraph) or authoritative general-interest ones (Il Sole 24 Ore, Il Secolo XIX)
- 5 were institutional sites (trade associations such as Confitarma and Assonave, and port authorities)
- 3 were corporate blogs of large players (Fincantieri, Palumbo Group)
- Across all three groups, 17 of the 22 sources (about 77%) contained at least one direct quote with first name + last name + company
The figure that interests me most is the last one. The sources the AI selects aren’t the longest, aren’t the most technical. They’re the ones where a journalist called an operator, asked two questions, got a sentence in quotation marks. And that sentence is often the fragment that ends up recombined into the AI answer.
For a Taranto shipyard specialized in commercial ship refitting this means one thing only: your visibility in AI answers depends on having a recognizable spokesperson who answers journalists’ calls within a few hours of the news.
The mistakes I see most often in “AI-oriented” newsjacking
The press-release comment. You send an impersonal text, with no voice, no name at the bottom. The journalist uses it as background, doesn’t cite you. The AI has no material to recombine.
Speed without relevance. A generically “maritime” piece of news breaks and you comment even if it’s outside your scope. Journalists file you away as noise. Next time news that’s truly yours breaks, they don’t call you.
The expert who doesn’t exist as an entity. Your technical director answers journalists, but has no author page on the site, doesn’t appear on Wikidata, has no curated LinkedIn profile. AI engines don’t recognize him as a recurring voice, and the attribution gets lost in the answers. If you want to dig into the mechanism, I covered it in the article on author entity recognition.
The data-free press release. “We are satisfied with the results” never gets recombined. A concrete number — “we delivered 7 ferry refittings in 2025, with an average 22% emissions reduction on the units treated” — does. The AI looks for factual strings, not adjectives.
What to do concretely this week
A newsjacking system designed to end up in AI answers requires three operational pieces, none of which is technological:
- A list of 8-12 trending topics in your industry that you monitor continuously, with Google Trends as a free baseline and a press review of selected industry outlets. For commercial shipbuilding that means: decarbonization, alternative fuels, orders from MIT (the Ministry of Infrastructure and Transport) and Defense, port geopolitics, maritime ESG.
- An identified technical spokesperson with a structured author page on the site, a well-maintained company Google Business Profile, presence on Wikidata if there’s a documentable history. It must be the same person cited for 12-18 consecutive months: the AI needs recurrence to recognize them as an entity.
- Pre-written comment templates for 3-4 recurring scenarios (new order in the sector, new regulation, incident/critical issue, technological innovation). Not finished templates, skeleton templates: in 30 minutes you adapt them to the day’s event and send them out. Speed matters, but only if the quality has already been prepared in advance.
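The skeleton templates from the last point can live in something as simple as a dictionary of strings. The sketch below is one possible shape, assuming Python's standard `string.Template`. Only the four scenario types come from the list above; the wording of each skeleton, the field names and the `draft_comment` helper are hypothetical. Because `substitute` raises `KeyError` when a field is missing, a comment cannot go out without a name, a role, a company and a concrete figure.

```python
from string import Template

# Skeleton wording is hypothetical; only the four scenario types come from the text.
SKELETONS = {
    "new_order": Template(
        '"$takeaway" said $spokesperson, $role at $company. Reference figure: $data_point.'
    ),
    "new_regulation": Template(
        '"$takeaway" noted $spokesperson, $role at $company. Impact in numbers: $data_point.'
    ),
    "incident": Template(
        '"$takeaway" commented $spokesperson, $role at $company. Context: $data_point.'
    ),
    "innovation": Template(
        '"$takeaway" explained $spokesperson, $role at $company. Benchmark: $data_point.'
    ),
}


def draft_comment(scenario, **fields):
    """Fill a skeleton; raises KeyError if the scenario or any field is missing."""
    return SKELETONS[scenario].substitute(fields)


# Placeholder spokesperson and company; the figure reuses the example given earlier.
draft = draft_comment(
    "new_regulation",
    takeaway="Refitting existing ferries is the fastest route to compliance.",
    spokesperson="Jane Doe",
    role="technical director",
    company="Example Shipyard",
    data_point="7 ferry refittings delivered in 2025, with an average 22% emissions reduction",
)
print(draft)
```

The draft is a starting point for the 30-minute adaptation, not a finished statement: the spokesperson still rewrites the takeaway in their own voice before it goes out.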
On this point, Abolfazl Ansari et al. (2025) documented how more and more outlets are integrating language models into their writing pipelines.
From this you can infer that your comment isn’t only speaking to the human journalist: it’s speaking to a hybrid human-machine system that will then be re-read by another system (the AI’s RAG) at answer time.
Both reward the same thing: quotable sentences, precise data, clear attribution.