When someone asks Alexa, Google Home or Siri for advice in your industry, the assistant reads one answer out loud and names exactly one brand. If that brand isn’t yours, you haven’t slipped down a list: you’re simply absent. Voice search works differently from text search and requires specific optimizations, and those who make them today build a position that latecomers will find hard to win back.
You’re measuring your SEO with Search Console but you don’t know how often ChatGPT mentions you. Measuring AI visibility requires dedicated tools that differ from traditional SEO, and when you move to voice — Alexa, Google Home, Siri — the problem doubles: not only do you not know whether they cite you, but when they do, you’re the only voice that’s heard.
Let me explain it with an example that stuck with me. A Baroque boutique hotel in Noto, right in the heart of the Val di Noto, asked me to analyze its AI visibility. On Google it had a decent ranking for “boutique hotel Noto,” good organic traffic, reviews above 9. When I asked Google Assistant “find a charming hotel in the Val di Noto,” the assistant read out the name of a competitor in Modica. A single answer. No list. No second place. For that query, the Noto hotel basically doesn’t exist.
In previous articles in this series I talked about the differences between ChatGPT, Perplexity, Gemini and Bing Copilot. Voice is a chapter of its own, because it changes the rules of the game: from “you appear among the top 3-5 sources” to “it’s you or it’s nobody.”
What changes when the query goes voice
On text-based AI engines, the answer is a synthesis that cites 3-8 sources. You can be source number 4 and still take home traffic, authority and — above all, for the thread I keep throughout the whole series — visibility in AI answers as a cited brand.
On voice assistants this logic breaks down. Alexa, Google Home and Siri speak a single answer. Not a list. Not “here are the top three results”: one sentence, two at most, and that’s it. The reason is simple and human: nobody enjoys listening to 5 results read out loud. The device is designed to give the answer, not to offer a choice.
This leads to a consequence that changes the strategy: the voice query is winner takes all. If you’re the voice that speaks, you hold a monopoly on the query. If you’re not, you’re zero. There’s no runner-up, there’s no “at least mentioned”: there’s the answer, and there’s silence.
It’s not a magical or revolutionary mechanism, and this is an important point: voice won’t replace text, it simply rewards those who structure their content well far more steeply. When you optimize for text-based AI you can be “one of” and survive. On voice you’re either “the,” or you disappear.
Why voice sits downstream of all the AI work
Voice isn’t an island. It draws from the same indexes and signals that feed the text-based assistants: knowledge graph, recognized entities, schema markup, author authority.
If you haven’t sorted out the basics, meaning your entry in Google’s Knowledge Graph, the recognition of your brand as an entity through named entity recognition (NER), and E-E-A-T applied to AI, Google Assistant won’t have structured material to read and will fall back on whoever did that work. That’s why optimizing for voice without first sorting out entities and authority is like putting up the roof without the foundations.
On voice there’s also one more factor at play: brevity. An ideal voice answer fits in 1-2 sentences, under 30 seconds of reading aloud. If your most relevant page for an assistant query has only long, convoluted paragraphs, the assistant won’t find readable material and will go elsewhere. The same inverted-pyramid logic applies: blunt answer at the start, details afterward.
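To make the 30-second limit concrete, here is a minimal sketch that estimates how long a passage takes to read aloud. The speaking rate of 150 words per minute is my own assumption, not a figure from any assistant vendor, and the function names and the sample museum hours are purely illustrative.

```python
# Assumed average speaking rate in words per minute (an illustrative
# figure, not a value published by any assistant vendor).
ASSUMED_WORDS_PER_MINUTE = 150


def estimated_spoken_seconds(text: str, words_per_minute: int = ASSUMED_WORDS_PER_MINUTE) -> float:
    """Return a rough read-aloud duration for `text`, in seconds."""
    word_count = len(text.split())
    return word_count * 60 / words_per_minute


def fits_voice_answer(text: str, max_seconds: float = 30.0) -> bool:
    """True when the text can plausibly be read aloud within `max_seconds`."""
    return estimated_spoken_seconds(text) <= max_seconds


# Hypothetical answer placed at the very top of a page.
answer = (
    "The museum is open Tuesday to Sunday, from 9 am to 7 pm. "
    "It is closed on Mondays."
)
print(round(estimated_spoken_seconds(answer), 1), fits_voice_answer(answer))
```

Run it on the opening lines of each key page: if the first block of text already blows past the limit, the blunt answer is probably buried too deep.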
The test you can run in 15 minutes
You don’t need a technical setup to figure out whether you have a voice problem. You need a smartphone and 15 minutes.
Take 8-10 queries that a customer would realistically ask out loud in your industry. For the boutique hotel in Noto they were things like “find me a charming hotel in the Val di Noto,” “best boutique hotel in central Noto,” “where to stay near the cathedral in Noto.” For another business they’ll be different ones, but the principle is the same: think about how the customer speaks, not how they type on Google.
Then:
- Open Google Assistant on your smartphone and say the query out loud. Note down the first thing it says.
- Repeat on Siri.
- Repeat on Alexa, if you have a device.
- For each query, note: did it give a brand name? Was it yours? Was it a competitor? Or did it just read “here’s what I found on the web”?
Rough thresholds: if across 10 queries in your industry your brand is never spoken by name, you have a voice visibility problem to tackle. If it’s spoken for 1-2 queries, you’re above zero but below the relevance threshold. From 3-4 queries up, you’re dominating.
It’s an entry-level check — real analysis requires professional tools and a much larger sample — but it gives you an immediate feel for the problem.
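If you prefer to keep your notes in a file rather than on paper, the tally can be sketched in a few lines. This is only an illustration: the `VoiceResult` structure, the function names and the brand names are invented, and counting a query as a hit when at least one assistant speaks your brand is my reading of the thresholds, not a fixed rule.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class VoiceResult:
    query: str
    assistant: str               # e.g. "Google Assistant", "Siri", "Alexa"
    spoken_brand: Optional[str]  # None when no brand name was spoken


def queries_naming_brand(results: Iterable[VoiceResult], brand: str) -> int:
    """Count distinct queries where at least one assistant spoke `brand`."""
    target = brand.casefold()
    hits = {
        r.query
        for r in results
        if r.spoken_brand is not None and target in r.spoken_brand.casefold()
    }
    return len(hits)


def classify(hits: int) -> str:
    """Map the hit count (out of about 10 queries) to the three bands."""
    if hits == 0:
        return "voice visibility problem"
    if hits <= 2:
        return "above zero, below the relevance threshold"
    return "dominating"


# Placeholder notes: "Hotel Example" and "Competitor Inn" are invented names.
notes = [
    VoiceResult("charming hotel in the Val di Noto", "Google Assistant", "Competitor Inn"),
    VoiceResult("charming hotel in the Val di Noto", "Siri", None),
    VoiceResult("where to stay near the cathedral in Noto", "Google Assistant", "Hotel Example"),
]
hits = queries_naming_brand(notes, "Hotel Example")
print(hits, classify(hits))
```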
The tool test I ran on AI rank trackers
To measure voice visibility systematically you need dedicated tools: classic SEO with Search Console and traditional ranking checkers sees nothing of what happens inside an assistant. Over the last 18 months several rank trackers built for AI answers have emerged, some with coverage of voice queries too.
I tested 5 of them in parallel, across 10 Italian brands in different sectors (tourism, food, manufacturing, professional firms, e-commerce). I ran the same 30 queries on each tool and compared the results with what the physical device told me — Google Home and the Assistant app — by saying the queries out loud myself.
The accuracy pattern I observed:
- The tools that claim to track ChatGPT, Perplexity and Gemini have high accuracy on text: in 8 cases out of 10 the brand cited by the tool matched what I saw in the interface.
- On real voice, meaning the physical device, accuracy collapsed. Across the 30 queries, on average 4-5 of the 5 tools named a “winning” brand different from the one the voice assistant actually said.
- None of the 5 tools genuinely simulated the voice pipeline: they extrapolated from the text APIs and guessed what the assistant would read.
An indicative test, not a scientific study: 10 brands, 30 queries, 5 tools, a week of work. The pattern, though, is fairly clear: today AI rank trackers are good for text, weak for voice. For voice you still need the human test on the device.
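The comparison itself is simple to reproduce. Below is a minimal sketch of how the agreement between one tool and the physical device could be computed. The function name, the query keys and the brand labels are placeholders, not data from my test.

```python
from typing import Dict


def agreement_rate(tool_answers: Dict[str, str], device_answers: Dict[str, str]) -> float:
    """Share of common queries where the tool's brand equals the brand spoken by the device."""
    shared = tool_answers.keys() & device_answers.keys()
    if not shared:
        raise ValueError("no queries in common")
    matches = sum(
        1
        for q in shared
        if tool_answers[q].strip().casefold() == device_answers[q].strip().casefold()
    )
    return matches / len(shared)


# Placeholder data: query ids and brands are invented for illustration.
device = {"q1": "Brand A", "q2": "Brand B", "q3": "Brand C", "q4": "Brand A"}
tool = {"q1": "Brand A", "q2": "Brand C", "q3": "brand c ", "q4": "Brand D"}
print(agreement_rate(tool, device))
```

The brand comparison ignores case and surrounding spaces, since a tool export and handwritten notes rarely spell a name identically.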
The mistakes I see most often on voice queries
Answers buried at the bottom of the page. The assistant reads the first lines of the relevant content. If the answer to the question “what are the museum’s hours” is in the third paragraph after the history of the building, you lose.
Long, baroque sentences. A historic pastry shop in Catania had magnificent descriptions of its products, 40-word sentences. The assistant wasn’t reading them: too long for the voice format. Rewritten in short sentences (12-15 words), it started showing up.
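A quick way to spot sentences like these is to count words per sentence. The sketch below is a rough check, assuming that sentences end with a period, question mark or exclamation mark and that 15 words is the upper limit. The function name and the sample text are invented for illustration.

```python
import re
from typing import List, Tuple


def long_sentences(text: str, max_words: int = 15) -> List[Tuple[int, str]]:
    """Return (word_count, sentence) pairs for sentences longer than `max_words`."""
    # Naive split on ., ! or ? followed by whitespace; good enough for a first pass.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    flagged = []
    for sentence in sentences:
        count = len(sentence.split())
        if count > max_words:
            flagged.append((count, sentence))
    return flagged


# Invented sample copy, not the pastry shop's real text.
sample = (
    "Our cannoli are filled to order. "
    "Every morning we prepare the ricotta cream with sheep's milk from a small farm "
    "near the slopes of Etna and fold it by hand."
)
for count, sentence in long_sentences(sample):
    print(count, sentence)
```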
Schema markup absent or only partial. Without structured markers — hours, address, business type — the assistant doesn’t know what to pull. Google’s Rich Results Test tells you in 30 seconds whether your homepage has the Organization and LocalBusiness schema. If you don’t have them, you’re invisible to the local voice query.
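As a reference for what those structured markers look like, here is a minimal sketch that builds a schema.org JSON-LD block with hours, address and business type. The helper name and every value passed to it are placeholders. The output is meant to be pasted into a `<script type="application/ld+json">` tag and then checked with the Rich Results Test, and a real listing would likely need more properties than these.

```python
import json
from typing import List


def local_business_jsonld(
    schema_type: str,
    name: str,
    url: str,
    telephone: str,
    street: str,
    locality: str,
    postal_code: str,
    country: str,
    opening_hours: List[str],
) -> str:
    """Build a JSON-LD string for a schema.org LocalBusiness subtype."""
    data = {
        "@context": "https://schema.org",
        "@type": schema_type,  # e.g. "LodgingBusiness" or "ProfessionalService"
        "name": name,
        "url": url,
        "telephone": telephone,
        "address": {
            "@type": "PostalAddress",
            "streetAddress": street,
            "addressLocality": locality,
            "postalCode": postal_code,
            "addressCountry": country,
        },
        "openingHours": opening_hours,
    }
    return json.dumps(data, indent=2, ensure_ascii=False)


# All values below are placeholders for illustration only.
snippet = local_business_jsonld(
    schema_type="LodgingBusiness",
    name="Hotel Example",
    url="https://www.example.com",
    telephone="+39 000 0000000",
    street="Via Esempio 1",
    locality="Noto",
    postal_code="00000",
    country="IT",
    opening_hours=["Mo-Su 08:00-20:00"],
)
print(snippet)
```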
Google Business Profile left half-done. For local queries (“hotel in Noto,” “restaurant open now”) the assistant draws heavily from Google Business Profile. If the listing doesn’t have a precise category, a structured description, FAQs, you start out with two broken legs.
What to do concretely
- Rewrite the “key info” section of your main pages in 1-2 line sentences, readable in 15-20 seconds out loud.
- Add an FAQ section with questions in the form a customer would ask them out loud (“where is it located,” “what are the hours,” “how much does it cost,” “how do I book”).
- Check and complete your schema markup with the Rich Results Test. For a hotel you need Organization + LodgingBusiness; for a professional firm, Organization + ProfessionalService; for an e-commerce store, Organization + Product.
- Sort out the Google Business Profile: primary + secondary category, description, FAQs, recent photos, updated hours.
- Compare yourself with the 3-5 competitors the assistant cites in your industry when you run the queries out loud: what do they have that you don’t?
Where does the thread lead?
Voice is the extreme of AI logic: a single answer, zero compromises. Working to be that answer means working well on everything else — entities, authority, schema, brevity — because voice draws from there. Continuing to measure your positioning only on Google Search means ignoring the part where visibility in AI answers becomes the most vertical and the most definitive.
In the articles of this series dedicated to platforms, I’ll cover how AI assistants work inside browsers, the differences between Bing Copilot and ChatGPT Search, and how to integrate AI rank tracking tools into the monthly workflow.