How to Measure Your Brand's Visibility in AI Search

Ask a marketing leader how their brand ranks on Google and they can show you a dashboard in seconds. Ask whether ChatGPT recommends them when a buyer asks for options in their category, and most go quiet. That gap is the most important blind spot in marketing right now. Buyers are moving their first question from a search box to an answer engine, and the systems most teams use to measure visibility cannot see inside those answers at all.

This guide is the measurement method. Not tactics for getting cited, which come later, but the prior step almost everyone skips: knowing where you actually stand before you try to move it.

Why AI visibility is hard to measure

Traditional search gives you a fixed results page. Ten blue links, a defined position, a click you can count. An answer engine can give different responses to nearly identical questions, shifts its answer with phrasing, and often cites sources the searcher never clicks. There is no single ranking to track. Visibility becomes a distribution of outcomes across many prompts and several engines, not a single number.

That is why the honest first move is to define what you are measuring before you measure it. A vague goal like “show up in AI” cannot be tracked. A specific one like “appear as a recommended option when someone asks ChatGPT for tools in our category” can.

The four metrics that matter

You do not need twenty metrics. You need four, tracked consistently.

Presence rate. Across a fixed set of buyer questions, how often does your brand appear in the answer at all? This is the foundation. If presence is near zero, nothing else matters yet.

Share of voice. When your brand and your competitors are both eligible to appear, how often do you show up relative to them? This is the competitive read, and it is the one executives understand fastest, because it maps to the question their own CEO is asking: why does our competitor show up when we don’t?

Citation and source attribution. When the engine does mention you, is it citing your own pages, or is it describing you through a third party like a review site or a competitor’s comparison page? Being known through someone else’s content is fragile. Being cited from your own is durable.

Sentiment and accuracy. When a model describes you, does it get you right? An answer that mentions you but misstates what you do can be worse than silence. Accuracy is a visibility metric, not a branding afterthought.
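To make the four definitions concrete, here is a minimal Python sketch of how they might be computed from hand-scored answers. Everything in it is an assumption for illustration: the ScoredAnswer record, the four_metrics function, and the brand names are hypothetical. Share of voice is taken here as your mentions divided by all tracked-brand mentions, which is one common convention rather than the only one, and sentiment is left to a human reviewer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScoredAnswer:
    """One engine answer after a reviewer has scored it (hypothetical schema)."""
    brands_mentioned: frozenset  # every tracked brand named in the answer
    own_page_cited: bool         # did the answer cite one of your own pages?
    accurate: bool               # reviewer judgment of the description


def four_metrics(answers, brand):
    """Roll scored answers up into the four metrics for one brand."""
    total = len(answers)
    ours = [a for a in answers if brand in a.brands_mentioned]
    all_mentions = sum(len(a.brands_mentioned) for a in answers)
    return {
        # Share of answers that mention the brand at all.
        "presence_rate": len(ours) / total if total else 0.0,
        # Assumed definition: our mentions over all tracked-brand mentions.
        "share_of_voice": len(ours) / all_mentions if all_mentions else 0.0,
        # Of the answers that mention us, how many cite our own pages.
        "own_citation_rate": sum(a.own_page_cited for a in ours) / len(ours) if ours else 0.0,
        # Of the answers that mention us, how many describe us correctly.
        "accuracy_rate": sum(a.accurate for a in ours) / len(ours) if ours else 0.0,
    }


# Illustrative, made-up scores for four answers; "Acme" is the brand being tracked.
sample = [
    ScoredAnswer(frozenset({"Acme", "Rival"}), own_page_cited=True, accurate=True),
    ScoredAnswer(frozenset({"Rival"}), own_page_cited=False, accurate=False),
    ScoredAnswer(frozenset({"Acme"}), own_page_cited=False, accurate=True),
    ScoredAnswer(frozenset(), own_page_cited=False, accurate=False),
]

metrics = four_metrics(sample, "Acme")
# presence_rate 0.5, share_of_voice 0.5, own_citation_rate 0.5, accuracy_rate 1.0
```

Citation and accuracy are measured only over answers that mention the brand, so a low presence rate does not drag them down artificially.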

How to build a baseline

A baseline is just these four metrics, captured the same way on a regular cadence. Here is the minimum viable version.

Start with a prompt set. Write twenty to forty real buyer questions in your category, in the natural language a person would actually type. Include category questions (“best tools for X”), comparison questions (“X versus Y”), and problem questions (“how do I solve Z”). This set is your instrument, so freeze it. If you change the questions every month, you cannot compare months.
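One way to keep the instrument frozen is to store the prompt set as data and record a fingerprint of it with every run. The sketch below is only an illustration: the ids, the type labels, and the helper names fingerprint and same_instrument are assumptions, and the three prompts are the placeholder examples from the paragraph above, not a real set.

```python
import hashlib
import json

# Hypothetical frozen prompt set; a real one would hold twenty to forty questions.
PROMPT_SET = (
    {"id": "cat-01", "type": "category", "text": "best tools for X"},
    {"id": "cmp-01", "type": "comparison", "text": "X versus Y"},
    {"id": "prb-01", "type": "problem", "text": "how do I solve Z"},
)


def fingerprint(prompts):
    """Return a short, stable hash of the prompt set."""
    canonical = json.dumps(list(prompts), sort_keys=True, ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]


# Store this alongside the first baseline run.
BASELINE_FINGERPRINT = fingerprint(PROMPT_SET)


def same_instrument(prompts, expected=BASELINE_FINGERPRINT):
    """True if the prompts are unchanged since the baseline was taken."""
    return fingerprint(prompts) == expected
```

If a later run reports a different fingerprint, the questions changed and that month should not be compared directly with earlier ones.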

Pick your engines. At minimum, run the set through ChatGPT, Google AI Overviews, Perplexity, and Gemini, because they behave differently and your buyers use different ones. Record the full answer and any cited sources for each.

Score consistently. For each answer, mark whether you appeared, which competitors appeared, whether your own pages were cited, and whether the description was accurate. Roll those up into the four metrics. That snapshot is your baseline.
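Part of the scoring step can be given a mechanical first pass. The following Python sketch assumes you have already saved the answer text and its cited URLs; the score_answer function, the brand names, and the example domains are all hypothetical. It uses plain name matching, which will miss nicknames and misspellings, and it deliberately leaves accuracy blank for a human to judge.

```python
import re
from urllib.parse import urlparse


def score_answer(answer_text, cited_urls, brand, competitors, own_domain):
    """First-pass scoring of one saved answer (a sketch, not a full scorer)."""

    def named(name):
        # Case-insensitive match that is not part of a longer word,
        # so "Acme" does not match "Acmeville".
        pattern = rf"(?<!\w){re.escape(name)}(?!\w)"
        return re.search(pattern, answer_text, re.IGNORECASE) is not None

    def on_own_domain(url):
        # Accept the bare domain and any subdomain such as "www.".
        host = (urlparse(url).hostname or "").lower()
        return host == own_domain or host.endswith("." + own_domain)

    return {
        "appeared": named(brand),
        "competitors": sorted(c for c in competitors if named(c)),
        "own_page_cited": any(on_own_domain(u) for u in cited_urls),
        "accurate": None,  # left empty on purpose: a reviewer fills this in
    }


# Illustrative input only; the answer text and URLs are made up.
row = score_answer(
    answer_text="Popular options include Acme and Rival Co.",
    cited_urls=["https://www.acme.example/pricing", "https://reviews.example/top-tools"],
    brand="Acme",
    competitors=["Rival Co", "Other Inc"],
    own_domain="acme.example",
)
```

Treat the output as a draft score. A reviewer should still read each answer, both to set the accuracy field and to catch mentions the name match missed.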

Then repeat on a fixed cadence (monthly is enough for most brands) and watch the trend. The first run tells you where you stand. The trend tells you whether anything you are doing works.
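Watching the trend can be as simple as subtracting one snapshot from the next. This is a small Python sketch under the assumption that each run is stored as a dictionary of metric names to rates; the trend function and the two monthly snapshots are hypothetical, and the numbers are made up purely to show the arithmetic.

```python
def trend(previous, current):
    """Change in each metric shared by two snapshots, in rate points."""
    return {
        name: round(current[name] - previous[name], 4)
        for name in previous
        if name in current
    }


# Made-up snapshots from two consecutive monthly runs.
march = {"presence_rate": 0.25, "share_of_voice": 0.10}
april = {"presence_rate": 0.40, "share_of_voice": 0.15}

change = trend(march, april)
# change["presence_rate"] is 0.15 and change["share_of_voice"] is 0.05
```

A single month-to-month change can be noise, since the engines vary their answers between runs, so it is safer to read the direction over several runs than to react to one delta.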

Doing it by hand versus tooling

You can run a small baseline by hand, and doing it once manually is worth it because you see exactly what the models say about you. It is uncomfortable and clarifying.

At scale, hand-scoring forty prompts across four engines means reading 160 answers every month, and that is where monitoring tools earn their place. The category is young, and the right tool depends on which engines you care about and whether you need historical tracking. The thing to insist on, whether you build or buy, is that the method stays consistent over time. A measurement you cannot compare across months is not a measurement.

What good looks like

Good is not “we appear everywhere.” Good is a presence rate that climbs on your priority questions, a share of voice that gains on named competitors, citations that increasingly point to your own pages rather than third parties, and descriptions that are accurate. When those four move in the right direction, that is real progress, and it fits on one slide for the board.

The point of measuring first

You cannot improve what you cannot see, and you cannot prove what you did not measure before you started. A baseline does two jobs at once. It tells you where to focus, and it becomes the evidence that your work moved the number. In a field crowded with confident claims and thin proof, the team that measures first is the team that can actually show results later.

If you want help building that baseline for your brand, that is where we start every engagement. We measure before we promise.