The AEO agency market is young, crowded, and full of positioning that outpaces capability. Everyone has added AI search to their services page. Very few have built the internal systems to deliver it well, and from the outside the two groups look nearly identical.

The web is full of “top AEO agencies” rankings that claim to resolve this for you. Most are published by an agency that happens to sit at #1, which is a format worth distrusting on sight, including when the publisher is us. What works better is a scorecard you fill in yourself.

Score each agency you’re considering from 1 to 5 on the five criteria below. The descriptions tell you what the extremes look like, and each criterion comes with the questions that reveal the real score. By the end you’ll have a number, but more importantly you’ll have watched how each agency answers under specific questioning, which tells you most of what the number can’t.


Criterion 1: AI-Nativeness

Was the agency built to operate with AI agents, or did it add AI to an existing workflow? AI-native agencies tend to produce more consistent work at greater scale and lower cost, because the system does the producing and the humans do the governing. The AI-native vs. traditional marketing agency comparison covers this distinction in depth.

What a 1 looks like. “We use ChatGPT to help with drafts.” AI as a writing assistant inside an unchanged process.

What a 5 looks like. Agentic systems run the recurring work end to end, with human review at defined gates. The agency can show you the system, not just describe it.

Questions that reveal the score. Walk me through what your system does between our meetings. What runs without a human touching it, and where do humans review? Can you show it operating?


Criterion 2: Strategy Depth

Real AEO strategy goes beyond keyword targeting. It includes entity optimization, structured data, prompt landscape mapping, identity consistency, and presence across the platforms that feed AI answers. The answer engine optimization complete guide explains what a complete program covers.

What a 1 looks like. An SEO keyword plan with “AI” added to the deck. Strategy starts and ends with content topics.

What a 5 looks like. Strategy starts with who the company is, then maps the prompt landscape, then builds entity, technical, and content layers that all tell the same story. The agency can explain why each layer matters to a retrieval system.

Questions that reveal the score. What do you need to understand about us before producing anything? How do you decide which prompts and questions matter for our category? What does your technical work include beyond content?


Criterion 3: Measurement Capability

Can the agency measure AI visibility, share of voice, and sentiment in AI-generated responses? Without this you’re flying blind, and so are they.

What a 1 looks like. Google Analytics and rank tracking, relabeled. AI visibility reported as anecdotes.

What a 5 looks like. Purpose-built AI monitoring with named tools, a dashboard you can see, and reporting on impressions, share of voice, and sentiment across platforms, tracked continuously.

Questions that reveal the score. Which monitoring tools do you use and can we see a live dashboard? How would you baseline our AI visibility before the engagement starts? What do you do when an AI answer cites us inaccurately?


Criterion 4: Content Quality

AI search rewards depth, accuracy, and authority, and the platforms are getting steadily better at discounting volume plays. Quality enforced at the system level beats quality enforced by a heroic editor.

What a 1 looks like. Volume commitments. “Forty optimized posts per month” with no named reviewer and no consistent voice.

What a 5 looks like. Content that reads like the client wrote it on their best day. A defined identity governs voice, briefs govern structure, and a human signs off on what ships. When you ask to read three recent pieces, you enjoy them.

Questions that reveal the score. Who reviews content before it publishes, by name? How do you keep fifty pieces sounding like one company? Can we read recent work you’ve shipped for clients in our category?


Criterion 5: Fit for Your Stage

The best agency in the market is the wrong agency at the wrong stage. An enterprise shop’s process overhead will crush a Series A team, and a lean startup-focused shop will frustrate an enterprise that needs governance.

What a 1 looks like. Minimums, timelines, and stakeholder processes designed for a different company than yours.

What a 5 looks like. Pricing, pace, and communication built for companies at your stage, with references from companies at your stage to prove it.

Questions that reveal the score. What does day one look like? Who is actively on our account? Can you show results from a company our size? What would make us a bad fit for you?


The Scorecard

| Criterion | Weight | Agency A | Agency B | Agency C |
| --- | --- | --- | --- | --- |
| AI-Nativeness | x2 if AEO is your primary channel | | | |
| Strategy Depth | x2 if entering a crowded category | | | |
| Measurement | Always x2. No measurement, no program. | | | |
| Content Quality | x1 | | | |
| Stage Fit | x1, but treat below 3 as disqualifying | | | |

A few interpretation notes. Any agency scoring 1 or 2 on measurement is selling AEO without the ability to prove it works; that’s disqualifying regardless of total. A high total with a low stage fit means a good agency for someone else. And if every answer in your calls was smooth but nothing was shown live, subtract a point everywhere. Demos are evidence; decks are claims.
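If you would rather tally the scorecard in a script than by hand, the weights and interpretation notes can be sketched roughly as follows. This is an illustration, not a prescribed tool: the names (`Scorecard`, `Result`, `weights`, `evaluate`) are hypothetical, and it makes three assumptions the scorecard leaves open. A conditional x2 weight falls back to x1 when its condition does not apply, the "nothing shown live" penalty cannot push a score below 1, and the disqualification checks run on the scores after that penalty.

```python
from dataclasses import dataclass, field

CRITERIA = ("ai_nativeness", "strategy_depth", "measurement",
            "content_quality", "stage_fit")


@dataclass
class Scorecard:
    agency: str
    scores: dict             # criterion name -> score from 1 to 5
    shown_live: bool = True  # False if every answer was smooth but nothing was demoed


@dataclass
class Result:
    agency: str
    total: int
    disqualifiers: list = field(default_factory=list)


def weights(aeo_primary: bool, crowded_category: bool) -> dict:
    # Assumption: a conditional "x2" weight is x1 when its condition is false.
    return {
        "ai_nativeness": 2 if aeo_primary else 1,
        "strategy_depth": 2 if crowded_category else 1,
        "measurement": 2,  # always x2
        "content_quality": 1,
        "stage_fit": 1,
    }


def evaluate(card: Scorecard, aeo_primary: bool, crowded_category: bool) -> Result:
    for name in CRITERIA:
        if not 1 <= card.scores[name] <= 5:
            raise ValueError(f"{name} must be between 1 and 5")

    # "Subtract a point everywhere" when nothing was shown live.
    # Assumption: scores are floored at 1 after the penalty.
    penalty = 0 if card.shown_live else 1
    adjusted = {name: max(1, card.scores[name] - penalty) for name in CRITERIA}

    w = weights(aeo_primary, crowded_category)
    total = sum(adjusted[name] * w[name] for name in CRITERIA)

    # Disqualifiers apply regardless of the total.
    reasons = []
    if adjusted["measurement"] <= 2:
        reasons.append("measurement")
    if adjusted["stage_fit"] < 3:
        reasons.append("stage_fit")
    return Result(card.agency, total, reasons)


if __name__ == "__main__":
    card = Scorecard("Agency A", {
        "ai_nativeness": 4, "strategy_depth": 4, "measurement": 5,
        "content_quality": 3, "stage_fit": 4,
    })
    print(evaluate(card, aeo_primary=True, crowded_category=False))
```

Keeping the disqualifiers separate from the total is deliberate: a high total should never hide a failing measurement or stage-fit score.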


Where We Land

Soulcraft is an AEO agency, so we have a horse in this race and you can weight this section accordingly. We built the agency to score well on exactly these criteria: agentic systems doing the recurring work, identity-first strategy through our soul.md process, continuous monitoring of AI visibility and sentiment with tools we’ll happily show you live, and a deliberate focus on Series A through C companies at $2,500 to $10,000 per month.

We’d rather be scored than take your trust on credit. Bring us this scorecard and we’ll answer every question on it, live, including the one about what would make you a bad fit.