iSeer
Field notes · methodology

How we measure AI recommendation share.

Three principles shape every number we publish. We run the same prompt multiple times to reduce variance from temperature-sampled models. We bound proportions with Wilson confidence intervals because small samples mislead under the normal approximation. And we test week-over-week changes with McNemar paired tests so within-prompt movement, not population drift, drives the alerts. We don't promise optimization tricks; we report what AI says, with the same statistical rigor a clinical trial would demand.

Recommendation share

What the number means.

Recommendation Share (RS) is the fraction of eligible prompts in which a model recommended your brand by name.

RS = recommended / eligible

A prompt is eligible when it falls inside your category and the model returns a substantive answer. Recommended means the model named your brand as one of its top suggestions, not merely mentioned it in passing.
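As a minimal sketch of the definition above, the snippet below counts eligible prompts and computes RS from them. The `PromptResult` record and its field names are hypothetical, chosen only for illustration; how a prompt is actually judged "in category", "substantive", or "recommended" is outside this sketch.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class PromptResult:
    # Hypothetical record; field names are assumptions for illustration.
    in_category: bool   # prompt falls inside the brand's category
    substantive: bool   # model returned a substantive answer
    recommended: bool   # brand named among the model's top suggestions


def recommendation_share(results: Sequence[PromptResult]) -> Optional[float]:
    """Return recommended / eligible, or None when nothing is eligible."""
    eligible = [r for r in results if r.in_category and r.substantive]
    if not eligible:
        return None  # RS is undefined with an empty denominator
    recommended = sum(r.recommended for r in eligible)
    return recommended / len(eligible)


results = [
    PromptResult(True, True, True),
    PromptResult(True, True, False),
    PromptResult(True, False, False),  # no substantive answer: not eligible
    PromptResult(False, True, True),   # off-category: not eligible
    PromptResult(True, True, True),
    PromptResult(True, True, False),
]
rs = recommendation_share(results)  # 2 recommended of 4 eligible -> 0.5
```

Note that a passing mention does not set `recommended`; only a top-suggestion placement does, per the definition above.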

Repeated sampling

Why one run isn't enough.

k = 3–5 runs per prompt per model per window

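Language models are temperature-sampled, so a single run is a single draw from a distribution of possible answers. Averaging k runs shrinks the run-to-run variance (by roughly a factor of k when the runs are independent) that would otherwise masquerade as a real change, and keeps our denominator honest when we compute RS.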
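To make the variance argument concrete, here is a small sketch that averages k binary outcomes for one prompt and shows how the spread of that average narrows as k grows. It assumes runs are independent draws with a fixed recommendation probability p, which is a simplification; the function names are illustrative, not part of any published API.

```python
import math
from statistics import mean
from typing import Sequence


def per_prompt_rate(runs: Sequence[int]) -> float:
    """Average k binary outcomes (1 = recommended, 0 = not) for one prompt."""
    if not runs:
        raise ValueError("need at least one run")
    return mean(runs)


def std_of_average(p: float, k: int) -> float:
    """Standard deviation of the k-run average, assuming independent runs
    that each recommend the brand with probability p."""
    return math.sqrt(p * (1 - p) / k)


runs = [1, 0, 1, 1, 0]        # hypothetical k = 5 runs of the same prompt
rate = per_prompt_rate(runs)  # 3 of 5 runs recommended the brand

# Spread of the average for a prompt with p = 0.5 at different k.
spread = {k: std_of_average(0.5, k) for k in (1, 3, 5)}
```

Under these assumptions a single run has a standard deviation of 0.5, while five runs bring it down to about 0.22, which is why one run on its own can look like a swing that is not there.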

Confidence intervals

Wilson, not normal.

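We bound RS with the Wilson score interval, which behaves correctly near 0 and 1 and on small samples. In the formulas below, p̂ is the observed RS, n is the RS denominator, and z is the standard normal quantile for the chosen confidence level (1.96 for 95%).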

lower = (p̂ + z²/(2n) − z·√((p̂(1−p̂) + z²/(4n))/n)) / (1 + z²/n)
upper = (p̂ + z²/(2n) + z·√((p̂(1−p̂) + z²/(4n))/n)) / (1 + z²/n)

A 95% CI means that if we repeated this procedure on many fresh samples, about 95% of the resulting intervals would contain the true rate. It is not a 95% probability that the rate lies in this one interval; frequentist CIs do not carry that interpretation.
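The interval formulas above can be sketched in a few lines of Python. This is an illustrative implementation, not necessarily the production one: the function name, the default z = 1.96, and the clamping of the bounds to [0, 1] (to absorb floating-point rounding at the edges) are assumptions.

```python
import math
from typing import Tuple


def wilson_interval(successes: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a proportion successes / n."""
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0 <= successes <= n:
        raise ValueError("successes must be between 0 and n")
    p_hat = successes / n
    z2 = z * z
    denom = 1 + z2 / n
    centre = p_hat + z2 / (2 * n)
    half_width = z * math.sqrt((p_hat * (1 - p_hat) + z2 / (4 * n)) / n)
    lower = (centre - half_width) / denom
    upper = (centre + half_width) / denom
    # Clamp to [0, 1]; rounding can push the bounds a hair outside.
    return max(0.0, lower), min(1.0, upper)


# Hypothetical example: 12 recommendations out of 40 eligible prompts.
low, high = wilson_interval(12, 40)  # roughly (0.181, 0.454)
```

With an observed RS of 0.30 on 40 prompts, the interval spans roughly 0.18 to 0.45, which shows how wide the uncertainty stays on small samples. At 0 of n the lower bound sits at 0 while the upper bound stays above 0, unlike the normal approximation, which collapses to a zero-width interval.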

Significance testing

Real change vs. noise.

We test week-over-week change with McNemar's paired test using the Edwards continuity correction.

χ² = (|b − c| − 1)² / (b + c)
p-value = 1 − erf(√(χ²/2))

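b is the count of prompts that recommended you last week but not this week. c is the reverse. Prompts whose outcome did not change do not enter the test. The p-value is the upper tail of a chi-square distribution with one degree of freedom. Alerts fire only when p < 0.05.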
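A short sketch of the test as stated above, using only the standard library. The function names are illustrative, and returning p = 1.0 when no prompt changed (b + c = 0) is an assumption made here to avoid dividing by zero; the text does not say how that case is handled.

```python
import math
from typing import Tuple


def mcnemar_edwards(b: int, c: int) -> Tuple[float, float]:
    """McNemar's paired test with the Edwards continuity correction.

    b: prompts recommended last week but not this week
    c: prompts recommended this week but not last week
    Returns (chi2, p_value).
    """
    if b < 0 or c < 0:
        raise ValueError("counts must be non-negative")
    discordant = b + c
    if discordant == 0:
        return 0.0, 1.0  # assumption: no changed prompts means no evidence
    chi2 = (abs(b - c) - 1) ** 2 / discordant
    # erfc(x) equals 1 - erf(x) but keeps precision for small p-values.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


def should_alert(b: int, c: int, alpha: float = 0.05) -> bool:
    """Fire an alert only when the paired change is significant."""
    _, p_value = mcnemar_edwards(b, c)
    return p_value < alpha


# Hypothetical week: 4 prompts lost the recommendation, 15 gained it.
chi2, p = mcnemar_edwards(4, 15)  # chi2 = 100 / 19, p is a little over 0.02
```

In this made-up week the p-value falls below 0.05, so an alert would fire; with 4 lost and 6 gained it would not.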
