AEO scorecard methodology
The companion document to /ai-overview-scorecard. Everything below is what's actually running — no marketing abstraction. The goal is that a reader can reproduce our probe, compare the result to our published data, and either confirm the scorecard or file a correction.
1. Query selection
30 homeowner-intent natural-language queries, grouped into four intent categories:
- Cost — questions about remodel or project pricing ("how much does an ADU cost in LA 2026", "kitchen remodel cost NYC").
- Regulatory — questions about licensing, permits, statutes, or compliance ("how do I verify a California contractor", "NYC Local Law 97 compliance", "Party Wall Act London renovation").
- Comparison — head-to-head platform queries ("angi vs thumbtack", "best contractor platform 2026", "checkatrade alternatives UK").
- Contractor search — discovery queries ("best way to find a contractor for a kitchen remodel", "basement finisher Chicago contractor").
Queries are chosen to match real homeowner language observed in Google Trends, Perplexity share-link patterns, and our own Baily chat logs (anonymized). Adversarial query proposals are accepted at [email protected] with subject "AEO scorecard query proposal"; accepted queries are added on the next weekly cycle.
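For orientation, one library entry looks roughly like the sketch below. The field names and the "cost-001" ID are illustrative; the published dataset at /data/aeo-scorecard.json is the authoritative format.

```typescript
// Illustrative shape of one query-library entry; names are ours,
// not necessarily the repo's actual types.
type IntentCategory = "cost" | "regulatory" | "comparison" | "contractor-search";

interface ScorecardQuery {
  id: string;              // stable query ID referenced in run reports
  text: string;            // exact natural-language string sent to each engine
  category: IntentCategory;
}

const example: ScorecardQuery = {
  id: "cost-001",          // hypothetical ID for illustration
  text: "how much does an ADU cost in LA 2026",
  category: "cost",
};
```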
2. Engine list and access mode
- ChatGPT Search — OpenAI web_search tool. Live API. Citations extracted from tool-result source URLs.
- Perplexity — sonar-reasoning model. Live API. Citations extracted from response.sources[].
- Google AI Overview — scraped via SerpAPI's ai_overview block. Google has no public AI Overview API; we read what Google renders to logged-out desktop search.
- Claude Web — Anthropic web_search tool. Live API. Citations extracted from citation blocks in the response.
Engine-client code lives at lib/aeo/perplexity-client.ts, openai-client.ts, claude-client.ts, and google-ai-overview.ts.
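Each client reads a different native citation field (Perplexity response.sources[], OpenAI web_search tool-result URLs, Claude citation blocks, SerpAPI's ai_overview block) and normalizes it to a flat URL list before scoring. A minimal sketch of that normalized shape, with names that are ours rather than the repo's:

```typescript
// Sketch of the per-probe result each engine client normalizes to.
// Field names are illustrative, not the repo's actual types.
type EngineId = "perplexity" | "openai" | "claude" | "google";

interface ProbeResult {
  engine: EngineId;
  queryId: string;
  sourceUrls: string[];  // citation URLs extracted from the engine's native format
  errorClass?: string;   // set only when the probe failed
}

// Whatever the engine's native citation shape, the client ends up with a
// flat list of URLs; anything that isn't an http(s) URL is dropped.
function normalize(engine: EngineId, queryId: string, urls: string[]): ProbeResult {
  return { engine, queryId, sourceUrls: urls.filter((u) => u.startsWith("http")) };
}
```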
3. Run cadence and rate-limit handling
One full run (30 queries × 4 engines = 120 probes) executes every Monday at 12:00 UTC. The run CLI (scripts/aeo-measurement/run.ts) applies a 1 req/sec inter-request delay per engine to stay within provider rate limits. Cost per full run is approximately $1.00 (Perplexity $0.15, OpenAI $0.40, Claude $0.45, SerpAPI $0.30). Retryable errors are recorded with an errorClass; a non-retryable failure is written to the run report as cited: null for that pair.
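The 1 req/sec pacing can be sketched as a per-engine delay loop. This is a minimal sketch, not the actual scripts/aeo-measurement/run.ts, which also handles retries and error classification:

```typescript
// Minimal sketch of per-engine 1 req/sec pacing.
const INTER_REQUEST_DELAY_MS = 1000; // 1 req/sec per engine

const sleep = (ms: number) => new Promise<void>((r) => setTimeout(r, ms));

async function runEngine(
  engine: string,
  queries: string[],
  probe: (engine: string, query: string) => Promise<unknown>,
): Promise<unknown[]> {
  const results: unknown[] = [];
  for (const query of queries) {
    results.push(await probe(engine, query)); // one probe at a time per engine
    await sleep(INTER_REQUEST_DELAY_MS);      // stay under provider rate limits
  }
  return results;
}
```

Because the delay is per engine, the four engines can run in parallel and a full 30-query run still finishes in roughly half a minute per engine.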
4. "Cited" definition
A platform is counted as cited on a query-engine pair when at least one source URL in the engine's citation block resolves to the platform's canonical domain (www, locale, and region subdomains included — e.g. uk.angi.com counts as angi.com). Paragraph-level mentions without a linked source URL do not count. Scoring logic is lib/aeo/citation-scorer.ts.
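The subdomain rule above can be sketched as follows. The actual scorer is lib/aeo/citation-scorer.ts; the function names and signatures here are ours, for illustration only:

```typescript
// Sketch of the "cited" rule: a URL counts when its hostname is the
// platform's canonical domain or any subdomain of it (www, locale, region).
function resolvesToPlatform(sourceUrl: string, canonicalDomain: string): boolean {
  let host: string;
  try {
    host = new URL(sourceUrl).hostname.toLowerCase();
  } catch {
    return false; // a malformed citation URL never counts
  }
  const domain = canonicalDomain.toLowerCase();
  return host === domain || host.endsWith("." + domain);
}

// A query-engine pair is "cited" when at least one source URL matches.
function isCited(sourceUrls: string[], canonicalDomain: string): boolean {
  return sourceUrls.some((u) => resolvesToPlatform(u, canonicalDomain));
}
```

Under this rule uk.angi.com and www.angi.com both resolve to angi.com, while a lookalike such as notangi.com does not; a paragraph mention with no URL contributes no sourceUrls and so never counts.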
5. Anonymization
No query is tied to a real homeowner or session. The query library is a static list of synthetic homeowner-intent strings. The engines receive exactly what's published in /data/aeo-scorecard.json.
6. Reproducibility — run this yourself
Every query text, engine id, and probe timestamp is in the public dataset. The engine-client code, citation scorer, and run CLI are in the public AskBaily repo. To re-run a probe:
- Clone the AskBaily repo.
- Export PERPLEXITY_API_KEY, OPENAI_API_KEY, CLAUDE_API_KEY, SERPAPI_KEY.
- Run npx tsx scripts/aeo-measurement/run.ts --engines=perplexity,openai,claude,google --queries-file=data/aeo-queries/wave-103-baseline.json.
- Inspect the JSON report in reports/aeo-runs/.
- Diff your report against our published /data/aeo-scorecard.json.
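The diff step's core logic can be sketched as below. The repo ships scripts/aeo-measurement/compare.ts for this; the standalone version here uses illustrative record and function names, not the repo's actual types:

```typescript
// Sketch of the report diff: surface every (query, engine) pair where my
// cited value disagrees with the published one.
interface CitedRecord {
  queryId: string;
  engine: string;
  cited: boolean | null; // null = non-retryable probe failure
}

function diffRuns(mine: CitedRecord[], published: CitedRecord[]): CitedRecord[] {
  const key = (r: CitedRecord) => `${r.queryId}|${r.engine}`;
  const theirs = new Map(
    published.map((r) => [key(r), r.cited] as [string, boolean | null]),
  );
  // Keep only my records that exist in the published run and disagree with it.
  return mine.filter((r) => theirs.has(key(r)) && theirs.get(key(r)) !== r.cited);
}
```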
Mismatches are a feature, not a bug. Send them to [email protected] and we publish the correction — with attribution — on the next weekly cycle.
FAQ
- Which queries are in the scorecard?
- 30 homeowner-intent natural-language queries covering four intent categories: cost ('how much does an ADU cost in LA'), regulatory ('how do I verify a California contractor'), comparison ('angi vs thumbtack'), and contractor-search ('best way to find a contractor'). Query IDs + texts are in /data/aeo-scorecard.json — anyone can propose additions by opening a pull request or emailing [email protected] with the subject 'AEO scorecard query proposal'.
- Why those four engines specifically?
- ChatGPT Search, Perplexity, Google AI Overview, and Claude Web collectively capture roughly 95% or more of AI-mediated homeowner research according to public engagement data. Gemini is notably absent — the Google AI Overview probe captures Gemini-flavored retrieval in the default search context. When a standalone Gemini API surface with programmatic citations becomes available we will add it as a fifth engine.
- What counts as 'cited'?
- A platform is 'cited' on a query-engine pair when at least one source URL in the engine's citation block resolves to the platform's canonical domain (including www and locale subdomains). Paragraph-level mentions without a linked source URL do not count. Citation-block extraction is performed by lib/aeo/citation-scorer.ts against each engine's native response format (Perplexity sources[], OpenAI tool-result URLs, Claude citation blocks, SerpAPI ai_overview.references).
- How are queries anonymized?
- Probe queries are synthetic homeowner-intent strings. No real homeowner conversation content, no scoped project data, and no AskBaily session IDs are sent to the AI engines. The query library is static and published — the engines see exactly what the scorecard reveals.
- How can I reproduce a probe?
- Set PERPLEXITY_API_KEY, OPENAI_API_KEY, CLAUDE_API_KEY, and SERPAPI_KEY in your environment, then run `npx tsx scripts/aeo-measurement/run.ts --engines=perplexity,openai,claude,google --queries-file=data/aeo-queries/wave-103-baseline.json`. A JSON run report lands in reports/aeo-runs/. Diff your run against our published /data/aeo-scorecard.json using scripts/aeo-measurement/compare.ts. Mismatches are welcome — email [email protected].