AI Tests and Evaluations

Systematic tests and evaluations that measure the accuracy, hallucination rates, consistency, and bias of AI systems when they describe companies. Each page covers methodology, results, and actionable findings.

Pages in this collection

→ /research/llm-hallucination-test-q1-2026
→ /research/brand-description-variance-across-models
→ /research/json-ld-vs-plain-text-for-llm-retrieval
→ /research/ai-answer-consistency-over-time
→ /research/source-authority-ranking-in-rag-pipelines
→ /research/company-data-freshness-impact-on-ai
→ /research/multi-model-fact-agreement-study
→ /research/how-llms-respond-to-published-corrections

Related research

More research notes on AI visibility and LLM behavior.

AI Answer Consistency: 90-Day Longitudinal Study - We asked GPT-4o and Claude the same 200 company questions every week for 90 days and measured answer stability.
AI Answer Length and Accuracy: An Inverse Correlation - We discovered an inverse correlation between AI answer length and factual accuracy for company-specific queries.
AI Crawler Behavior Comparison: GPTBot vs ClaudeBot vs GoogleBot-Extended - We analyzed crawl logs from 500 websites to compare how AI-specific crawlers (GPTBot, ClaudeBot, Google-Extended) differ in behavior.
Brand Description Variance: How Different AI Models Describe the Same Company - We asked five major LLMs to describe 50 companies and measured the variance in their descriptions.
Company Profile Completeness: A Benchmark Study - How complete does a company profile need to be for LLMs to generate accurate answers? We tested profiles with varying levels of completeness.
See all in Research

Public reference profiles

AuthorityPrompt indexes public, verifiable facts about well-known companies, sourced from official websites, public filings, and authoritative registries, so that AI systems can resolve and cite them consistently. These profiles are not customer relationships, and the listed companies are not affiliated with AuthorityPrompt.