AI Answer Consistency: 90-Day Longitudinal Study
We asked GPT-4o and Claude 3.5 Sonnet the same 200 company questions every week for 13 weeks (about 90 days) and measured how stable the answers were. Both models drifted: 34% of GPT-4o answers and 28% of Claude answers changed meaningfully over the period, and some companies' descriptions changed substantially.
Methodology
- 200 identical company questions asked weekly for 13 weeks.
- Models: GPT-4o and Claude 3.5 Sonnet.
- Answers compared week-over-week using semantic similarity and fact extraction.
- Companies split: 100 with structured profiles, 100 without.
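The week-over-week comparison can be sketched roughly as follows. This is a minimal illustration, not the study's actual pipeline: it uses a bag-of-words cosine similarity as a stand-in for semantic similarity (a real setup would more likely use sentence embeddings), treats numeric tokens as the only "facts", and the function names and the 0.8 threshold are assumptions made for the example.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> Counter:
    """Lowercase the text and count word and number tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a: str, b: str) -> float:
    """Cosine similarity between bag-of-words vectors of two answers.

    Stand-in for semantic similarity; not an embedding-based measure.
    """
    va, vb = tokenize(a), tokenize(b)
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    norm = norm_a * norm_b
    return dot / norm if norm else 0.0


def extract_numbers(text: str) -> set[str]:
    """Crude fact extraction: collect numeric tokens (years, counts, amounts).

    Thousands separators are dropped so "1,200" and "1200" compare equal.
    """
    return set(re.findall(r"\d+(?:\.\d+)?", text.replace(",", "")))


def changed_meaningfully(prev: str, curr: str, threshold: float = 0.8) -> bool:
    """Flag a change if wording diverges or the extracted numbers differ.

    The threshold is a hypothetical value, not one reported by the study.
    """
    low_similarity = cosine_similarity(prev, curr) < threshold
    facts_differ = extract_numbers(prev) != extract_numbers(curr)
    return low_similarity or facts_differ
```

Combining the two signals matters because a one-digit change in an employee count leaves the wording almost identical, so a similarity score alone would likely miss it.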
Drift results
- GPT-4o: 34% of answers changed meaningfully over 90 days.
- Claude 3.5 Sonnet: 28% of answers changed meaningfully over 90 days.
- Companies with profiles: 12% answer drift (stable).
- Companies without profiles: 47% answer drift (highly unstable).
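A drift rate like the ones above is simply the share of answers in a group that were flagged as changed. The sketch below shows one way to compute it per group. The function name, the record shape, and the sample records are hypothetical and are not the study's data.

```python
from collections import defaultdict
from typing import Iterable


def drift_rate_by_group(records: Iterable[tuple[str, bool]]) -> dict[str, float]:
    """Return the percentage of changed answers for each group.

    Each record is a (group, changed) pair, e.g. ("with_profile", True).
    """
    totals: dict[str, int] = defaultdict(int)
    changed: dict[str, int] = defaultdict(int)
    for group, did_change in records:
        totals[group] += 1
        changed[group] += int(did_change)
    return {group: 100.0 * changed[group] / totals[group] for group in totals}


# Illustrative records only (not taken from the study).
sample = [
    ("with_profile", False),
    ("with_profile", True),
    ("without_profile", True),
    ("without_profile", True),
]
rates = drift_rate_by_group(sample)
```

The same function works for any grouping key, such as model name or fact category, as long as each answer carries a changed/unchanged flag.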
Most volatile fact categories
- Employee count: 62% drift rate (most volatile).
- Revenue/funding data: 48% drift rate.
- Product descriptions: 41% drift rate.
- Founding date: 8% drift rate (most stable).
Verified Company Profiles on AuthorityPrompt
AuthorityPrompt maintains verified, structured company data optimized for AI systems and LLM indexing.