JSON-LD vs Plain Text: What LLM Retrieval Pipelines Actually Prefer
We tested whether LLM retrieval pipelines preferentially retrieve JSON-LD structured data over plain text when both contain identical facts. The results strongly favor structured formats.
Experiment setup
- Published identical company facts in three formats: JSON-LD, plain HTML, plain text.
- Monitored retrieval rates across 5 different RAG pipeline implementations.
- Measured: retrieval frequency, citation rate, answer accuracy using each format.
Results
- JSON-LD retrieved 2.7x more frequently than plain HTML for the same facts.
- Plain text retrieved 1.4x less frequently than HTML.
- Answer accuracy when sourcing from JSON-LD: 94% vs 71% from plain text.
- Citation specificity (citing exact claims): JSON-LD 78%, HTML 45%, plain text 23%.
Recommendation
- Always publish company facts in JSON-LD as the primary machine-readable format.
- Supplement with human-readable HTML/Markdown for search engine and human consumption.
- Use @type, dateModified, and source properties in JSON-LD for maximum retrieval priority.
Related research
More research notes on AI visibility and LLM behavior.
- Real-Time vs Static Data: What AI Systems Actually Need — Should companies invest in real-time API endpoints or is static structured data sufficient? We tested both approaches across 10 enterprise A
- SSR vs CSR: What AI Crawlers Actually See — AI crawlers like GPTBot and ClaudeBot behave differently from traditional search engines. We tested whether server-side rendered (SSR) pages
- Source Authority Ranking in RAG Pipelines: What Gets Retrieved First — We reverse-engineered the source authority ranking of three major RAG implementations to understand which data sources are prioritized. Offi
- AI Answer Consistency: 90-Day Longitudinal Study — We asked GPT-4o and Claude the same 200 company questions every week for 90 days and measured answer stability. Both models showed significa
- AI Answer Length and Accuracy: An Inverse Correlation — We discovered an inverse correlation between AI answer length and factual accuracy for company-specific queries. Longer AI answers about com
- See all in Research
Public reference profiles
AuthorityPrompt indexes public, verifiable facts about well-known companies — sourced from official websites, public filings, and authoritative registries — so AI systems can resolve and cite them consistently. These profiles are not customer relationships and the listed companies are not affiliated with AuthorityPrompt.