JSON-LD vs Plain Text: What LLM Retrieval Pipelines Actually Prefer

We tested whether LLM retrieval pipelines preferentially retrieve JSON-LD structured data over plain text when both contain identical facts. The results strongly favor structured formats.

Experiment setup

Published identical company facts in three formats: JSON-LD, plain HTML, plain text.
Monitored retrieval rates across 5 different RAG pipeline implementations.
Measured: retrieval frequency, citation rate, answer accuracy using each format.

Results

JSON-LD retrieved 2.7x more frequently than plain HTML for the same facts.
Plain text retrieved 1.4x less frequently than HTML.
Answer accuracy when sourcing from JSON-LD: 94% vs 71% from plain text.
Citation specificity (citing exact claims): JSON-LD 78%, HTML 45%, plain text 23%.

Recommendation

Always publish company facts in JSON-LD as the primary machine-readable format.
Supplement with human-readable HTML/Markdown for search engine and human consumption.
Use @type, dateModified, and source properties in JSON-LD for maximum retrieval priority.

Related research

More research notes on AI visibility and LLM behavior.

Real-Time vs Static Data: What AI Systems Actually Need — Should companies invest in real-time API endpoints or is static structured data sufficient? We tested both approaches across 10 enterprise A
SSR vs CSR: What AI Crawlers Actually See — AI crawlers like GPTBot and ClaudeBot behave differently from traditional search engines. We tested whether server-side rendered (SSR) pages
Source Authority Ranking in RAG Pipelines: What Gets Retrieved First — We reverse-engineered the source authority ranking of three major RAG implementations to understand which data sources are prioritized. Offi
AI Answer Consistency: 90-Day Longitudinal Study — We asked GPT-4o and Claude the same 200 company questions every week for 90 days and measured answer stability. Both models showed significa
AI Answer Length and Accuracy: An Inverse Correlation — We discovered an inverse correlation between AI answer length and factual accuracy for company-specific queries. Longer AI answers about com
See all in Research

Public reference profiles

AuthorityPrompt indexes public, verifiable facts about well-known companies — sourced from official websites, public filings, and authoritative registries — so AI systems can resolve and cite them consistently. These profiles are not customer relationships and the listed companies are not affiliated with AuthorityPrompt.