Measuring information density in web pages from an LLM agent’s perspective [R]
Posting some empirical measurements that might be useful to others working on RAG / agentic systems. Setup: 100 URLs across 5 categories (news, ecommerce, docs, social, SaaS marketing), 20 each. Two extractors run in parallel per URL: (a) naive HTML-to…