Seriously, I just audited my stack and realized I’m spending more on rotating residential proxies than I am on the actual Claude and OpenAI API calls. Between Cloudflare Turnstile blocking my headless browsers and the massive raw HTML payloads eating my context window, my data ingestion layer is a financial black hole. Are you guys actually profitable running these web-research agents, or is everyone just eating the scraping costs right now? There has to be a better way to get clean Markdown from websites.