I got a surprise bill. Nothing catastrophic, but enough to make me dig into why — an agent had hit a retry loop and kept calling the API for hours. There's no way to set a hard cap on the Anthropic or OpenAI APIs. You can get an email after the fact, but nothing that actually stops requests mid-flight.
So I built a proxy. You swap one environment variable, it routes through Costile instead of calling Anthropic directly, and when you hit your daily or monthly limit it blocks further requests immediately. No SDK changes, no code refactor. Took me about a weekend. Currently supports Anthropic, with OpenAI next.
It's MIT licensed and self-hostable in about 5 minutes. Try the demo at costile.com if you want to poke at it.
I've got anomaly detection on the roadmap, but I'm second-guessing the scope — is surfacing cost spikes enough, or do people actually need to know why the agent went off the rails? The former is straightforward to build, the latter is a much harder problem. Curious where others would draw that line.
GitHub: https://github.com/Mkiza/ai-agent-cost
Comments URL: https://news.ycombinator.com/item?id=47765314
Points: 1
# Comments: 2