[P] If you’re building AI agents, logs aren’t enough. You need evidence.

I've built a programmable governance layer for AI agents and am considering open-sourcing it completely. Looking for feedback.

Agent demos are easy.

Production agents are where things get ugly:

  • an agent calls the wrong tool
  • sensitive data gets passed into a model
  • a high-risk action gets approved when it shouldn’t
  • a customer asks, “what exactly happened in this run?”
  • your team needs to replay the chain later and prove it wasn’t tampered with

That's the problem I am trying to solve with the AI Governance SDK.

The SDK ships in Python and TypeScript and gives engineers a programmable way to add:

  • audit trails for agent runs and tool calls
  • deterministic risk decisions for runtime actions
  • compliance proof generation and verification
  • replay + drift diagnostics for historical runs
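To make "deterministic risk decisions" concrete, here's a minimal sketch of the idea. This is my own illustration, not the SDK's actual API (the `Verdict` and `decide` names are hypothetical): a risk decision is a pure function of the action and its policy inputs, so replaying the same inputs always yields the same verdict.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Verdict:
    allowed: bool
    reason: str

def decide(tool: str, args: dict, policy: dict) -> Verdict:
    """Deterministic policy check for a runtime tool call.

    Pure function: no clock, no randomness, no external state,
    so the same (tool, args, policy) always yields the same verdict.
    """
    if tool in policy.get("blocked_tools", []):
        return Verdict(False, f"tool '{tool}' is blocked by policy")
    limit = policy.get("max_amount")
    if limit is not None and args.get("amount", 0) > limit:
        return Verdict(False, f"amount {args['amount']} exceeds limit {limit}")
    return Verdict(True, "no blocking rule matched")
```

Because `decide` is pure, logging the `(tool, args, policy)` snapshot at runtime is enough to replay the decision later and confirm exactly why an action was allowed.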

The core idea is simple:

If an agent can reason, call tools, and take actions, you need more than logs. You need a system that can answer:

  • what did the agent do?
  • why was that action allowed?
  • what policy/risk inputs were involved?
  • can we replay the run later?
  • can we generate evidence for security, compliance, or enterprise review?
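The "replay + prove it wasn't tampered with" questions can be answered with an append-only, hash-chained run log. The sketch below is my own illustration of that idea, not the SDK's implementation (the `append_event` / `verify_chain` helpers are hypothetical): each event commits to the previous event's hash, so any later edit to the history breaks verification.

```python
import hashlib
import json

def _digest(body: dict) -> str:
    # Canonical JSON (sorted keys) so the hash is deterministic.
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()

def append_event(log: list, tool: str, args: dict, allowed: bool) -> None:
    """Append a hash-chained audit event for one agent action."""
    prev = log[-1]["hash"] if log else "genesis"
    body = {"prev": prev, "tool": tool, "args": args, "allowed": allowed}
    body["hash"] = _digest({k: v for k, v in body.items() if k != "hash"})
    log.append(body)

def verify_chain(log: list) -> bool:
    """Recompute every hash; False if any event was altered or reordered."""
    prev = "genesis"
    for event in log:
        body = {k: v for k, v in event.items() if k != "hash"}
        if body["prev"] != prev or _digest(body) != event["hash"]:
            return False
        prev = event["hash"]
    return True
```

Verification only needs the log itself, so security or compliance reviewers can check a historical run without trusting the system that produced it.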

What I wanted as an engineer was not another “AI governance dashboard.”

I wanted infrastructure.

Something I could wire into agent loops, tool invocations, and runtime controls the same way I wire in auth, queues, or observability.
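To show what "wire it in like auth or observability" might look like in practice, here's a decorator-style sketch. All names here (`governed`, `AUDIT_LOG`) are my own, not the SDK's: every tool invocation is gated through a policy check and recorded, without the tool function knowing governance exists.

```python
import functools

# In-memory stand-in for a real audit sink (database, log stream, etc.).
AUDIT_LOG: list[dict] = []

def governed(policy: dict):
    """Wrap a tool function with a policy gate and an audit record."""
    def wrap(tool_fn):
        @functools.wraps(tool_fn)
        def inner(**args):
            allowed = tool_fn.__name__ not in policy.get("blocked_tools", [])
            AUDIT_LOG.append(
                {"tool": tool_fn.__name__, "args": args, "allowed": allowed}
            )
            if not allowed:
                raise PermissionError(f"{tool_fn.__name__} blocked by policy")
            return tool_fn(**args)
        return inner
    return wrap

@governed({"blocked_tools": ["delete_records"]})
def send_email(to: str, body: str) -> str:
    return f"sent to {to}"

@governed({"blocked_tools": ["delete_records"]})
def delete_records(table: str) -> str:
    return f"deleted {table}"
```

The point of the decorator shape is that governance composes like middleware: the agent loop calls tools exactly as before, and the policy + audit layer sits in between, the same place you'd put auth or tracing.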

If you’re working on agents, copilots, or autonomous workflows, I’d like honest feedback on this:

What would make you fully trust an AI agent in production?

submitted by /u/Dismal_Piccolo4973