Hey r/LocalLLaMA — I open-sourced a tool that brings eval-driven development to AI agent skills. It's based on Anthropic's official skill-creator for Claude Code, but rewritten in TypeScript to work with OpenCode (which supports 300+ models, including local ones).

The problem: creating skills for AI agents is trial-and-error. You write a skill, test it manually, and hope it triggers on the right prompts. There's no systematic way to measure if a skill works.

What this does:
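To give a feel for the eval idea, here's a rough sketch of trigger-accuracy scoring: label a set of prompts with whether the skill *should* fire, run the trigger check over them, and report a pass rate. All names here (`EvalCase`, `triggers`, `runEvals`) are illustrative, not the tool's actual API — check the repo for the real thing.

```typescript
interface EvalCase {
  prompt: string;
  shouldTrigger: boolean; // expected: should this prompt activate the skill?
}

// Toy trigger check: the skill fires if any keyword appears in the prompt.
function triggers(keywords: string[], prompt: string): boolean {
  const lower = prompt.toLowerCase();
  return keywords.some((k) => lower.includes(k.toLowerCase()));
}

// Score a skill's trigger behavior over labeled prompts; returns pass rate in [0, 1].
function runEvals(keywords: string[], cases: EvalCase[]): number {
  const passed = cases.filter(
    (c) => triggers(keywords, c.prompt) === c.shouldTrigger
  ).length;
  return passed / cases.length;
}

const cases: EvalCase[] = [
  { prompt: "Create a PDF report from this data", shouldTrigger: true },
  { prompt: "What's the weather today?", shouldTrigger: false },
];

console.log(runEvals(["pdf", "report"], cases)); // prints 1 (both cases pass)
```

The point of making this a score rather than a manual check is that you can rerun it after every edit to the skill description and catch regressions, same as any other test suite.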
The most interesting part for this community: it works with any of OpenCode's supported models. If you're running local models through OpenCode, you can use this tool with them.

One-command install. Apache 2.0 license. Based on Anthropic's skill-creator, with attribution.

GitHub: https://github.com/antongulin/opencode-skill-creator

npm: https://www.npmjs.com/package/opencode-skill-creator

Happy to answer questions about the eval methodology, local model support, or architecture.