I've been on a quest testing chat UIs for development. So far, out of Jan.ai, AnythingLLM, LibreChat, and Open WebUI, llama.cpp's webui is my favourite.
The killer feature
It counts my context used. I don't need to guess that my context is full from the model suddenly becoming dumb. The token counter you get during prefill and response is way better than the loading spinner every other UI gives you.
What's missing
- If a tool call fails, it kills the entire conversation. I sort of work around this by forking conversations regularly, but it would sure be nice if I didn't have to.
- Folders/Workspaces/Projects, with their own system prompts. Search is nice but it's not enough.
- MCP tool controls. I vibecoded a JS MCP proxy that hides tools from the client, but I really shouldn't have needed to. Let me hide tools. Right now I could refuse to give permission to some tools, but that causes a tool call failure, which erases the conversation, so...
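Hiding a tool at the proxy layer can be as simple as stripping it out of the tools/list response before it reaches the client. A minimal sketch of that idea, assuming a plain JSON-RPC response object and a made-up HIDDEN_TOOLS set (this is not the actual proxy code):

```javascript
// Tools named here never appear in the client's tool list (assumed names).
const HIDDEN_TOOLS = new Set(["write_file", "edit_file", "move_file"]);

function filterToolsList(response) {
  // MCP tools/list results carry a `tools` array; pass anything else through.
  if (!response.result || !Array.isArray(response.result.tools)) return response;
  return {
    ...response,
    result: {
      ...response.result,
      tools: response.result.tools.filter(t => !HIDDEN_TOOLS.has(t.name)),
    },
  };
}
```

Since the client never sees the hidden tools, the model can't call them, so you avoid the tool-call failure you'd get from denying permission.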
If there were a webui that supported folders/workspaces/projects and also told me my remaining context space, I'd switch to it immediately. In the meantime I'm just waiting for llama.cpp's to get polished up.
One tip:
In addition to proxying an MCP server from stdio to streamable-http, this filter also filters the filesystem tool calls list_directory and directory_tree to exclude folders based on a list of defined patterns. If you don't have something filtering those tools, they can easily eat up 100k of context just doing a tree traversal.
Here's a gist of the filter. I hide all write tools from the filesystem MCP and only enable the read ones, but that's just my preference.
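For illustration, the exclusion part boils down to something like this. EXCLUDE_PATTERNS, the function names, and the entry shape are my assumptions for the sketch, not the gist's actual code:

```javascript
// Folder names to drop from directory listings before they reach the model
// (assumed list; node_modules and .git are the usual context hogs).
const EXCLUDE_PATTERNS = ["node_modules", ".git", "dist", "__pycache__"];

// True if any path segment (split on / or \) matches an excluded name.
function isExcluded(path) {
  return EXCLUDE_PATTERNS.some(p => path.split(/[\\/]/).includes(p));
}

// Filter the entries in a list_directory / directory_tree result.
function filterEntries(entries) {
  return entries.filter(e => !isExcluded(e.path));
}
```

Matching whole path segments rather than substrings avoids accidentally hiding a folder like `distribution` when you only meant `dist`.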
Start the proxy with this bat command: npx -y mcp-proxy --port 8287 -- node "C:\path-to-filter\agent-infra-filesystem-mcp-filter.js"
And your model can scan your project without wasting context.