I made a UI and server for using Anthropic’s new Natural Language Autoencoders locally with llama.cpp

Anthropic's first open-weight models, Natural Language Autoencoders (NLAs), are just finetunes of popular open-weight models. They do not modify the architecture or the modeling code, so inference with llama.cpp is mostly trivial.

I packaged every feature of NLAs (namely activation extraction, activation explanation, activation reconstruction and explanation-edit steering) into a custom llama.cpp server. It comes with a Mikupad UI for token-level activation explanation and steering.

I'm currently working on a LoRA version so we can load a single model into memory instead of needing all three (base model, actor model and critic) loaded at once. Stay tuned!

submitted by /u/hurrytewer