cs.AI

Lightweight LLM Agent Memory with Small Language Models

arXiv:2604.07798v3 Announce Type: replace
Abstract: Although LLM agents can leverage tools for complex tasks, they still need memory to maintain cross-turn consistency and accumulate reusable information in long-horizon interactions. However, retrieva…

cs.LG

Super Apriel: One Checkpoint, Many Speeds

arXiv:2604.19877v1 Announce Type: new
Abstract: We release Super Apriel, a 15B-parameter supernet in which every decoder layer provides four trained mixer choices — Full Attention (FA), Sliding Window Attention (SWA), Kimi Delta Attention (KDA), and …
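The abstract describes a supernet in which each decoder layer carries several trained mixer options and a per-layer choice selects one at inference time, trading speed for quality from a single checkpoint. A minimal sketch of that selection idea, with purely illustrative toy mixers and names (none of this is Super Apriel's actual API):

```python
# Hypothetical sketch of the "one checkpoint, many speeds" idea: each
# decoder layer holds multiple trained mixer options, and a per-layer
# choice vector picks one at inference time. The mixers below are toy
# stand-ins, not real attention implementations.

def full_attention(x):
    return [v * 2 for v in x]   # stand-in for a Full Attention (FA) mixer

def sliding_window(x):
    return [v + 1 for v in x]   # stand-in for Sliding Window Attention (SWA)

MIXERS = {"FA": full_attention, "SWA": sliding_window}

def run_supernet(x, layer_choices):
    """Apply one selected mixer per layer; different choice vectors
    give different speed/quality trade-offs without retraining."""
    for choice in layer_choices:
        x = MIXERS[choice](x)
    return x

fast_path = run_supernet([1, 2], ["SWA", "SWA"])  # cheaper configuration
slow_path = run_supernet([1, 2], ["FA", "FA"])    # heavier configuration
```

The point of the sketch is only the dispatch pattern: the checkpoint stores weights for every mixer option, and deployment fixes one choice per layer.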

cs.AI, cs.CY

Fairness Testing of Large Language Models in Role-Playing

arXiv:2411.00585v2 Announce Type: replace-cross
Abstract: Large Language Models (LLMs) have become foundational in modern language-driven software applications, profoundly influencing daily life. A critical technique in leveraging their potential is r…
