cs.AI

Distilling LLM Reasoning into Graph of Concept Predictors

arXiv:2602.03006v2 (announce type: replace)
Abstract: Deploying Large Language Models (LLMs) for discriminative workloads is often limited by inference latency, compute, and API costs at scale. Active distillation reduces these costs by querying an LLM …