cs.AI

VCBench: Benchmarking LLMs in Venture Capital

arXiv:2509.14448v2 Announce Type: replace
Abstract: Benchmarks such as SWE-bench and ARC-AGI demonstrate how shared datasets accelerate progress toward artificial general intelligence (AGI). We introduce VCBench, the first benchmark for predicting fou…

cs.CL, cs.LG

Conceptors for Semantic Steering

arXiv:2605.04980v1 Announce Type: new
Abstract: Activation-based steering provides control of LLM behavior at inference time, but the dominant paradigm reduces each concept to a single direction whose geometry is left largely unexamined. Rather than s…
