Closed-Loop LLM Discovery of Non-Standard Channel Priors in Vision Models

arXiv:2601.08517v2 Announce Type: replace

Abstract: Channel-configuration search, the optimization of layer specifications such as channel widths in deep neural networks, presents a combinatorial challenge constrained by tensor-shape compatibility and computational budgets. We investigate whether large language models (LLMs) can support neural architecture search (NAS) by reasoning over architectural code structures in ways that complement traditional search heuristics. We apply an LLM-driven NAS framework to channel-configuration search, formulating the task as conditional code generation in which the LLM refines architectural specifications using performance feedback. To address data scarcity, we generate a corpus of valid, shape-consistent architectures through abstract syntax tree (AST) mutations. Although these mutated networks are not necessarily optimized for performance, they provide structural examples that help the LLM learn executable architectural patterns and relate channel configurations to model performance. Experimental results on CIFAR-100 show that the closed-loop LLM improves upon the initial AST-generated architecture population under the same proxy-evaluation protocol. Our analysis further shows that the generated architectures reflect domain-specific design patterns, including non-standard channel widths and late-stage expansion, highlighting the potential of language-driven design for code-level NAS. The code and prompts are publicly available at https://github.com/ABrain-One/NN-GPT, and the generated deep neural networks are published at https://github.com/ABrain-One/NN-Dataset under model names with the prefix ast-dimension-.
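The core constraint the abstract mentions, tensor-shape compatibility under channel-width mutation, can be illustrated with a minimal sketch. This is not the authors' implementation; the function names, the width pool, and the `(in_channels, out_channels)` list representation are all illustrative assumptions. The key idea is that when a layer's output width is mutated, the change must be propagated to the next layer's input width so the resulting specification stays valid:

```python
import random

# Hypothetical sketch (not the paper's code): mutate channel widths in a
# simple convolutional-network specification while keeping adjacent layers
# shape-compatible, in the spirit of AST-based corpus generation.

def mutate_channels(config, widths=(24, 40, 48, 96, 112, 160), rng=random):
    """Return a shape-consistent copy of `config` with one mutated width.

    `config` is a list of (in_channels, out_channels) pairs for consecutive
    conv layers; validity requires config[i][1] == config[i + 1][0].
    The default `widths` pool deliberately includes non-standard values.
    """
    new = [list(layer) for layer in config]
    i = rng.randrange(len(new))          # pick a layer to mutate
    new_width = rng.choice(widths)       # possibly non-standard width
    new[i][1] = new_width                # change its output channels
    if i + 1 < len(new):
        new[i + 1][0] = new_width        # propagate so shapes still match
    return [tuple(layer) for layer in new]

def is_shape_consistent(config):
    """Check that every layer's output width feeds the next layer's input."""
    return all(config[i][1] == config[i + 1][0]
               for i in range(len(config) - 1))

# Example: mutate a small 3-layer seed network.
seed = [(3, 32), (32, 64), (64, 128)]
mutant = mutate_channels(seed)
assert is_shape_consistent(mutant)
```

In the paper's setting the analogous operation is performed on the abstract syntax tree of the network's source code rather than on a list of tuples, but the invariant is the same: every mutation must leave the architecture executable.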
