Universally Empowering Zeroth-Order Optimization via Adaptive Layer-wise Sampling
arXiv:2604.18264v1 Announce Type: new
Abstract: Zeroth-order optimization is a promising memory-efficient paradigm for fine-tuning large language models because it relies solely on forward passes. However, its practical adoption is severely constrained…