Nisharg Nargund, Priyesh Shukla

TernaryLM: Memory-Efficient Language Modeling via Native 1.5-Bit Quantization with Adaptive Layer-wise Scaling

March 30, 2026

arXiv:2602.07374v2 Announce Type: replace
Abstract: Large language models (LLMs) achieve remarkable performance but demand substantial computational resources, limiting deployment on edge devices and resource-constrained environments. We present Terna…
