LocalLLaMA

Efficient pretraining with token superposition by Nous Research

submitted by /u/de4dee