LocalLLaMA

235M param LLM from scratch on a single RTX 5080

Hey everyone, I've been working on this for a while and figured I'd share it here too. I made a small transformer language model completely from scratch in PyTorch. No pretrained weights, no HuggingFace downloads. Every parameter was trained from raw te…