Hey everyone, I've been working on a repo where I implement large language model architectures in the simplest PyTorch code possible. No bloated frameworks, no magic abstractions: just clean, readable code that shows exactly what's happening under the hood.

The mission is simple: make LLM internals approachable. If you've ever wanted to understand how these models actually work, not just use them, this is the kind of place where you can read the code and actually follow it.

Right now it has a GPT implementation with:

- A clean decoder-only transformer
- Flash attention support
- A minimal trainer with loss tracking
- CPU and GPU support with multiple precision options
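
To give a flavor of what "simplest PyTorch code possible" can look like, here is a minimal sketch of one decoder-only transformer block with flash-attention support via PyTorch's built-in `scaled_dot_product_attention`. This is my own illustrative example, not code from the repo; the class and parameter names (`Block`, `d_model`, `n_heads`) are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Block(nn.Module):
    """One decoder-only transformer block: causal self-attention + MLP.
    Illustrative sketch only, not the repo's actual implementation."""
    def __init__(self, d_model=64, n_heads=4):
        super().__init__()
        self.n_heads = n_heads
        self.ln1 = nn.LayerNorm(d_model)
        self.qkv = nn.Linear(d_model, 3 * d_model)   # fused q/k/v projection
        self.proj = nn.Linear(d_model, d_model)
        self.ln2 = nn.LayerNorm(d_model)
        self.mlp = nn.Sequential(
            nn.Linear(d_model, 4 * d_model),
            nn.GELU(),
            nn.Linear(4 * d_model, d_model),
        )

    def forward(self, x):
        B, T, C = x.shape
        q, k, v = self.qkv(self.ln1(x)).chunk(3, dim=-1)
        # reshape each to (B, n_heads, T, head_dim) for multi-head attention
        q, k, v = (t.view(B, T, self.n_heads, C // self.n_heads).transpose(1, 2)
                   for t in (q, k, v))
        # scaled_dot_product_attention dispatches to a flash-attention kernel
        # when one is available; is_causal=True applies the autoregressive mask
        y = F.scaled_dot_product_attention(q, k, v, is_causal=True)
        y = y.transpose(1, 2).reshape(B, T, C)       # merge heads back
        x = x + self.proj(y)                          # attention residual
        x = x + self.mlp(self.ln2(x))                 # MLP residual
        return x

x = torch.randn(2, 8, 64)   # (batch, sequence length, d_model)
out = Block()(x)
print(out.shape)
```

A full GPT then stacks a handful of these blocks between an embedding layer and a language-model head, which is the entire architecture in a couple of screens of code.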