Training a 1.1B SLM at home

Hey all. Thought I'd share my journey. I've been fascinated with AI and LLMs, and after building apps for consumer devices (phones) I realized that fast, usable models for consumer hardware feel more like an afterthought than a primary focus. So I spent a lot of time (with the help of my own AIs) learning, researching, and designing an architecture for an SLM. After several weeks of iterating on different designs, I came up with an architecture that can run at 80+ tok/sec on CPU only.

The model is called JTech-Nano, a 1.1B-parameter SLM. No GPU needed for inference. The goal is a genuinely useful AI that runs efficiently on your phone/laptop/whatever with zero internet, zero API keys, and zero cloud bills.

I'm now in the process of training it on my own hardware at home, targeting 100B tokens before switching to fine-tuning. No cluster. No funding. No team of 50 ML engineers. Just a lot of sleepless nights watching loss curves and making sure the training run keeps going.

Here's what 50B tokens of training looks like. The spike in purple is from when I adjusted the learning rate schedule at 3am. The model recovered and is back on track... and the training continues on.
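If you're newer to this: a pretraining learning rate schedule is usually something like a linear warmup followed by a cosine decay, and tweaking it mid-run is what produced that spike. Here's a minimal sketch of that shape in PyTorch (generic and illustrative only, not my actual training code; every number is a placeholder):

```
# Illustrative only: a generic warmup + cosine-decay LR schedule,
# not the real JTech-Nano training config.
import math
import torch

model = torch.nn.Linear(10, 10)  # stand-in for the actual model
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4)

warmup_steps = 2_000      # placeholder
total_steps = 200_000     # placeholder
min_lr_ratio = 0.1        # decay down to 10% of the peak LR

def lr_lambda(step):
    if step < warmup_steps:
        return step / max(1, warmup_steps)               # linear warmup
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    cosine = 0.5 * (1.0 + math.cos(math.pi * min(1.0, progress)))
    return min_lr_ratio + (1.0 - min_lr_ratio) * cosine  # cosine decay to a floor

scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda)

# In the training loop: call optimizer.step() then scheduler.step() each batch.
```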

I used r/LocalLlama a ton when I first got into the 'run at home' AI space. I plan on releasing this model as soon as it's smart enough to be useful. Hopefully in the not-too-distant future.

https://preview.redd.it/4cxw9ggiwrtg1.png?width=1226&format=png&auto=webp&s=ccca5230dea6687363d47fd9be7672af5553e1a8

submitted by /u/JordanJtech