LocalLLaMA

Developing an open-source LLM from the ground up, from pretraining to RLHF (PPO/GRPO)

Hello, I have been working on creating an LLM from the ground up. It is based on the DeepSeek architecture and heavily optimized to reduce its VRAM footprint (GUM + Muon). This is the JSON schema I am currently using, which should give an idea of what is currently being pre…