Trained a 3B Llama with GRPO on a chemistry RL environment. Drug-like molecules in 6 hours of GPU time. [P]
What if AI could design a new drug in 30 seconds? That's the future I'm betting on. Maybe 5 years out. Maybe 15. But it's coming. This week I built a tiny piece of it. A reinforcement learning environment where an AI agent designs dru…