Nanochat vs Llama for training from scratch? [P]
Hey all – I'm working on a project to train a model entirely on historical data, which I've posted about on this subreddit before. My last training run used Nanochat, and while that was very successful for pretraining and SFT of the i…