Qwen3-1.7B fine-tuned on synthetic data outperforms GLM-5 (744B) on multi-turn tool-calling: 437x smaller, trained from noisy production traces
TL;DR: We fine-tuned Qwen3-1.7B on synthetic data generated from noisy production traces. It scores 0.853 on average across 5 corruption scenarios on SGD tool-calling, beating GLM-5 (744B) at 0.835. Training directly on the noisy traces themselves drops scores by 14-28pp. A…