Scale AI and similar services charge a lot for annotation. MTurk is cheap, but the quality is horrible for anything requiring real domain understanding.
For small teams that need a few thousand labeled examples to calibrate their evals or fine-tune a model, there seems to be no good middle ground.
How is everyone handling this? Are you doing it manually or has anyone found something that actually works?