LLM as an (Opinionated) Judge
Image made by the author using ChatGPT

The evaluation problem

Building systems around large language models has become standard practice. Use cases span a wide spectrum, from narrow, well-defined tasks like classifying spam emails to open-ended ones lik…