cs.LG, cs.SE

Clotho: Measuring Task-Specific Pre-Generation Test Adequacy for LLM Inputs

arXiv:2509.17314v3 Announce Type: replace-cross
Abstract: Software increasingly relies on the emergent capabilities of Large Language Models (LLMs), from natural language understanding to program analysis and generation. Yet testing them on specific t…