Large Language Model Prompt Datasets: An In-depth Analysis and Insights
arXiv:2510.09316v2
Abstract: We compile 129 heterogeneous LLM prompt datasets (>1.22 TB, >673M instances) into a structured taxonomy and conduct a multi-level linguistic analysis (lexical, syntactic, and semantic) on seven…