cs.AI, cs.CL

Beyond MCQ: An Open-Ended Arabic Cultural QA Benchmark with Dialect Variants

arXiv:2510.24328v2 Announce Type: replace
Abstract: Large Language Models (LLMs) are increasingly used to answer everyday questions, yet their performance on culturally grounded and dialectal content remains uneven across languages. We propose a compr…