MORPHOGEN: A Multilingual Benchmark for Evaluating Gender-Aware Morphological Generation
arXiv:2604.18914v1 Announce Type: cross
Abstract: While multilingual large language models (LLMs) perform well on high-level tasks such as translation and question answering, their ability to handle grammatical gender and morphological agreement remains …