A Detailed Account of Compositional Automata Learning through Alphabet Refinement
arXiv:2504.16624v3 Announce Type: replace
Abstract: Active automata learning infers automaton models of systems from behavioral observations, a technique successfully applied in a wide range of domains. Compositional approaches have recently emerged to address scalability to concurrent systems. We take a significant step beyond available results, including those by the authors, and develop a general technique for compositional learning of a synchronizing parallel system with an unknown decomposition. Our approach automatically refines the global alphabet into component alphabets while learning the component models. We develop a theoretical treatment of distributions of alphabets, i.e., sets of possibly overlapping component alphabets, characterize counterexamples that reveal inconsistencies with global observations, and show how to systematically update the distribution to restore consistency. We extend $L^{\star}$ to handle partial and potentially spurious information arising when learning components from global observations only. We establish correctness and termination of the full algorithm. We provide an implementation, called CoalA, using the state-of-the-art active learning library LearnLib. Our experiments on more than 630 subject systems show that CoalA requires up to five orders of magnitude fewer membership queries than monolithic learning, and achieves better scalability in equivalence queries on systems with significant concurrency.
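To make the membership/equivalence query loop that the abstract refers to concrete, the following is a minimal textbook-style sketch of the classic $L^{\star}$ algorithm (using Maler-Pnueli counterexample handling: all suffixes of a counterexample are added to the set of distinguishing experiments). It is a generic illustration of active automata learning, not CoalA's compositional algorithm; the brute-force equivalence oracle and the example target language are purely illustrative assumptions.

```python
from itertools import product

def lstar(alphabet, member, equiv):
    """Learn a DFA from a membership oracle `member` and an
    equivalence oracle `equiv` (returns a counterexample or None).
    Maler-Pnueli variant: counterexample suffixes go into E, so
    rows of S stay pairwise distinct and only closedness is checked."""
    S, E = [""], [""]                       # access prefixes / experiments
    row = lambda s: tuple(member(s + e) for e in E)
    while True:
        # make the observation table closed
        while True:
            rows = {row(s) for s in S}
            missing = next((s + a for s in S for a in alphabet
                            if row(s + a) not in rows), None)
            if missing is None:
                break
            S.append(missing)
        # build a hypothesis DFA: states are the distinct rows
        delta = {(row(s), a): row(s + a) for s in S for a in alphabet}
        accept = {row(s) for s in S if row(s)[0]}  # E[0] == ""
        hyp = (row(""), delta, accept)
        ce = equiv(hyp)
        if ce is None:
            return hyp
        for i in range(len(ce) + 1):        # add all suffixes of ce to E
            if ce[i:] not in E:
                E.append(ce[i:])

def run(hyp, w):
    """Run the hypothesis DFA on word w."""
    state, delta, accept = hyp
    for a in w:
        state = delta[(state, a)]
    return state in accept

def brute_equiv(hyp, member, alphabet, depth=6):
    """Illustrative equivalence oracle: compare the hypothesis with
    the target on all words up to the given length."""
    for n in range(depth + 1):
        for w in map("".join, product(alphabet, repeat=n)):
            if run(hyp, w) != member(w):
                return w
    return None

# Illustrative target: words over {a, b} with an even number of 'a's.
target = lambda w: w.count("a") % 2 == 0
hyp = lstar("ab", target, lambda h: brute_equiv(h, target, "ab"))
```

The compositional setting of the paper replaces this single global learner with per-component learners over refined alphabets, where global observations provide only partial (and possibly spurious) information about each component.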