Author name: John T. Halloran

Leveraging RAG for Training-Free Alignment of LLMs

John T. Halloran / May 13, 2026

arXiv:2605.11217v1 Announce Type: new
Abstract: Large language model (LLM) alignment algorithms typically consist of post-training over preference pairs. While such algorithms are widely used to enable safety guardrails and align LLMs with general hum…

cs.AI, cs.CR, cs.LG

Understanding the Effects of Safety Unalignment on Large Language Models

John T. Halloran / April 6, 2026

arXiv:2604.02574v1 Announce Type: cross
Abstract: Safety alignment has become a critical step to ensure LLMs refuse harmful requests while providing helpful and harmless responses. However, despite the ubiquity of safety alignment for deployed frontie…