What happens when you rip out the residual stream and replace it with a structured workspace (Research Paper – CWT)
Over the last month I've been working on a custom architecture that fully replaces the residual stream used by transformers with a structured workspace. The goal isn't to claim "I beat transformers"; it's a thought experiment into wha…