ai · arxiv/cs.LG · 8 min
Three-Phase Transformer: Structural Prior for Decoder Efficiency
A residual-stream architecture using cyclic channel partitioning and phase-aligned rotations achieves 7% perplexity gains with minimal parameter overhead.
Apr 17, 2026 Read →