ai · arxiv/cs.AI · 8 min
GEM activation functions match ReLU speed with smoother gradients
Krause proposes rational activation functions with tunable smoothness that reduce optimization friction in deep networks while maintaining computational efficiency.
Apr 24, 2026
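The listing doesn't reproduce the GEM parameterisation itself, but the core idea it describes, a learnable rational function P(x)/Q(x) with a knob controlling how sharply it bends, can be sketched as below. This is a minimal illustrative sketch only: the class name, polynomial degrees, the `beta` smoothness parameter, and the initialization are hypothetical choices, not taken from the paper.

```python
import torch
import torch.nn as nn

class RationalActivation(nn.Module):
    """Generic learnable rational activation P(x)/Q(x) (Pade-style).

    Sketch under stated assumptions -- GEM's exact form is not given in
    the listing; degrees, init, and the `beta` smoothness knob here are
    hypothetical.
    """

    def __init__(self, num_degree: int = 3, den_degree: int = 2, beta: float = 1.0):
        super().__init__()
        # Numerator coefficients p_0..p_m and denominator coefficients
        # q_1..q_n (q_0 is fixed to 1 so the denominator stays bounded away from 0).
        self.p = nn.Parameter(torch.randn(num_degree + 1) * 0.1)
        self.q = nn.Parameter(torch.randn(den_degree) * 0.1)
        # beta rescales the input, loosely controlling how sharply the
        # function bends (the "tunable smoothness" knob).
        self.beta = nn.Parameter(torch.tensor(beta))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        xs = x / self.beta.clamp(min=1e-3)
        # Numerator: sum_i p_i * xs^i
        num = sum(c * xs ** i for i, c in enumerate(self.p))
        # Denominator: 1 + sum_j |q_j| * |xs|^j, kept >= 1 to avoid poles.
        den = 1.0 + sum(c.abs() * xs.abs() ** (i + 1) for i, c in enumerate(self.q))
        return num / den

# Usage: drop-in replacement for a fixed nonlinearity, e.g.
# act = RationalActivation(); y = act(torch.randn(8, 16))
```

Because the forward pass is just polynomial evaluation and one division, such activations stay close to ReLU in cost while remaining smooth everywhere, which is the trade-off the abstract highlights.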