INT4 Quantization Fails After FP32 Convergence in Predictable Phases
Post-training quantization assumes converged models are ready to compress, but INT4 quantization collapses in a three-phase pattern tied to weight updates, not learning rate decay.
INT4 quantization fails after FP32 convergence in three phases: rapid improvement, a metastable plateau, then explosive divergence driven by post-convergence weight updates.
- Three-phase divergence: rapid learning, metastable plateau (~70k steps), explosive INT4 gap growth (11% to 517%).
- Divergence onset correlates with FP32 perplexity convergence, not with the learning rate decay schedule.
- INT8 quantization remains robust across all phases; the failure is specific to INT4's 16-level grid coarseness.
- Weight outlier accumulation is ruled out via kurtosis measurements; the candidate mechanism remains a shift in the weight distribution.
- An Oscillatory Lock-In schedule reduces the INT4 gap by 2.2 percentage points; SGDR accelerates divergence uniformly.
- The study audits all 154 public Pythia-160m checkpoints with a calibration-free per-group INT4 probe (a sketch of such a probe follows this list).
- Post-convergence weight updates, not decay magnitude alone, are the proximate cause of quantization collapse.
- Schedule amplitude calibration determines whether perturbation helps or hurts quantization robustness.
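The sketch below is a minimal stand-in for such a probe, not the paper's implementation: it round-trips a weight matrix through per-group quantization and reports the relative reconstruction error. Symmetric absmax scaling, a group size of 128, and the error metric are illustrative assumptions; the paper's probe reports the INT4 perplexity gap on the actual checkpoints.

```python
# Minimal sketch of a calibration-free per-group INT4 round-trip probe.
# Assumptions (not from the paper): symmetric absmax scaling, group size 128,
# and relative weight reconstruction error as the reported metric.
import torch

def quantize_dequantize(w: torch.Tensor, bits: int = 4, group_size: int = 128) -> torch.Tensor:
    """Round-trip a 2-D weight matrix through per-group b-bit quantization."""
    rows, cols = w.shape
    assert cols % group_size == 0, "choose a group size that divides the row length"
    groups = w.reshape(rows, cols // group_size, group_size)
    qmax = 2 ** (bits - 1) - 1                       # 7 for INT4, 127 for INT8
    scale = groups.abs().amax(dim=-1, keepdim=True).clamp(min=1e-8) / qmax
    q = torch.clamp(torch.round(groups / scale), -qmax - 1, qmax)
    return (q * scale).reshape(rows, cols)

def relative_error(w: torch.Tensor, bits: int) -> float:
    """Share of weight energy lost to the quantization grid (a crude proxy for the gap)."""
    w_hat = quantize_dequantize(w, bits=bits)
    return ((w - w_hat).norm() / w.norm()).item()

if __name__ == "__main__":
    w = torch.randn(768, 768)                        # stand-in for one checkpoint's layer
    print("INT4:", relative_error(w, bits=4))        # coarse 16-level grid
    print("INT8:", relative_error(w, bits=8))        # fine 256-level grid
```

A full audit would apply the same round trip to every weight matrix of each of the 154 public Pythia-160m checkpoints and compare the quantized model's perplexity against FP32; that per-checkpoint gap is the quantity reported as growing from 11% to 517% after convergence.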
Astrobobo tool mapping
- Knowledge Capture: Document the three-phase pattern (rapid learning, plateau, divergence) and the onset predictor (FP32 convergence, not LR decay) in your model development notes. Link to the paper's checkpoint audit results.
- Focus Brief: Create a one-page checklist: (1) Is INT4 or INT8 required? (2) If INT4, does your schedule include amplitude-calibrated oscillations? (3) When does FP32 perplexity converge? (4) Have you probed the INT4 gap post-convergence?
- Reading Queue: Queue the paper's supplementary materials (code, probe implementation, checkpoint audit) for detailed review before implementing quantization in your pipeline.
Frequently asked
- Why does INT4 collapse after convergence while INT8 stays robust? Post-convergence weight updates shift the weight distribution in ways that exceed the resolution of INT4's 16-level quantization grid. The divergence is not caused by learning rate decay magnitude alone, but by the specific pattern of weight changes after FP32 perplexity stops improving. INT8's finer grid (256 levels) remains robust, suggesting the failure is tied to INT4's coarseness; the back-of-envelope sketch below makes the resolution difference concrete.
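As a rough illustration of that coarseness argument (assuming symmetric absmax scaling; the 0.12 per-group magnitude is purely illustrative, not a measured value):

```python
# Back-of-envelope comparison of grid resolution under symmetric absmax scaling.
absmax = 0.12                      # illustrative per-group weight magnitude
step_int4 = absmax / (2**3 - 1)    # 16 levels -> 7 positive steps, ~0.0171
step_int8 = absmax / (2**7 - 1)    # 256 levels -> 127 positive steps, ~0.00094
print(step_int4 / step_int8)       # ~18.1: INT4 represents the same range ~18x more coarsely

# Worst-case per-weight rounding error is half a step: ~0.0086 for INT4 vs
# ~0.00047 for INT8. A post-convergence shift that changes per-group absmax
# rescales the whole grid, amplifying an error that is already ~18x larger at 4 bits.
```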