ai · 8 min read · Apr 24, 2026

Supervised Learning Has Built-In Geometric Blindness

A mathematical proof shows that empirical risk minimization must preserve sensitivity to label-correlated but test-irrelevant features: a structural constraint, not a training bug.

Source: arxiv/cs.AI · Vishal Rajput

Supervised learning mathematically requires encoders to retain sensitivity to training-label correlations that don't generalize, creating an unavoidable geometric constraint.

  • ERM necessarily induces Jacobian sensitivity in directions correlated with labels but irrelevant at test time.
  • This one constraint unifies four previously separate empirical phenomena: non-robust features, texture bias, corruption fragility, and the robustness-accuracy tradeoff.
  • The Trajectory Deviation Index (TDI) directly measures this blind spot; standard metrics like the Frobenius norm miss it (a sketch of the contrast follows this list).
  • PGD adversarial training achieves high Jacobian magnitude but poor clean-input geometry (TDI 1.336 vs PMH 0.904).
  • Blind spot worsens in larger language models (ratio 0.860→0.742 from 66M to 340M parameters).
  • Task-specific ERM fine-tuning amplifies the blind spot by 54%; PMH repairs it 11x with one Gaussian-form training term.
  • Defect appears at foundation-model scale across vision, NLP, and multimodal architectures (CLIP, DINO, SAM, ViT-B/16).
  • Proposition 5 proves the repair term is the unique perturbation law that uniformly penalizes the encoder Jacobian.
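To make the metric contrast concrete, here is a minimal PyTorch sketch. The paper's exact TDI formula is not reproduced here: tdi_proxy below is a hypothetical stand-in that scores how unevenly an encoder stretches random unit directions (the isotropic path-length-distortion idea behind TDI), while frobenius_norm is the standard magnitude metric. Two encoders can match on the second and diverge sharply on the first.

# Minimal sketch (PyTorch). `tdi_proxy` is a hypothetical stand-in for
# the paper's TDI, not its actual definition.
import torch

def jacobian(encoder, x):
    # Full Jacobian of the encoder at a single input x (out_dim x in_dim).
    return torch.autograd.functional.jacobian(encoder, x)

def frobenius_norm(J):
    # Standard magnitude metric: sqrt of the sum of squared entries.
    return J.norm(p="fro")

def tdi_proxy(J, n_dirs=256):
    # Isotropy proxy: spread of the directional stretch ||J v|| over
    # random unit directions v. Zero iff all directions are stretched
    # equally; large when a few label-correlated directions dominate.
    v = torch.randn(n_dirs, J.shape[1])
    v = v / v.norm(dim=1, keepdim=True)
    stretch = (v @ J.T).norm(dim=1)  # ||J v|| per direction
    return (stretch.std() / stretch.mean()).item()

# Usage: log both numbers side by side, as the Daily Log item below suggests.
encoder = torch.nn.Sequential(torch.nn.Linear(32, 16), torch.nn.Tanh(),
                              torch.nn.Linear(16, 8))
x = torch.randn(32)
J = jacobian(encoder, x)
print(frobenius_norm(J).item(), tdi_proxy(J))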

Astrobobo tool mapping

  • Knowledge Capture: Document the distinction between Jacobian magnitude (the Frobenius norm) and isotropic path-length distortion (TDI) in your model evaluation playbook. Note that a high Frobenius norm does not guarantee clean-input robustness.
  • Focus Brief: Summarize Theorem 1 (the geometric blind spot) and Proposition 5 (the unique repair form) as a one-page reference for your ML team. Include the PMH loss term and its Gaussian structure (a generic sketch follows this list).
  • Daily Log: When running your next adversarial training experiment, log the Frobenius norm and TDI side by side. Record whether they diverge; if so, flag the run for deeper investigation.
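For the Focus Brief item, a generic sketch of a Gaussian-form Jacobian penalty may help; the paper's exact PMH term should be taken from the source. The canonical version is a Gaussian perturbation consistency term: for small sigma, E||f(x + eps) - f(x)||^2 is approximately sigma^2 times the squared Jacobian Frobenius norm, so it penalizes the encoder Jacobian uniformly in all input directions, consistent with the Proposition 5 description.

# Hedged sketch of a Gaussian-form Jacobian penalty; the paper's exact
# PMH term may differ. For eps ~ N(0, sigma^2 I) and small sigma,
# E||f(x+eps) - f(x)||^2 ~= sigma^2 * ||J_f(x)||_F^2.
import torch

def gaussian_jacobian_penalty(encoder, x, sigma=0.05, n_samples=4):
    # Monte Carlo estimate of E_{eps ~ N(0, sigma^2 I)} ||f(x+eps) - f(x)||^2.
    z = encoder(x)
    penalty = 0.0
    for _ in range(n_samples):
        eps = sigma * torch.randn_like(x)
        penalty = penalty + (encoder(x + eps) - z).pow(2).sum(dim=-1).mean()
    return penalty / n_samples

# Added to the task loss with a weight lambda_pmh (name hypothetical):
#   loss = task_loss + lambda_pmh * gaussian_jacobian_penalty(encoder, x)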

Frequently asked

  • Why is the blind spot unavoidable rather than a fixable training bug? It is a mathematical necessity of empirical risk minimization: any encoder trained to minimize the supervised loss must retain non-zero sensitivity (a non-zero Jacobian) in directions that correlate with training labels but are irrelevant at test time. This is not a bug in current methods but a structural property of the supervised objective itself, proven in Theorem 1 (a schematic statement follows).
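Schematically, in our notation rather than the paper's, the Theorem 1 claim reads:

\[
  \operatorname{Cov}(v^{\top} x,\, y_{\mathrm{train}}) \neq 0
  \;\Longrightarrow\;
  J_{f^{*}}(x)\, v \neq 0
  \quad \text{for every } f^{*} \in \arg\min_{f} \widehat{\mathcal{R}}_{\mathrm{train}}(f),
\]

even when the direction v carries no information about the test distribution: sensitivity along v is forced by the training objective, not left as a free choice.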
Cite this article
APA
Vishal Rajput. (2026, April 24). Supervised Learning Has Built-In Geometric Blindness. Astrobobo Content Engine (rewrite of arxiv/cs.AI). https://astrobobo-content-engine.vercel.app/article/supervised-learning-has-built-in-geometric-blindness-0a1a7e
MLA
Vishal Rajput. "Supervised Learning Has Built-In Geometric Blindness." Astrobobo Content Engine, 24 Apr 2026, https://astrobobo-content-engine.vercel.app/article/supervised-learning-has-built-in-geometric-blindness-0a1a7e. Based on "arxiv/cs.AI", https://arxiv.org/abs/2604.21395.
BibTeX
@misc{astrobobo_supervised-learning-has-built-in-geometric-blindness-0a1a7e_2026,
  author       = {Vishal Rajput},
  title        = {Supervised Learning Has Built-In Geometric Blindness},
  year         = {2026},
  url          = {https://astrobobo-content-engine.vercel.app/article/supervised-learning-has-built-in-geometric-blindness-0a1a7e},
  note         = {Astrobobo rewrite of arxiv/cs.AI, https://arxiv.org/abs/2604.21395},
}
