Search
9 results for "performance"
- ai · arxiv/cs.AI · 8 min
Testing POMDP Policies Against Sensor Drift and Model Mismatch
A new framework quantifies how much observation noise a decision policy can tolerate before performance collapses, with polynomial-time algorithms for real systems.
Apr 26, 2026
- ai · arxiv/cs.AI · 4 min
Cross-Entropy Loss Drives Neural Probe Performance, Not Architecture
Pre-registered study shows cross-entropy training inflates logit norms 15x, accounting for most K-way energy probe gains over softmax baselines.
Apr 24, 2026
- engineering · arxiv/cs.LG · 8 min
Multi-Agent Edge Systems Hit a Scaling Wall at 100+ Agents
A new framework addresses the Synergistic Collapse problem, where performance degrades superlinearly as the number of distributed agents grows, by combining neural caching, action pruning, and hardware matching.
Apr 23, 2026
- ai · arxiv/cs.LG · 4 min
Weak Labels Fail Across Time Even When Domain Transfer Works
A study of CRISPR experiments reveals that supervision drift, where the labeling mechanism itself shifts, causes model collapse in temporal transfer despite strong in-domain performance.
Apr 21, 2026
- ai · arxiv/cs.AI · 4 min
Interpretable Traces Don't Guarantee Better LLM Reasoning
Research shows that Chain-of-Thought traces improve model performance but confuse users, and that the correctness of intermediate steps barely predicts final accuracy.
Apr 20, 2026
- ai · arxiv/cs.AI · 5 min
LLMs Can Infer Unspoken Intent in Collaborative Tasks
Researchers tested whether large language models can interpret incomplete instructions by reasoning about a human partner's mental state, and found they match human performance.
Apr 20, 2026
- ai · arxiv/cs.LG · 8 min
Chromatic Clustering Requires New Algorithms to Match Standard Performance
Adding color constraints to correlation clustering increases computational difficulty; a new coupled approach recovers optimal approximation bounds.
Apr 20, 2026
- ai · arxiv/cs.AI · 8 min
Small Models Match Large Ones via Inference Scaffolding
McClendon et al. show that role-based prompt structuring at inference time doubles small-model performance on complex tasks, without any retraining.
Apr 17, 2026
- ai · arxiv/cs.LG · 8 min
Distilling Transformers into Mamba via Linearized Attention
A two-stage knowledge transfer method preserves Transformer performance in State Space Models by routing through linearized attention as an intermediate step.
Apr 17, 2026