Search
9 results for "performance"
- ai · arxiv/cs.AI · 8 min
Testing POMDP Policies Against Sensor Drift and Model Mismatch
A new framework quantifies how much observation noise a decision policy can tolerate before performance collapses, with polynomial-time algorithms for real systems.
Apr 26, 2026
- ai · arxiv/cs.AI · 4 min
Cross-Entropy Loss Drives Neural Probe Performance, Not Architecture
Pre-registered study shows cross-entropy training inflates logit norms 15x, accounting for most K-way energy probe gains over softmax baselines.
Apr 24, 2026
- engineering · arxiv/cs.LG · 8 min
Multi-Agent Edge Systems Hit a Scaling Wall at 100+ Agents
A new framework addresses the Synergistic Collapse problem, where performance degrades superlinearly as the number of distributed agents grows, by combining neural caching, action pruning, and hardware matching.
Apr 23, 2026
- ai · arxiv/cs.LG · 4 min
Weak Labels Fail Across Time Even When Domain Transfer Works
A study of CRISPR experiments reveals that supervision drift, where the labeling mechanism itself shifts, causes model collapse in temporal transfer despite strong in-domain performance.
Apr 21, 2026
- ai · arxiv/cs.AI · 4 min
Interpretable Traces Don't Guarantee Better LLM Reasoning
Research shows that Chain-of-Thought traces improve model performance but confuse users, and that the correctness of intermediate steps barely predicts final accuracy.
Apr 20, 2026
- ai · arxiv/cs.AI · 5 min
LLMs Can Infer Unspoken Intent in Collaborative Tasks
Researchers tested whether large language models can interpret incomplete instructions by reasoning about a human partner's mental state, and found they match human performance.
Apr 20, 2026
- ai · arxiv/cs.LG · 8 min
Chromatic Clustering Requires New Algorithms to Match Standard Performance
Adding color constraints to correlation clustering increases computational difficulty; a new coupled approach recovers optimal approximation bounds.
Apr 20, 2026
- ai · arxiv/cs.AI · 8 min
Small Models Match Large Ones via Inference Scaffolding
McClendon et al. show that role-based prompt structuring at inference time doubles small-model performance on complex tasks, without any retraining.
Apr 17, 2026
- ai · arxiv/cs.LG · 8 min
Distilling Transformers into Mamba via Linearized Attention
A two-stage knowledge transfer method preserves Transformer performance in State Space Models by routing through linearized attention as an intermediate step.
Apr 17, 2026