Tag
#efficiency
11 insights
- ai · arxiv/cs.LG · 4 min
Selective-Update RNNs Match Transformers While Using Less Memory
A new RNN architecture learns when to update its internal state, preserving memory across long sequences and reducing computational waste on redundant input; a minimal gating sketch follows below.
May 3, 2026
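
A minimal sketch of the general idea, assuming a learned sigmoid gate that interpolates between keeping and overwriting the hidden state; the class and parameter names are illustrative, not from the paper:

```python
# Sketch only: a per-step gate decides whether to write a new state or
# carry the old one through. "SelectiveUpdateCell" is a hypothetical name.
import torch
import torch.nn as nn

class SelectiveUpdateCell(nn.Module):
    def __init__(self, input_dim: int, hidden_dim: int):
        super().__init__()
        self.candidate = nn.Linear(input_dim + hidden_dim, hidden_dim)
        self.update_gate = nn.Linear(input_dim + hidden_dim, 1)

    def forward(self, x: torch.Tensor, h: torch.Tensor) -> torch.Tensor:
        xh = torch.cat([x, h], dim=-1)
        g = torch.sigmoid(self.update_gate(xh))  # ~0: skip update, ~1: update
        h_new = torch.tanh(self.candidate(xh))
        # Redundant inputs drive g toward 0, so h passes through unchanged,
        # which is what preserves memory over long sequences.
        return (1 - g) * h + g * h_new
```
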
- ai · arxiv/cs.LG · 8 min
Web agents plateau on short tasks; Odysseys benchmark tests realistic multi-hour workflows
New benchmark reveals frontier AI models achieve only 44.5% success on long-horizon web tasks spanning multiple sites, exposing efficiency gaps in agent design.
Apr 29, 2026

- ai · arxiv/cs.LG · 4 min
Efficient Rationale Retrieval via Student-Teacher Distillation
Rabtriever reduces the computational cost of LLM-based document ranking by distilling cross-encoder knowledge into independent query-document encoders; a generic distillation sketch follows below.
Apr 28, 2026
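
A generic sketch of the cross-encoder-to-bi-encoder distillation pattern the summary describes, assuming the student scores query-document pairs with a dot product trained to match the teacher's joint score; the encoder objects are placeholders, not Rabtriever's actual components:

```python
# Hypothetical sketch: distill an expensive joint (cross-encoder) scorer into
# two independent encoders whose dot product approximates its scores.
import torch
import torch.nn.functional as F

def distill_step(query_encoder, doc_encoder, cross_encoder, queries, docs):
    q = query_encoder(queries)            # [batch, dim], queries encoded alone
    d = doc_encoder(docs)                 # [batch, dim], docs encoded alone
    student_scores = (q * d).sum(dim=-1)  # cheap dot-product relevance
    with torch.no_grad():                 # teacher stays frozen
        teacher_scores = cross_encoder(queries, docs)  # expensive joint pass
    return F.mse_loss(student_scores, teacher_scores)
```

Because documents are encoded independently of the query, their embeddings can be precomputed and indexed, which is where the cost reduction comes from.
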
- ai · arxiv/cs.AI · 4 min
Automated quantization shrinks spike-driven language models for edge devices
The QSLM framework compresses spike-driven language models by up to 86.5% while preserving accuracy, enabling deployment on resource-constrained embedded hardware; a generic quantization sketch follows below.
Apr 22, 2026
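
For orientation, a generic post-training quantization sketch, not QSLM's actual pipeline: uniform int8 quantization alone cuts fp32 storage by 75%, and automated search over lower or mixed bit-widths is the kind of mechanism that could push compression toward the reported 86.5%.

```python
# Generic symmetric uniform quantization (illustrative; not QSLM's method).
import numpy as np

def quantize_uniform(w: np.ndarray, bits: int = 8):
    qmax = 2 ** (bits - 1) - 1              # e.g. 127 for int8
    scale = np.abs(w).max() / qmax          # per-tensor scale factor
    q = np.clip(np.round(w / scale), -qmax - 1, qmax)
    return q.astype(np.int8), scale         # values fit in int8 for bits <= 8

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale     # approximate reconstruction
```
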
- ai · arxiv/cs.LG · 8 min
Dataset Distillation Fails Without Hard Labels
Soft labels mask poor dataset quality in distillation methods, making random subsets nearly as effective as curated ones.
Apr 22, 2026

- ai · arxiv/cs.LG · 4 min
Quantum-LSTM hybrid cuts physics model training data by 100×
Federated learning with a quantum-enhanced LSTM matches classical accuracy on SUSY classification using 20K samples instead of 2M, with under 300 parameters.
Apr 20, 2026

- ai · arxiv/cs.AI · 8 min
Token Importance in On-Policy Distillation: Entropy and Disagreement
Research identifies two regions of high-value tokens in knowledge distillation: high-entropy positions, and low-entropy positions where student and teacher disagree, enabling 50–80% token reduction; a selection sketch follows below.
Apr 17, 2026
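
A minimal sketch of the selection rule the summary implies, assuming per-position teacher entropy and top-1 student-teacher disagreement as the two criteria; the threshold value is an assumption, not a number from the paper:

```python
# Illustrative token filter: keep high-entropy teacher positions, plus
# low-entropy positions where student and teacher top-1 predictions differ.
import torch

def select_tokens(teacher_logits, student_logits, entropy_threshold=2.0):
    # logits: [seq_len, vocab]; entropy_threshold is a hypothetical value
    probs = torch.softmax(teacher_logits, dim=-1)
    entropy = -(probs * torch.log(probs + 1e-9)).sum(dim=-1)   # [seq_len]
    high_entropy = entropy > entropy_threshold
    disagree = teacher_logits.argmax(-1) != student_logits.argmax(-1)
    return high_entropy | (~high_entropy & disagree)  # boolean keep-mask
```

Computing the distillation loss only at masked positions is how a 50–80% token reduction would translate into compute savings.
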
- ai · arxiv/cs.AI · 8 min
Small Models Match Large Ones via Inference Scaffolding
McClendon et al. show that role-based prompt structuring at inference time doubles small-model performance on complex tasks without retraining.
Apr 17, 2026

- ai · arxiv/cs.LG · 8 min
Foundation Models vs. Task-Specific ML in Electricity Price Forecasting
Time series foundation models outperform traditional deep learning on probabilistic forecasts, but well-tuned conventional models remain competitive at lower computational cost.
Apr 17, 2026

- ai · arxiv/cs.LG · 8 min
Distilling Transformers into Mamba via Linearized Attention
A two-stage knowledge-transfer method preserves Transformer performance in State Space Models by routing through linearized attention as an intermediate step; a sketch of the linearized form follows below.
Apr 17, 2026
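
For context, a generic linearized-attention form of the kind such a pipeline routes through: a feature map φ applied to queries and keys lets attention be computed without materializing the full softmax matrix. The ELU+1 feature map below is a common choice, not necessarily the paper's:

```python
# Generic linear attention (non-causal form for brevity): softmax(QK^T)V is
# replaced by phi(Q)(phi(K)^T V) with a row-wise normalizer, so the
# seq x seq attention matrix is never formed.
import torch
import torch.nn.functional as F

def phi(x: torch.Tensor) -> torch.Tensor:
    return F.elu(x) + 1  # positive feature map (a common, assumed choice)

def linear_attention(q, k, v):
    # q, k: [batch, seq, dim]; v: [batch, seq, dim_v]
    q, k = phi(q), phi(k)
    kv = torch.einsum("bsd,bse->bde", k, v)               # sum_s k_s v_s^T
    norm = torch.einsum("bsd,bd->bs", q, k.sum(dim=1)) + 1e-6
    return torch.einsum("bsd,bde->bse", q, kv) / norm.unsqueeze(-1)
```

Because the key-value summary is a fixed-size matrix, this form maps naturally onto recurrent state updates, which is why it can serve as a bridge between a Transformer teacher and a state-space student.
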
- ai · arxiv/cs.LG · 8 min
Three-Phase Transformer: Structural Prior for Decoder Efficiency
A residual-stream architecture using cyclic channel partitioning and phase-aligned rotations achieves a 7% perplexity improvement with minimal parameter overhead.
Apr 17, 2026