Search
8 results for "transformer"
- ai · arxiv/cs.LG · 4 min
Selective-Update RNNs Match Transformers While Using Less Memory
A new RNN architecture learns when to update internal state, preserving memory across long sequences and reducing computational waste on redundant input.
May 3, 2026
- ai · arxiv/cs.AI · 4 min
Transformer agents embed four systematic biases into recommendations
Attention mechanisms in AI recommenders amplify recency, popularity, and synthetic data effects, creating reliability risks invisible to standard metrics.
May 1, 2026
- ai · arxiv/cs.LG · 8 min
Model Architecture Controls Whether Errors Stay Hidden
Transformer design determines if internal decision signals remain observable after training, independent of output confidence metrics.
Apr 29, 2026
- ai · arxiv/cs.AI · 5 min
Transformers learn graph connectivity selectively, not universally
New research shows transformers can infer transitive relations on grid-structured graphs but fail on fragmented ones, with scaling helping only certain architectures.
Apr 23, 2026
- engineering · arxiv/cs.AI · 4 min
Dual Transformers Improve Bug Assignment Accuracy by 10%+
TriagerX uses two transformer models and developer interaction history to recommend the right engineer for bug fixes, outperforming single-model approaches.
Apr 20, 2026
- ai · arxiv/cs.LG · 8 min
Distilling Transformers into Mamba via Linearized Attention
A two-stage knowledge transfer method preserves Transformer performance in State Space Models by routing through linearized attention as an intermediate step.
Apr 17, 2026
- ai · arxiv/cs.LG · 8 min
Three-Phase Transformer: Structural Prior for Decoder Efficiency
A residual-stream architecture using cyclic channel partitioning and phase-aligned rotations achieves 7% perplexity gains with minimal parameter overhead.
Apr 17, 2026
- ai · arxiv/cs.LG · 3 min
Transformer models outperform CNNs in prostate MRI segmentation
SwinUNETR achieves a 5-point Dice improvement over standard UNet when trained on mixed-reader datasets, suggesting transformer attention handles annotation variability better.
Apr 17, 2026