Search
6 results for "reasoning"
- ai · arxiv/cs.AI · 8 min
Junk Data Degrades LLM Reasoning; Twitter Study Shows Lasting Harm
Continual training on low-quality social media text causes measurable cognitive decline in language models, with reasoning and safety capabilities dropping significantly.
Apr 23, 2026
- ai · arxiv/cs.LG · 8 min
Chain-of-Thought Supervision Eliminates Sample Complexity Growth
New theoretical analysis shows intermediate reasoning steps remove dependence on generation length, while end-to-end learning scales unpredictably with sequence depth.
Apr 21, 2026
- ai · arxiv/cs.AI · 4 min
Interpretable Traces Don't Guarantee Better LLM Reasoning
Research shows Chain-of-Thought traces improve model performance but confuse users, and correctness of intermediate steps barely predicts final accuracy.
Apr 20, 2026
- ai · arxiv/cs.AI · 5 min
LLMs Can Infer Unspoken Intent in Collaborative Tasks
Researchers tested whether large language models can interpret incomplete instructions by reasoning about a human partner's mental state, matching human performance.
Apr 20, 2026
- ai · arxiv/cs.AI · 4 min
MERRIN: Benchmark for Multimodal Search in Noisy Web Data
New benchmark reveals AI agents struggle with real-world web search, achieving only 22% accuracy when retrieving and reasoning across mixed media sources.
Apr 17, 2026
- ai · arxiv/cs.AI · 8 min
LLMs hit formal reasoning ceiling; Chomsky Hierarchy reveals efficiency gap
New benchmark shows large language models struggle with structured complexity tasks and require prohibitive compute to achieve reliability in formal reasoning.
Apr 17, 2026