Search
20 results for "agents"
- ai · arxiv/cs.LG · 4 min
Synthetic Computers Enable Agent Training at Scale
Researchers create realistic digital workspaces to train AI agents on long-horizon productivity tasks, scaling from thousands to potentially billions of simulated user environments.
May 3, 2026
- ai · arxiv/cs.AI · 8 min
Safe Bilevel Delegation: Runtime Safety Control for Multi-Agent LLM Systems
A formal framework that dynamically adjusts safety-efficiency trade-offs when delegating tasks to specialized AI sub-agents during execution.
May 2, 2026
- ai · arxiv/cs.AI · 8 min
Schema-Grounded Memory Outperforms Search-Based AI Recall
Treating AI memory as a structured database rather than a retrieval problem improves accuracy and reliability for production agents.
May 1, 2026
- ai · arxiv/cs.AI · 4 min
Transformer agents embed four systematic biases into recommendations
Attention mechanisms in AI recommenders amplify recency, popularity, and synthetic data effects, creating reliability risks invisible to standard metrics.
May 1, 2026
- ai · arxiv/cs.AI · 3 min
Multi-agent framework automates recommendation system tuning
AgenticRecTune uses specialized LLM agents to optimize configuration across pre-ranking, ranking, and re-ranking pipelines without manual tuning.
May 1, 2026
- ai · hackernoon · 6 min
Continuity in AI agents requires architecture, not bigger memory stores
A solo builder argues that persistent AI identity depends on scheduled cognition cycles and narrative compression, not retrieval systems.
Apr 30, 2026
- ai · arxiv/cs.AI · 8 min
LATTICE: Measuring Crypto Agent Quality Beyond Accuracy
New benchmark evaluates how well AI agents support user decisions in crypto, not just whether they get answers right.
Apr 30, 2026
- ai · arxiv/cs.LG · 8 min
Web agents plateau on short tasks; Odysseys benchmark tests realistic multi-hour workflows
New benchmark reveals frontier AI models achieve only 44.5% success on long-horizon web tasks spanning multiple sites, exposing efficiency gaps in agent design.
Apr 29, 2026
- ai · arxiv/cs.LG · 5 min
Frontier coding agents now autonomously build AlphaZero pipelines
Claude Opus 4.7 successfully implements end-to-end ML systems from task descriptions alone, matching external solvers on Connect Four within three hours.
Apr 29, 2026
- ai · arxiv/cs.AI · 8 min
Coding agents drift from constraints when values conflict
Research shows AI coding agents violate system prompts favoring security when environmental pressure appeals to competing learned values, risking exploitation.
Apr 27, 2026
- ai · hackernoon · 7 min
AI-era identity: Google's scale vs. Web3's open trust rails
As AI agents flood the internet, the real contest is over which layer decides who and what gets treated as legitimate.
Apr 26, 2026
- ai · arxiv/cs.AI · 3 min
VLAA-GUI: Framework Stops Agents from Looping and Guessing
A modular GUI automation system uses verification, loop detection, and search to prevent autonomous agents from declaring false success or repeating failed actions.
Apr 24, 2026
- ai · arxiv/cs.AI · 5 min
OpenHands SDK enables composable, secure software development agents
A redesigned toolkit for building production agents with sandboxed execution, multi-model routing, and human-facing interfaces.
Apr 23, 2026
- engineering · arxiv/cs.LG · 8 min
Multi-Agent Edge Systems Hit a Scaling Wall at 100+ Agents
A new framework addresses the Synergistic Collapse problem where performance degrades superlinearly as distributed agents grow, combining neural caching, action pruning, and hardware matching.
Apr 23, 2026
- ai · arxiv/cs.LG · 4 min
LLMs complement but don't replace classical hyperparameter optimization
A study comparing LLM agents to classical algorithms like CMA-ES and TPE finds hybrid approaches work best for tuning model hyperparameters under compute constraints.
Apr 21, 2026
- engineering · hackernoon · 6 min
Bots Follow Scripts; Agents Pursue Goals — Know the Difference
A structural comparison of rule-based bots and LLM-driven agents, with a framework for choosing the right autonomy level.
Apr 18, 2026
- ai · hackernoon · 4 min
Browser-Native Agents: Bypassing API Gaps with Session Control
When API catalogs exclude premium models, controlling an existing browser session offers a practical alternative to waiting for official endpoints.
Apr 18, 2026
- ai · hackernoon · 2 min
HackerNoon indexes 218 articles on AI agents for self-directed study
A curated reading list from HackerNoon's Learn Repo maps the AI agent landscape across frameworks, protocols, security, and production failures.
Apr 18, 2026
- ai · hackernoon · 2 min
AI Coding Agents Reshape Developer Work, Not Replace It
HackerNoon's April 2026 roundup shows autonomous ML agents and agentic workflows solving real problems, shifting focus from coding skill to agent orchestration.
Apr 18, 2026
- ai · arxiv/cs.AI · 8 min
AI agents reproduce social media form without generating social function
Analysis of 1.3M posts across an all-agent social network reveals structural collapse: 91% of authors never return, 65% of comments lack argumentative connection, and technical constraints alone shape behavior.
Apr 17, 2026