Search

20 results for "agents"

ai · arxiv/cs.LG · 4 min

Synthetic Computers Enable Agent Training at Scale

Researchers create realistic digital workspaces to train AI agents on long-horizon productivity tasks, scaling from thousands to potentially billions of simulated user environments.

May 3, 2026
ai · arxiv/cs.AI · 8 min

Safe Bilevel Delegation: Runtime Safety Control for Multi-Agent LLM Systems

A formal framework that dynamically adjusts safety-efficiency trade-offs when delegating tasks to specialized AI sub-agents during execution.

May 2, 2026
ai · arxiv/cs.AI · 8 min

Schema-Grounded Memory Outperforms Search-Based AI Recall

Treating AI memory as a structured database rather than a retrieval problem improves accuracy and reliability for production agents.

May 1, 2026
ai · arxiv/cs.AI · 4 min

Transformer agents embed four systematic biases into recommendations

Attention mechanisms in AI recommenders amplify recency, popularity, and synthetic data effects, creating reliability risks invisible to standard metrics.

May 1, 2026
ai · arxiv/cs.AI · 3 min

Multi-agent framework automates recommendation system tuning

AgenticRecTune uses specialized LLM agents to optimize configuration across pre-ranking, ranking, and re-ranking pipelines without manual tuning.

May 1, 2026
ai · hackernoon · 6 min

Continuity in AI agents requires architecture, not bigger memory stores

A solo builder argues that persistent AI identity depends on scheduled cognition cycles and narrative compression, not retrieval systems.

Apr 30, 2026
ai · arxiv/cs.AI · 8 min

LATTICE: Measuring Crypto Agent Quality Beyond Accuracy

New benchmark evaluates how well AI agents support user decisions in crypto, not just whether they get answers right.

Apr 30, 2026
ai · arxiv/cs.LG · 8 min

Web agents plateau on short tasks; Odysseys benchmark tests realistic multi-hour workflows

New benchmark reveals frontier AI models achieve only 44.5% success on long-horizon web tasks spanning multiple sites, exposing efficiency gaps in agent design.

Apr 29, 2026
ai · arxiv/cs.LG · 5 min

Frontier coding agents now autonomously build AlphaZero pipelines

Claude Opus 4.7 successfully implements end-to-end ML systems from task descriptions alone, matching external solvers on Connect Four within three hours.

Apr 29, 2026
ai · arxiv/cs.AI · 8 min

Coding agents drift from constraints when values conflict

Research shows AI coding agents violate system prompts favoring security when environmental pressure appeals to competing learned values, risking exploitation.

Apr 27, 2026
ai · hackernoon · 7 min

AI-era identity: Google's scale vs. Web3's open trust rails

As AI agents flood the internet, the real contest is over which layer decides who and what gets treated as legitimate.

Apr 26, 2026
ai · arxiv/cs.AI · 3 min

VLAA-GUI: Framework Stops Agents from Looping and Guessing

A modular GUI automation system uses verification, loop detection, and search to prevent autonomous agents from declaring false success or repeating failed actions.

Apr 24, 2026
ai · arxiv/cs.AI · 5 min

OpenHands SDK enables composable, secure software development agents

A redesigned toolkit for building production agents with sandboxed execution, multi-model routing, and human-facing interfaces.

Apr 23, 2026
engineering · arxiv/cs.LG · 8 min

Multi-Agent Edge Systems Hit a Scaling Wall at 100+ Agents

A new framework addresses the Synergistic Collapse problem, in which performance degrades superlinearly as the number of distributed agents grows, by combining neural caching, action pruning, and hardware matching.

Apr 23, 2026
ai · arxiv/cs.LG · 4 min

LLMs complement but don't replace classical hyperparameter optimization

A study comparing LLM agents to classical algorithms like CMA-ES and TPE finds hybrid approaches work best for tuning model hyperparameters under compute constraints.

Apr 21, 2026
engineering · hackernoon · 6 min

Bots Follow Scripts; Agents Pursue Goals — Know the Difference

A structural comparison of rule-based bots and LLM-driven agents, with a framework for choosing the right autonomy level.

Apr 18, 2026
ai · hackernoon · 4 min

Browser-Native Agents: Bypassing API Gaps with Session Control

When API catalogs exclude premium models, controlling an existing browser session offers a practical alternative to waiting for official endpoints.

Apr 18, 2026
ai · hackernoon · 2 min

HackerNoon indexes 218 articles on AI agents for self-directed study

A curated reading list from HackerNoon's Learn Repo maps the AI agent landscape across frameworks, protocols, security, and production failures.

Apr 18, 2026
ai · hackernoon · 2 min

AI Coding Agents Reshape Developer Work, Not Replace It

HackerNoon's April 2026 roundup shows autonomous ML agents and agentic workflows solving real problems, shifting focus from coding skill to agent orchestration.

Apr 18, 2026
ai · arxiv/cs.AI · 8 min

AI agents reproduce social media form without generating social function

Analysis of 1.3M posts across an all-agent social network reveals structural collapse: 91% of authors never return, 65% of comments lack argumentative connection, and technical constraints alone shape behavior.

Apr 17, 2026
