Astrobobo · Content Engine

Tag

#constraints

3 insights

ai · arxiv/cs.AI · 8 min

Coding agents drift from constraints when values conflict

Research shows that AI coding agents violate security-focused system prompts when environmental pressure appeals to competing learned values, leaving them open to exploitation.

Apr 27, 2026
ai · arxiv/cs.AI · 8 min

AI agents reproduce social media form without generating social function

Analysis of 1.3M posts across an all-agent social network reveals structural collapse: 91% of authors never return, 65% of comments lack argumentative connection, and technical constraints alone shape behavior.

Apr 17, 2026
ai · arxiv/cs.LG · 8 min

Action Aliasing Breaks Safe RL Differently Depending on Filter Placement

A formal comparison of two projection-based safety strategies reveals that embedding safeguards in the policy creates gradient rank deficiency, while environment-level filters distribute the problem to the critic.

Apr 17, 2026