ai · 8 min read · May 1, 2026

LLMs Need Feedback Loops to Keep Code and Theory Aligned

Researchers propose Comet-H, a system that orchestrates language models through iterative cycles to prevent hallucination and desynchronization in research software development.

Source: arxiv/cs.AI · Halley Young, Nikolaj Björner

LLMs drift when code, theory, and claims evolve separately; Comet-H couples them via iterative prompting and workspace state tracking.

  • LLMs generate code and text well but struggle when specifications change mid-project.
  • Hallucination accumulation: unsupported claims propagate across sessions without grounding.
  • Desynchronization: code, theory, and the model's internal world model fall out of sync.
  • Comet-H uses a contextual bandit to select the next prompt based on measured workspace deficits (a sketch of this selection loop follows the list).
  • A controller tracks unfinished work with a decay function and re-validates documentation against code (see the work-tracker sketch below).
  • The A3 static-analysis tool, built entirely within Comet-H, reached F1 = 0.768 versus a 0.364 baseline.
  • Audit-and-contraction passes dominate successful project trajectories in later phases.
  • Transparent scoring and fading work records make each prompt choice legible and bounded.
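
The summary describes prompt selection as a contextual bandit over workspace deficits, but the paper's exact policy and feature set are not reproduced here. Below is a minimal sketch assuming an epsilon-greedy bandit over four illustrative prompt "arms"; the arm names, deficit features, and the `EpsilonGreedyBandit` class are all hypothetical, not Comet-H's actual interface.

```python
import random

# Illustrative prompt "arms" a controller could choose between each cycle.
# These names are assumptions, not the paper's actual prompt set.
ARMS = ["write_code", "write_theory", "audit_and_contract", "sync_docs"]

def deficit_features(workspace: dict) -> dict:
    """Map workspace state to a per-arm deficit score (bigger = more needed).
    The feature names here are invented for illustration."""
    return {
        "write_code": len(workspace.get("unimplemented_claims", [])),
        "write_theory": len(workspace.get("unjustified_code", [])),
        "audit_and_contract": len(workspace.get("unsupported_claims", [])),
        "sync_docs": len(workspace.get("stale_docs", [])),
    }

class EpsilonGreedyBandit:
    """Epsilon-greedy contextual bandit: usually exploit the arm whose
    learned value, weighted by its current deficit, is highest; sometimes
    explore a random arm so no prompt type starves."""

    def __init__(self, arms, epsilon: float = 0.1):
        self.epsilon = epsilon
        self.value = {a: 0.0 for a in arms}  # running mean reward per arm
        self.count = {a: 0 for a in arms}

    def select(self, deficits: dict) -> str:
        if random.random() < self.epsilon:
            return random.choice(list(deficits))
        # Scale learned value by the current deficit so prompts that
        # target the largest gaps are preferred.
        return max(deficits, key=lambda a: (1.0 + self.value[a]) * deficits[a])

    def update(self, arm: str, reward: float) -> None:
        self.count[arm] += 1
        self.value[arm] += (reward - self.value[arm]) / self.count[arm]

# Usage: pick a prompt, run it, then reward the arm by how much it shrank
# the total deficit. Each choice is scored transparently and can be logged.
bandit = EpsilonGreedyBandit(ARMS)
workspace = {"unsupported_claims": ["claim-3"], "stale_docs": ["README"]}
arm = bandit.select(deficit_features(workspace))
bandit.update(arm, reward=1.0)
```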
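
Similarly, the controller's decay function is only named in the summary, not specified. Here is a sketch under the assumption of an exponential half-life decay: each unfinished-work record fades with age and is dropped once its salience falls below a threshold, which is one way to make the work record both fading and bounded. `WorkRecord`, `WorkTracker`, and the half-life parameter are illustrative names, not the paper's API, and the doc-to-code re-validation step is omitted.

```python
import time

class WorkRecord:
    """One unfinished-work item whose salience fades exponentially with age.
    The half-life form is an assumed decay; Comet-H's exact function may differ."""

    def __init__(self, description: str, half_life_s: float = 3600.0):
        self.description = description
        self.created = time.time()
        self.half_life_s = half_life_s

    def salience(self, now: float | None = None) -> float:
        age = (now if now is not None else time.time()) - self.created
        return 0.5 ** (age / self.half_life_s)  # halves every half_life_s

class WorkTracker:
    """Fading record of unfinished work: items below a salience threshold
    are dropped, bounding how much stale state can steer future prompts."""

    def __init__(self, threshold: float = 0.05):
        self.records: list[WorkRecord] = []
        self.threshold = threshold

    def note(self, description: str) -> None:
        self.records.append(WorkRecord(description))

    def pending(self) -> list[WorkRecord]:
        # Prune decayed items, then surface the most salient first.
        self.records = [r for r in self.records if r.salience() > self.threshold]
        return sorted(self.records, key=lambda r: -r.salience())

tracker = WorkTracker()
tracker.note("README claims O(n log n) but benchmark code is O(n^2)")
for record in tracker.pending():
    print(f"{record.salience():.2f}  {record.description}")
```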

Astrobobo tool mapping

  • Knowledge Capture: Record the three gaps you identified as structured items (claim, code reference, benchmark reference) and use this as your workspace-state baseline.
  • Focus Brief: Create a daily prompt checklist and review it before each coding session: (1) Does the README match the code? (2) Do the benchmarks support the claims? (3) What is unfinished?
  • Reading Queue: Queue the Comet-H paper and one recent LLM-orchestration paper (e.g., on agent loops) to study state-machine design patterns for LLM workflows.

Frequently asked

  • What is the difference between hallucination accumulation and desynchronization? Hallucination accumulation occurs when unsupported claims made by an LLM in one session are treated as fact in later sessions, propagating errors. Desynchronization happens when code, mathematical theory, and the model's internal understanding of the project fall out of alignment, causing the model to generate inconsistent or contradictory outputs. Both arise because LLMs lack persistent workspace state across sessions.
Cite
APA
Young, H., & Björner, N. (2026, May 1). LLMs Need Feedback Loops to Keep Code and Theory Aligned. Astrobobo Content Engine (rewrite of arxiv/cs.AI). https://astrobobo-content-engine.vercel.app/article/llms-need-feedback-loops-to-keep-code-and-theory-aligned-83b33c
MLA
Young, Halley, and Nikolaj Björner. "LLMs Need Feedback Loops to Keep Code and Theory Aligned." Astrobobo Content Engine, 1 May 2026, https://astrobobo-content-engine.vercel.app/article/llms-need-feedback-loops-to-keep-code-and-theory-aligned-83b33c. Based on "arxiv/cs.AI", https://arxiv.org/abs/2604.27209.
BibTeX
@misc{astrobobo_llms-need-feedback-loops-to-keep-code-and-theory-aligned-83b33c_2026,
  author       = {Halley Young and Nikolaj Bj\"orner},
  title        = {LLMs Need Feedback Loops to Keep Code and Theory Aligned},
  year         = {2026},
  url          = {https://astrobobo-content-engine.vercel.app/article/llms-need-feedback-loops-to-keep-code-and-theory-aligned-83b33c},
  note         = {Astrobobo rewrite of arxiv/cs.AI, https://arxiv.org/abs/2604.27209},
}
