ReAct: Synergizing Reasoning and Acting in Language Models

Authors: Shunyu Yao, Jeffrey Zhao, Dian Yu, Nan Du, Izhak Shafran, Karthik Narasimhan, Yuan Cao
Venue: ICLR 2023
Institutions: Princeton University, Google Research (Brain team)

Core Argument

LLM reasoning (chain-of-thought) and acting (action generation) have mostly been studied separately, but humans naturally interleave them. ReAct prompts LLMs to generate reasoning traces and task-specific actions in an interleaved manner, creating synergy: reasoning helps induce, track, and update action plans, while actions let the model gather additional information from external sources.
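The interleaved loop can be sketched as follows. This is a minimal illustration, not the paper's code: `llm` and `run_action` are hypothetical stand-ins for a real model call and a real tool backend (e.g. a Wikipedia search API), with stub implementations so the sketch runs.

```python
import re

def llm(prompt: str) -> str:
    # Stub model: first looks something up, then finishes.
    # A real implementation would call an LLM with few-shot ReAct exemplars.
    if "Observation" not in prompt:
        return "Thought: I need more information.\nAction: Search[ReAct]"
    return "Thought: I have enough information.\nAction: Finish[interleaved reasoning and acting]"

def run_action(name: str, arg: str) -> str:
    # Stub tool environment; a real one would query an external source.
    return f"(search results for {arg!r})"

def react(question: str, max_steps: int = 5) -> str:
    """Thought-action-observation loop in the ReAct style."""
    prompt = f"Question: {question}\n"
    for _ in range(max_steps):
        step = llm(prompt)                      # model emits Thought + Action
        prompt += step + "\n"
        match = re.search(r"Action: (\w+)\[(.*?)\]", step)
        if not match:
            break
        name, arg = match.groups()
        if name == "Finish":                    # terminal action carries the answer
            return arg
        obs = run_action(name, arg)             # execute the tool...
        prompt += f"Observation: {obs}\n"       # ...and feed the result back
    return ""
```

The key design point is that observations re-enter the prompt, so each subsequent thought is grounded in freshly retrieved information rather than the model's parametric memory alone.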

Key Results

  • HotpotQA: ReAct→CoT-SC achieves 35.1% EM vs. 33.4% for CoT-SC alone
  • FEVER: CoT-SC→ReAct achieves 64.6% accuracy vs. 60.4% for CoT-SC
  • ALFWorld: 71% success rate vs. 37% for imitation learning baseline (BUTLER)
  • WebShop: 40% success rate vs. 28.7% for IL+RL methods

Key Findings

Failure Mode Analysis (HotpotQA, n=200):

  • CoT hallucination rate: 56% of failures
  • ReAct hallucination rate: 0% of failures
  • ReAct reasoning error rate: 47% (due to structural constraints reducing flexibility)
  • CoT false positive rate: 14% vs. ReAct 6%

Sparse vs. Dense Reasoning:

  • Knowledge tasks benefit from dense thought (thought-action-observation every step)
  • Decision-making tasks benefit from sparse thought (only at key decision points)
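The two regimes differ only in prompt format. The traces below are illustrative paraphrases in the paper's style (the exact exemplars come from HotpotQA and ALFWorld), showing a thought before every action versus thoughts only at key decision points:

```python
# Dense reasoning (knowledge tasks): a thought precedes every action.
dense_trace = """\
Thought 1: I need to search the Colorado orogeny.
Action 1: Search[Colorado orogeny]
Observation 1: The Colorado orogeny was an episode of mountain building...
Thought 2: The result does not mention the eastern sector, so I should look it up.
Action 2: Lookup[eastern sector]
"""

# Sparse reasoning (decision-making tasks): most steps are bare actions;
# a thought appears only when the plan needs updating.
sparse_trace = """\
Act 1: go to cabinet 1
Obs 1: On the cabinet 1, you see nothing.
Act 2: go to countertop 1
Obs 2: On the countertop 1, you see a pan 1.
Think: I found a pan. Next, I need to take it and heat it.
Act 3: take pan 1 from countertop 1
"""
```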

Theoretical Grounding

References Vygotsky's theory of "inner speech": verbal reasoning enables self-regulation, strategization, and working-memory maintenance. The paper's cooking analogy: between actions, we reason in language to track progress, handle exceptions, adjust plans, and recognize when external information is needed.

Extracted Content

  • 05-atom—reasoning-grounding-tradeoff
  • 05-atom—hallucination-from-ungrounded-reasoning
  • 05-atom—sparse-vs-dense-reasoning
  • 05-molecule—thought-action-observation-pattern
  • 07-molecule—act-to-reason-reason-to-act