ReAct: Synergizing Reasoning and Acting in Language Models
Authors: Shunyu Yao, Jeffrey Zhao, Dian Yu, Nan Du, Izhak Shafran, Karthik Narasimhan, Yuan Cao
Venue: ICLR 2023
Institutions: Princeton University, Google Research (Brain team)
Core Argument
LLM reasoning (chain-of-thought) and acting (action generation) have been studied separately, but humans naturally interleave them. ReAct prompts LLMs to generate both reasoning traces and task-specific actions in an interleaved manner, creating synergy: reasoning helps induce, track, and update action plans, while actions let the model gather additional information from external sources (e.g., a Wikipedia API or an environment).
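The interleaved loop can be sketched as follows. This is a minimal illustration, not the paper's implementation; `llm` (a prompt-to-completion callable), `env.step`, and the `finish[...]` terminal-action convention are assumed interfaces, though the thought/action/observation labels mirror the paper's prompt format.

```python
def react_episode(llm, env, question, max_steps=8):
    """Interleave free-form thoughts with environment actions (ReAct-style sketch)."""
    trajectory = f"Question: {question}\n"
    for step in range(1, max_steps + 1):
        # Reasoning trace: updates the plan but does not touch the environment.
        thought = llm(trajectory + f"Thought {step}:")
        trajectory += f"Thought {step}: {thought}\n"
        # Action: grounded step that can fetch external information.
        action = llm(trajectory + f"Action {step}:")
        trajectory += f"Action {step}: {action}\n"
        if action.startswith("finish["):  # terminal action carries the answer
            return action[len("finish["):-1], trajectory
        observation = env.step(action)    # feedback re-enters the context
        trajectory += f"Observation {step}: {observation}\n"
    return None, trajectory
```

Because each observation is appended to the prompt before the next thought, reasoning stays grounded in retrieved facts rather than drifting into hallucination.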
Key Results
- HotpotQA: ReAct→CoT-SC achieves 35.1% EM vs. 33.4% for CoT-SC alone
- FEVER: CoT-SC→ReAct achieves 64.6% accuracy vs. 60.4% for CoT-SC
- ALFWorld: 71% success rate vs. 37% for imitation learning baseline (BUTLER)
- WebShop: 40% success rate vs. 28.7% for IL+RL methods
Key Findings
Failure Mode Analysis (HotpotQA, n=200):
- CoT hallucination rate: 56% of failures
- ReAct hallucination rate: 0% of failures
- ReAct reasoning error rate: 47% of failures (the rigid thought-action-observation structure reduces reasoning flexibility)
- CoT false positive rate: 14% vs. ReAct 6%
Sparse vs. Dense Reasoning:
- Knowledge tasks benefit from dense thought (thought-action-observation every step)
- Decision-making tasks benefit from sparse thought (only at key decision points)
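The two regimes differ only in trajectory format. The strings below are invented illustrations (not examples from the paper) contrasting dense prompting, where a thought precedes every action, with sparse prompting, where thoughts appear only at key decision points:

```python
# Dense thought (knowledge tasks, e.g. HotpotQA): reason before every action.
dense = (
    "Thought 1: I need to find who wrote the song first.\n"
    "Action 1: search[song title]\n"
    "Observation 1: The song was written by the band's guitarist.\n"
    "Thought 2: Now I should look up the guitarist's birthplace.\n"
    "Action 2: search[guitarist]\n"
)

# Sparse thought (decision-making tasks, e.g. ALFWorld): mostly act,
# thinking only when the plan needs updating.
sparse = (
    "Act 1: go to countertop 1\n"
    "Obs 1: On the countertop you see a mug.\n"
    "Think: I found the mug. Next I need to clean it in the sink.\n"
    "Act 2: take mug from countertop 1\n"
)
```

In long-horizon environments, a thought at every step would bloat the context and add little, so the model decides for itself when a thought is worth emitting.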
Theoretical Grounding
References Vygotsky's "inner speech" theory: verbal reasoning enables self-regulation, strategizing, and working-memory maintenance. The paper's cooking analogy: between actions, we reason in language to track progress, handle exceptions, adjust plans, and recognize when external information is needed.
Extracted Content
→ 05-atom—reasoning-grounding-tradeoff → 05-atom—hallucination-from-ungrounded-reasoning → 05-atom—sparse-vs-dense-reasoning → 05-molecule—thought-action-observation-pattern → 07-molecule—act-to-reason-reason-to-act