Self-Consistency Through Diverse Sampling
When a problem has multiple valid reasoning paths, sampling diverse approaches and finding consensus often beats picking the “best” single approach.
The self-consistency technique generates multiple reasoning chains by sampling from the model’s decoder, then identifies the most consistent answer by marginalizing across chains.
The insight: problems requiring thoughtful analysis often entail greater reasoning diversity. A single greedy path may land on a locally-optimal-but-wrong answer. Diversity plus voting surfaces the answer that survives multiple approaches.
This is the wisdom-of-crowds principle applied to a single model’s outputs.
Benchmark gains over baseline CoT:
- GSM8K: +17.9%
- SVAMP: +11.0%
- AQuA: +12.2%
- StrategyQA: +6.4%