Self-Consistency Through Diverse Sampling

When a problem has multiple valid reasoning paths, sampling diverse approaches and finding consensus often beats picking the “best” single approach.

The self-consistency technique generates multiple reasoning chains by sampling from the model’s decoder, then identifies the most consistent answer by marginalizing across chains.

The insight: problems requiring thoughtful analysis often entail greater reasoning diversity. A single greedy path may land on a locally-optimal-but-wrong answer. Diversity plus voting surfaces the answer that survives multiple approaches.

This is the wisdom-of-crowds principle applied to a single model’s outputs.

Benchmark gains over baseline CoT:

  • GSM8K: +17.9%
  • SVAMP: +11.0%
  • AQuA: +12.2%
  • StrategyQA: +6.4%

Related: 05-molecule—chain-of-thought-prompting