Six Factors of Exemplar Effectiveness
The effectiveness of examples in few-shot prompting depends on six key design decisions. Reported experiments show these choices can shift accuracy by tens of percentage points, in extreme cases as much as 90.
Quantity: More examples generally improve performance, up to context window limits. Diminishing returns apply.
Order: Sequence matters. Alternating between positive and negative examples can reduce recency bias where models overweight the final example.
Label Distribution: Balance the types of examples to avoid skewing predictions toward overrepresented categories.
Quality: Clear, accurate, relevant examples outperform noisy or ambiguous ones. Garbage in, garbage out.
Format: Consistent formatting across examples improves model comprehension. The model learns the pattern, not just the content.
Similarity: Examples that closely match the target task boost accuracy. Semantic similarity to the test case matters.
These factors interact: optimizing one while ignoring the others yields inconsistent results.
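Several of the factors above can be operationalized in a selection routine. The sketch below is a minimal illustration, not a prescribed method: `build_fewshot_prompt` is a hypothetical helper, and Jaccard word overlap stands in for a real embedding-based similarity model. It picks exemplars similar to the query, balances labels, interleaves them to counter recency bias, and renders every shot with one consistent template.

```python
from collections import defaultdict

def jaccard(a: str, b: str) -> float:
    """Crude lexical similarity; a stand-in for embedding similarity."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

def build_fewshot_prompt(pool: list[dict], query: str, k: int = 4) -> str:
    """Hypothetical exemplar selector touching four of the six factors."""
    # Similarity: rank the pool by closeness to the target query.
    ranked = sorted(pool, key=lambda ex: jaccard(ex["text"], query), reverse=True)
    # Label distribution: keep the same number of exemplars per label.
    by_label = defaultdict(list)
    for ex in ranked:
        by_label[ex["label"]].append(ex)
    labels = sorted(by_label)
    per_label = k // len(labels)  # assumes the pool covers each label
    # Order: interleave labels so no single label sits last.
    chosen = []
    for i in range(per_label):
        for lab in labels:
            chosen.append(by_label[lab][i])
    # Format: one uniform template for every exemplar.
    shots = "\n".join(f"Input: {ex['text']}\nLabel: {ex['label']}" for ex in chosen)
    return f"{shots}\nInput: {query}\nLabel:"

pool = [
    {"text": "the battery died after one hour", "label": "negative"},
    {"text": "screen quality is stunning", "label": "positive"},
    {"text": "shipping was slow and the box was crushed", "label": "negative"},
    {"text": "setup took thirty seconds, love it", "label": "positive"},
]
print(build_fewshot_prompt(pool, "the battery lasts all day", k=4))
```

Quantity and quality are left to the caller: `k` caps how many shots fit the context window, and the pool itself must already contain clean, accurate examples.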
Related: 05-atom—in-context-learning-definition, 05-atom—prompt-component-taxonomy, 05-molecule—exemplar-design-principles