Six Factors of Exemplar Effectiveness

The effectiveness of examples in few-shot prompting depends on six key design decisions. Small adjustments across these factors have been reported to shift accuracy by as much as 90 percentage points on some tasks.

Quantity: More examples generally improve performance, up to context window limits. Diminishing returns apply.

Order: Sequence matters. Alternating between positive and negative examples can reduce recency bias, where models overweight the final example.

Label Distribution: Balance the types of examples to avoid skewing predictions toward overrepresented categories.

Quality: Clear, accurate, relevant examples outperform noisy or ambiguous ones. Garbage in, garbage out.

Format: Consistent formatting across examples improves model comprehension. The model learns the pattern, not just the content.

Similarity: Examples that closely match the target task boost accuracy. Semantic similarity to the test case matters.
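The similarity factor can be made concrete with a retrieval step: score each candidate exemplar against the test input and keep the top matches. This is a minimal sketch using token-overlap (Jaccard) similarity; the `select_exemplars` helper and the sample pool are hypothetical, and production systems typically use embedding-based semantic similarity instead.

```python
def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity between two strings (crude stand-in for embeddings)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0

def select_exemplars(pool: list[dict], query: str, k: int = 3) -> list[dict]:
    """Return the k pool items whose 'input' is most similar to the query."""
    return sorted(pool, key=lambda ex: jaccard(ex["input"], query), reverse=True)[:k]

# Hypothetical sentiment-classification exemplar pool.
pool = [
    {"input": "The movie was thrilling", "label": "positive"},
    {"input": "Service was slow and rude", "label": "negative"},
    {"input": "The film bored me to tears", "label": "negative"},
    {"input": "Great acting and plot", "label": "positive"},
]

print(select_exemplars(pool, "The film was thrilling and great", k=2))
```

Swapping the similarity function for cosine distance over sentence embeddings changes the scoring, not the structure: the selection step stays the same.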

These factors interact: optimizing one while ignoring others yields inconsistent results.
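Because the factors interact, a prompt builder should apply several at once rather than tuning them in isolation. This sketch (the `build_prompt` helper and its template are illustrative assumptions, not a standard API) combines three of the factors above: balanced label distribution, alternating order, and one consistent format for every exemplar and the target.

```python
def build_prompt(exemplars: list[dict], query: str) -> str:
    """Interleave positive/negative exemplars and render all with one fixed template."""
    pos = [e for e in exemplars if e["label"] == "positive"]
    neg = [e for e in exemplars if e["label"] == "negative"]
    ordered = []
    for p, n in zip(pos, neg):  # label balance + alternating order in one pass
        ordered += [p, n]
    # Consistent "Input/Label" format, including the unanswered target at the end.
    lines = [f"Input: {e['input']}\nLabel: {e['label']}" for e in ordered]
    lines.append(f"Input: {query}\nLabel:")
    return "\n\n".join(lines)

# Hypothetical exemplars, two per label.
exemplars = [
    {"input": "Loved every minute", "label": "positive"},
    {"input": "A complete waste of time", "label": "negative"},
    {"input": "Would watch again", "label": "positive"},
    {"input": "Fell asleep halfway through", "label": "negative"},
]

print(build_prompt(exemplars, "Surprisingly good"))
```

Quantity and similarity slot in upstream of this function: retrieve the k most similar exemplars first, then pass them here for ordering and formatting.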

Related: 05-atom—in-context-learning-definition, 05-atom—prompt-component-taxonomy, 05-molecule—exemplar-design-principles