Prompting Technique Taxonomy
The Framework
A systematic classification of 58 text-based prompting techniques organized into six problem-solving categories. Each category represents a different strategy for eliciting improved outputs from language models.
The Six Categories
1. Few-Shot Prompting (In-Context Learning) Provide examples demonstrating desired behavior. The model learns the task pattern from the examples themselves. Techniques include K-Nearest Neighbor selection, Vote-K, Self-Generated ICL, and various exemplar optimization approaches.
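As a concrete illustration, here is a minimal sketch of KNN exemplar selection feeding a few-shot prompt. The labeled pool is hypothetical, and stdlib string similarity (difflib) stands in for the embedding-based similarity a real implementation would use.

```python
from difflib import SequenceMatcher

# Hypothetical labeled pool; a real system draws from a training set.
POOL = [
    ("The movie was a delight.", "positive"),
    ("Service was slow and rude.", "negative"),
    ("An instant classic.", "positive"),
    ("I want my money back.", "negative"),
]

def knn_exemplars(query: str, k: int = 2) -> list:
    # Rank pool items by surface similarity to the query; keep the top k.
    ranked = sorted(
        POOL,
        key=lambda ex: SequenceMatcher(None, query, ex[0]).ratio(),
        reverse=True,
    )
    return ranked[:k]

def build_few_shot_prompt(query: str, k: int = 2) -> str:
    # Exemplars first, then the unanswered query in the same format.
    blocks = [f"Text: {text}\nLabel: {label}" for text, label in knn_exemplars(query, k)]
    blocks.append(f"Text: {query}\nLabel:")
    return "\n\n".join(blocks)

print(build_few_shot_prompt("The acting was wonderful."))
```

Keeping the exemplar format identical to the query format is the point: the model completes the pattern rather than following an instruction.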
2. Zero-Shot Prompting Rely on instructions alone without examples. Includes role prompting, style prompting, emotion prompting, rephrase-and-respond, and self-ask techniques. Works when the task description is sufficient to specify behavior.
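A minimal sketch of what this looks like in practice, combining role and style prompting in one template. The template text itself is illustrative, not canonical.

```python
# Role and style instructions replace exemplars entirely.
ZERO_SHOT_TEMPLATE = (
    "You are an experienced copy editor.\n"                # role prompting
    "Rewrite the text below in plain, concise English.\n"  # style prompting
    "Text: {text}\n"
    "Rewrite:"
)

prompt = ZERO_SHOT_TEMPLATE.format(
    text="Utilize the aforementioned methodology going forward."
)
print(prompt)
```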
3. Thought Generation Encourage explicit reasoning before answering. Chain-of-Thought (CoT) is the canonical example: show or elicit intermediate reasoning steps before the final answer. Includes zero-shot CoT (“let’s think step by step”), step-back prompting, analogical prompting, and tabular CoT variations.
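The zero-shot CoT variant is simple enough to show directly. A minimal sketch, using the standard trigger phrase from the literature:

```python
def zero_shot_cot(question: str) -> str:
    # The appended trigger phrase elicits intermediate reasoning steps.
    return f"Q: {question}\nA: Let's think step by step."

print(zero_shot_cot(
    "A bat and a ball cost $1.10 in total. The bat costs $1.00 "
    "more than the ball. How much does the ball cost?"
))
```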
4. Decomposition Break complex tasks into manageable subtasks. Techniques include least-to-most prompting, plan-and-solve, tree-of-thought, and skeleton-of-thought. Effective when problems have natural hierarchical structure.
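A minimal least-to-most sketch follows. `call_model` is a placeholder for any text-completion function (`str -> str`), an assumption rather than a real API, and the two-stage prompts are illustrative.

```python
from typing import Callable

def least_to_most(problem: str, call_model: Callable[[str], str]) -> str:
    # Stage 1: ask the model to decompose the problem into subquestions.
    plan = call_model(f"List, one per line, the subquestions needed to solve: {problem}")
    # Naive parse: one subquestion per line, optional leading dash.
    subquestions = [line.strip("- ").strip() for line in plan.splitlines() if line.strip()]

    # Stage 2: answer each subquestion, feeding earlier answers forward.
    context = f"Problem: {problem}\n"
    for sub in subquestions:
        answer = call_model(f"{context}Subquestion: {sub}\nAnswer:")
        context += f"Subquestion: {sub}\nAnswer: {answer}\n"

    # Final pass conditioned on all solved subproblems.
    return call_model(f"{context}Final answer to the original problem:")
```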
5. Ensembling Generate multiple outputs and aggregate. Techniques include self-consistency (vote across reasoning paths), prompt paraphrasing (vary the prompt itself), and mixture of reasoning experts. Trades compute for reliability.
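A minimal self-consistency sketch, assuming a nondeterministic `sample_model` placeholder (`str -> str`) and the convention that the final line of each completion carries the answer:

```python
from collections import Counter
from typing import Callable

def self_consistency(prompt: str, sample_model: Callable[[str], str], n: int = 5) -> str:
    answers = []
    for _ in range(n):
        # Each call samples an independent reasoning path.
        completion = sample_model(prompt + "\nLet's think step by step.")
        # Assumed convention: the last nonempty line holds the final answer.
        answers.append((completion.strip().splitlines() or [""])[-1])
    # Majority vote across the sampled reasoning paths.
    return Counter(answers).most_common(1)[0][0]
```

The vote is what buys reliability: errors on individual reasoning paths tend to disagree with each other, while correct paths tend to converge on the same answer.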
6. Self-Criticism Have the model evaluate and refine its own outputs. Techniques include self-refine, chain-of-verification, and self-calibration. Leverages the model’s ability to critique as well as generate.
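A minimal self-refine sketch, again assuming a placeholder `call_model` completion function; the critique and rewrite prompts are illustrative.

```python
from typing import Callable

def self_refine(task: str, call_model: Callable[[str], str], rounds: int = 2) -> str:
    draft = call_model(f"Task: {task}\nResponse:")
    for _ in range(rounds):
        # Critique pass: the model lists concrete flaws in its own draft.
        critique = call_model(
            f"Task: {task}\nResponse: {draft}\nList the concrete flaws in this response:"
        )
        # Refine pass: rewrite the draft to address the listed flaws.
        draft = call_model(
            f"Task: {task}\nResponse: {draft}\nFlaws: {critique}\n"
            "Rewrite the response, fixing every listed flaw:"
        )
    return draft
```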
When to Use What
- Clear task, no good examples available: Zero-shot with explicit instructions
- Have good examples: Few-shot, optimize exemplar design
- Reasoning-heavy problems: Few-shot Chain-of-Thought
- Complex multi-step tasks: Decomposition approaches
- Need reliability over speed: Ensembling
- Outputs need refinement: Self-criticism loops (see the decision-rule sketch after this list)
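Read as code, these heuristics form an ordered decision rule. A minimal sketch; the priority order (reliability first, then refinement, structure, reasoning, examples) is an assumption, since the list itself does not rank the conditions.

```python
def pick_category(*, has_examples: bool, multi_step: bool, reasoning_heavy: bool,
                  needs_reliability: bool, needs_refinement: bool) -> str:
    # Checked in an assumed priority order; the first matching condition wins.
    if needs_reliability:
        return "ensembling"
    if needs_refinement:
        return "self-criticism"
    if multi_step:
        return "decomposition"
    if reasoning_heavy and has_examples:
        return "few-shot chain-of-thought"
    if has_examples:
        return "few-shot"
    return "zero-shot"

print(pick_category(has_examples=True, multi_step=False, reasoning_heavy=True,
                    needs_reliability=False, needs_refinement=False))
# -> few-shot chain-of-thought
```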
Limitations
This taxonomy reflects the literature as of 2024; new techniques emerge constantly. Categories overlap, and many effective prompts combine techniques from several categories. The taxonomy describes what exists, not what will work best for any specific application.
Technique popularity and measured effectiveness often diverge, so treat popularity with skepticism and validate techniques empirically for your use case.
Related: 05-molecule—exemplar-design-principles, 05-atom—prompting-vs-prompt-engineering, 05-atom—ai-vs-human-prompt-optimization