Ordered roughly by complexity. The first two are baselines every later technique is measured against. Each card cites a single primary source — read it before you commit to the pattern in production.
Zero-shot prompting
Brown et al. · 2020
Asking a model to perform a task using only a natural-language instruction, with no worked examples. Works well on tasks the base model has seen heavily in pre-training; degrades on tasks with unusual formats or domain conventions.
When to use: The task is common (summarize, translate, classify into obvious labels) and you want the shortest possible prompt.
Classify the sentiment of this review as positive, negative, or neutral:
"{review}"
Language Models are Few-Shot Learners ↗
Few-shot prompting · in-context learning
Brown et al. · 2020
Include a small number of input/output demonstrations in the prompt so the model infers the task pattern at inference time, with no weight updates. Sensitive to example selection, ordering, and label distribution; benefits saturate quickly past ~8-16 examples for most tasks.
When to use: Output format is non-obvious (custom JSON, niche labels, particular tone) and you can show 2-8 representative examples.
Q: 2+2 A: 4
Q: 7+5 A: 12
Q: 13+9 A:
Language Models are Few-Shot Learners ↗
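A minimal sketch of assembling the few-shot prompt above from a list of demonstrations. The function name `build_few_shot_prompt` and the `Q: ... A: ...` layout are illustrative choices taken from the card's example, not a prescribed format.

```python
def build_few_shot_prompt(examples, query):
    """Format (question, answer) demonstrations followed by the open query."""
    lines = [f"Q: {q} A: {a}" for q, a in examples]
    # The final line leaves the answer slot empty for the model to complete.
    lines.append(f"Q: {query} A:")
    return "\n".join(lines)

demos = [("2+2", "4"), ("7+5", "12")]
prompt = build_few_shot_prompt(demos, "13+9")
print(prompt)
```

Keeping demonstrations in a list makes it easy to vary their selection and order, which the card notes the technique is sensitive to.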
Chain-of-Thought (CoT)
Wei et al. · 2022
Prompt the model to produce intermediate reasoning steps before its final answer, typically by showing exemplars that include such steps. Gains are reported mainly on arithmetic, commonsense, and symbolic reasoning, and only at sufficient model scale; on smaller models it can have no effect or reduce accuracy.
When to use: Multi-step arithmetic, logic, or planning on a capable model where you can tolerate longer outputs.
Q: Roger has 5 balls. He buys 2 cans of 3 balls each. How many balls?
A: Roger starts with 5. 2 cans of 3 is 6. 5 + 6 = 11. Answer: 11.
Chain-of-Thought Prompting Elicits Reasoning ↗
Self-Consistency
Wang et al. · 2022
Sample multiple chain-of-thought reasoning paths at non-zero temperature, then take a majority vote over the final answers instead of using a single greedy decode. Improves reasoning accuracy but multiplies inference cost by the sample count.
When to use: Reasoning tasks where the answer is a short discrete value and you can afford N parallel completions.
# Run the same CoT prompt N times at temperature=0.7
final_answer = majority_vote([extract(c) for c in completions])
Self-Consistency Improves CoT Reasoning ↗
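One way the `extract` and `majority_vote` helpers named in the snippet above could look. Both are assumptions for illustration: the completions are hard-coded stand-ins for N sampled model outputs, and the `Answer:` marker follows the chain-of-thought example earlier in this list.

```python
import re
from collections import Counter

def extract(completion):
    """Return the text after the last 'Answer:' marker, or None if absent."""
    matches = re.findall(r"Answer:\s*([^\n]+)", completion)
    return matches[-1].strip().rstrip(".") if matches else None

def majority_vote(answers):
    """Return the most frequent non-None answer, or None if there is none."""
    counted = Counter(a for a in answers if a is not None)
    return counted.most_common(1)[0][0] if counted else None

# Stand-ins for N sampled completions; a real run would call a model N times.
completions = [
    "Roger starts with 5. 2 cans of 3 is 6. 5 + 6 = 11. Answer: 11.",
    "5 + 2 + 3 = 10. Answer: 10.",
    "Two cans add 6 balls, so 5 + 6 = 11. Answer: 11.",
]
final_answer = majority_vote([extract(c) for c in completions])
print(final_answer)
```

Voting over extracted answers rather than full completions is what lets differently worded reasoning paths agree.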
Tree-of-Thoughts (ToT)
Yao et al. · 2023
Generalizes chain-of-thought into a search tree where the model proposes multiple candidate “thoughts” at each step, evaluates them, and expands or backtracks. Requires an explicit controller (BFS/DFS plus a value function) wrapped around the model.
When to use: Problems where solutions need exploration and the model can reasonably evaluate its own partial progress.
Step: "Propose 3 next moves toward the goal. One-line rationale each."
Eval: "Rate each candidate sure / maybe / impossible toward solving {task}."
Loop: expand top-k, prune the rest, repeat until depth D.
Tree of Thoughts: Deliberate Problem Solving ↗
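A rough sketch of the breadth-first controller described above, under stated assumptions: `propose`, `score`, and `is_goal` are hypothetical callables that would wrap model calls in practice, and here they are replaced by a toy letter-guessing task so the loop can run on its own. This variant prunes to the top-k at each level rather than backtracking.

```python
def tree_of_thoughts_bfs(root, propose, score, is_goal, beam_width=2, max_depth=3):
    """Breadth-first search over partial solutions, keeping the top-k per level."""
    frontier = [root]
    for _ in range(max_depth):
        # Expand: collect candidate next thoughts from every kept state.
        candidates = [child for state in frontier for child in propose(state)]
        if not candidates:
            break
        # Evaluate and prune: keep only the beam_width highest-scoring candidates.
        candidates.sort(key=score, reverse=True)
        frontier = candidates[:beam_width]
        for state in frontier:
            if is_goal(state):
                return state
    return None

# Toy stand-ins for the model calls (all hypothetical).
TARGET = "cat"

def propose(state):
    # "Propose next moves": extend the partial string by one letter.
    return [state + ch for ch in "act"]

def score(state):
    # "Rate each candidate": number of positions matching the target.
    return sum(1 for a, b in zip(state, TARGET) if a == b)

def is_goal(state):
    return state == TARGET

print(tree_of_thoughts_bfs("", propose, score, is_goal))
```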
ReAct
Yao et al. · 2022
Interleaves reasoning traces (Thought:) with tool-use actions (Action:) and their results (Observation:) in a single loop, so the model can ground reasoning in retrieved facts.
When to use: Tasks where the model must look things up or operate on an environment rather than answer from parametric knowledge alone.
Thought: I need the population of Lyon.
Action: search("Lyon population")
Observation: ~522,000 (2023)
Answer: ~522,000
ReAct: Synergizing Reasoning and Acting ↗
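The loop above can be sketched as a small controller that parses each model step, runs the named tool, and appends the observation. Everything here is an assumption for illustration: `react_loop`, the `Action: name("arg")` syntax, the scripted model, and the lookup-table `search` tool stand in for a real model and a real retrieval backend.

```python
import re

def react_loop(question, model, tools, max_steps=5):
    """Alternate model output and tool calls until the model emits 'Answer:'."""
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        step = model(transcript)  # the model continues the transcript
        transcript += step + "\n"
        answer = re.search(r"Answer:\s*(.+)", step)
        if answer:
            return answer.group(1).strip(), transcript
        action = re.search(r'Action:\s*(\w+)\("(.*)"\)', step)
        if action:
            name, arg = action.groups()
            tool = tools.get(name)
            result = tool(arg) if tool else f"unknown tool {name}"
            # Feed the tool result back so the next step can use it.
            transcript += f"Observation: {result}\n"
    return None, transcript

# Hypothetical stand-ins: a scripted "model" and a canned "search" tool.
def scripted_model(transcript):
    if "Observation:" in transcript:
        return "Answer: ~522,000"
    return 'Thought: I need the population of Lyon.\nAction: search("Lyon population")'

tools = {"search": lambda query: "~522,000 (2023)"}
answer, transcript = react_loop("What is the population of Lyon?", scripted_model, tools)
print(answer)
```

The `max_steps` cap is a design choice to stop a model that never produces an answer.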
Reflexion
Shinn et al. · 2023
After an agent attempt fails (per an external evaluator or unit test), the model writes a short natural-language critique of what went wrong and stores it in an episodic memory that is prepended to the next attempt.
When to use: Agentic loops with a checkable verdict per attempt (passing tests, reaching a goal state) and budget for multiple tries.
attempt_1 -> fail
reflect: "I assumed the index was 1-based; it is 0-based."
attempt_2: <prior reflections + original task>
Reflexion: Verbal Reinforcement Learning ↗
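A minimal sketch of the retry loop described above. The callables `attempt`, `evaluate`, and `reflect` are hypothetical placeholders for the model calls and the external checker; the toy versions below only simulate a model that corrects an off-by-one error once a reflection mentions it.

```python
def reflexion_loop(task, attempt, evaluate, reflect, max_tries=3):
    """Retry a task, feeding earlier self-critiques into each new attempt."""
    reflections = []  # episodic memory of critiques
    for _ in range(max_tries):
        output = attempt(task, reflections)  # prior reflections + original task
        if evaluate(output):  # external check, e.g. a unit test
            return output, reflections
        reflections.append(reflect(task, output))
    return None, reflections

# Hypothetical stand-ins for the model calls and the evaluator.
items = ["a", "b", "c"]

def attempt(task, reflections):
    # Pretend the model fixes its indexing once a reflection mentions 0-based.
    zero_based = any("0-based" in r for r in reflections)
    return items[0] if zero_based else items[1]

def evaluate(output):
    return output == "a"  # the task: return the first item

def reflect(task, output):
    return "I assumed the index was 1-based; it is 0-based."

result, memory = reflexion_loop("return the first item", attempt, evaluate, reflect)
print(result, len(memory))
```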
Role / persona prompting
Anthropic docs
Set a role for the model in the system prompt to focus its tone, vocabulary, and default behaviors. Even a single sentence shifts behavior measurably, but role prompting does not unlock new capabilities.
When to use: You want a consistent voice or domain framing across many turns without restating it each message.
system: You are a senior SRE running a calm, blameless post-mortem.
Prefer concrete timelines and falsifiable claims.
user: Here is the incident transcript: ...
Anthropic: Prompting best practices ↗
Chain-of-Density
Adams et al. · 2023
Iteratively rewrites a summary at fixed length, each pass adding 1-3 previously missing salient entities without growing the word count, so the final summary is denser and more abstractive.
When to use: Producing a short, information-dense summary of a long article when entity coverage matters more than narrative flow.
Summarize the article in 5 sentences. Then repeat 4 more times.
Each iteration: keep length identical, add 1-3 missing salient
entities, remove filler. Output all 5 versions as a JSON list.
From Sparse to Dense: Chain of Density ↗
Structured output · JSON schema mode
OpenAI · 2024
Constrains decoding so the response is guaranteed to be valid JSON matching a supplied JSON Schema. Adherence is enforced at the decoding layer, not by prompt instruction.
When to use: Any downstream code path that needs to JSON.parse the response or hand it to a typed function. Load-bearing for multi-agent systems.
response_format = { "type": "json_schema",
"json_schema": { "name": "Ticket", "strict": true, ... } }
OpenAI: Structured Outputs in the API ↗
Prompt chaining
Anthropic docs
Split a task into a sequence of separate model calls, where each call's output feeds the next. The most common pattern is self-correction: draft → critique → revise.
When to use: The pipeline has distinct stages with different instructions, you want to evaluate or gate between them, or a single prompt is hitting context/quality limits.
call_1: extract claims from {doc} -> claims.json
call_2: for each claim, verify against {source} -> verdicts.json
call_3: write report using verdicts.json
Anthropic: Chain complex prompts ↗
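The three-call pipeline above could be wired together roughly as follows. Each stage function is a hypothetical stand-in for a separate model call (plain string logic here, so the sketch runs without a model), and the empty-claims check illustrates gating between stages.

```python
def extract_claims(doc):
    # Stand-in for call_1: treat each sentence as one claim.
    return [s.strip() for s in doc.split(".") if s.strip()]

def verify_claim(claim, source):
    # Stand-in for call_2: a claim counts as supported if it appears in the source.
    return {"claim": claim, "supported": claim in source}

def write_report(verdicts):
    # Stand-in for call_3: summarize the verdicts.
    supported = sum(v["supported"] for v in verdicts)
    return f"{supported} of {len(verdicts)} claims supported"

def run_chain(doc, source):
    """Run the stages in order; each stage consumes the previous stage's output."""
    claims = extract_claims(doc)
    if not claims:  # gate: stop early instead of verifying nothing
        return "no claims found"
    verdicts = [verify_claim(c, source) for c in claims]
    return write_report(verdicts)

report = run_chain("The sky is blue. Cats can fly.", "Reference text: The sky is blue")
print(report)
```

Because each stage is its own function, intermediate outputs can be logged or evaluated before the next call runs.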