00 · reference

Knowledge base — a working field guide

A curated reference compiled from primary sources: vendor docs (Anthropic, OpenAI, LangChain), seminal arXiv papers, and Anthropic's multi-agent research write-up. Every claim links to the paper or doc it came from — no second-hand paraphrasing.

Contents

§1 — Foundations: why structure beats prose
§2 — Single-prompt techniques (11)
§3-§8 — Orchestration · prompts-as-code · escalation · context pipeline · agent tools · sources (Pro)

§1 · foundations

Why structure beats prose

A model that gets a wall of prose has to infer three things at once: who it's playing, what it's optimizing for, and what shape the output should take. A model that gets a frame only has to fill in the slots. The R-G-C-B-T-S frame (Role · Goal · Context · Bounds · Task · Success) is one such frame, but the deeper rule is simple: separate the slots so the model can't confuse them.
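One way to keep the slots separate is to hold them as named fields and render each under its own heading. The sketch below is only an illustration: the `Frame` class, the field names, and the `## HEADING` layout are assumptions, not something the guide prescribes.

```python
from dataclasses import dataclass, fields


@dataclass
class Frame:
    """Hypothetical container for the six R-G-C-B-T-S slots."""
    role: str
    goal: str
    context: str
    bounds: str
    task: str
    success: str

    def render(self) -> str:
        # Emit every slot under its own labeled heading, in declaration order,
        # so the model never has to guess where one slot ends and the next begins.
        parts = [
            f"## {f.name.upper()}\n{getattr(self, f.name).strip()}"
            for f in fields(self)
        ]
        return "\n\n".join(parts)


frame = Frame(
    role="You are a senior SRE.",
    goal="Produce a blameless post-mortem.",
    context="The incident transcript is attached below.",
    bounds="No speculation beyond the transcript.",
    task="Write a timeline and list root causes.",
    success="Every claim cites a transcript line.",
)
prompt = frame.render()
print(prompt)
```

Keeping the slots as separate fields also means an empty or missing slot fails at construction time instead of silently vanishing into prose.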

The rest of this guide is the toolkit you slot into that frame: techniques that change how the model reasons (chain-of-thought, self-consistency), how it acts (ReAct, reflexion), and how multiple models compose into a system bigger than any single context window.

§2 · single-prompt techniques

Eleven techniques worth knowing by name

Ordered roughly by complexity. The first two are baselines every later technique is measured against. Each card cites a single primary source — read it before you commit to the pattern in production.

Zero-shot prompting

Brown et al. · 2020

Asking a model to perform a task using only a natural-language instruction, with no worked examples. Works well on tasks the base model has seen heavily in pre-training; degrades on tasks with unusual formats or domain conventions.

When to use: The task is common (summarize, translate, classify into obvious labels) and you want the shortest possible prompt.

Classify the sentiment of this review as positive, negative, or neutral:
"{review}"
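In code, a zero-shot call is just a template fill plus some tolerance for how the model phrases its reply. A minimal sketch, assuming the reply contains one of the three labels somewhere in its text; `build_zero_shot_prompt` and `parse_label` are hypothetical helper names.

```python
from typing import Optional

LABELS = ("positive", "negative", "neutral")


def build_zero_shot_prompt(review: str) -> str:
    # Instruction only: the prompt carries no worked examples.
    return (
        "Classify the sentiment of this review as positive, negative, or neutral:\n"
        f'"{review}"'
    )


def parse_label(completion: str) -> Optional[str]:
    # Return whichever allowed label appears earliest in the reply, or None.
    text = completion.lower()
    hits = [(text.find(label), label) for label in LABELS if label in text]
    return min(hits)[1] if hits else None


prompt = build_zero_shot_prompt("Arrived late, but works perfectly.")
print(prompt)
```

Returning `None` for an unparseable reply lets the caller retry or escalate rather than silently accept a wrong label.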

Language Models are Few-Shot Learners ↗

Few-shot prompting · in-context learning

Brown et al. · 2020

Include a small number of input/output demonstrations in the prompt so the model infers the task pattern at inference time, with no weight updates. Sensitive to example selection, ordering, and label distribution; benefits saturate quickly past ~8-16 examples for most tasks.

When to use: Output format is non-obvious (custom JSON, niche labels, particular tone) and you can show 2-8 representative examples.

Q: 2+2   A: 4
Q: 7+5   A: 12
Q: 13+9  A:
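Because results depend on which demonstrations are shown and in what order, it helps to build the prompt from an explicit list instead of a hand-edited string. A minimal sketch; `build_few_shot_prompt` is a hypothetical helper and the `Q:`/`A:` layout simply mirrors the example above.

```python
from typing import List, Tuple


def build_few_shot_prompt(examples: List[Tuple[str, str]], query: str) -> str:
    # One "Q: ...   A: ..." line per demonstration, in the order given,
    # followed by the open query for the model to complete.
    lines = [f"Q: {q}   A: {a}" for q, a in examples]
    lines.append(f"Q: {query}   A:")
    return "\n".join(lines)


demos = [("2+2", "4"), ("7+5", "12")]
prompt = build_few_shot_prompt(demos, "13+9")
print(prompt)
```

With the demonstrations in a list, reordering or swapping examples to test sensitivity becomes a one-line change.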

Language Models are Few-Shot Learners ↗

Chain-of-Thought (CoT)

Wei et al. · 2022

Prompt the model to produce intermediate reasoning steps before its final answer, typically by showing exemplars that include such steps. Gains are reported mainly on arithmetic, commonsense, and symbolic reasoning, and only at sufficient model scale; smaller models may see no benefit or even be hurt by it.

When to use: Multi-step arithmetic, logic, or planning on a capable model where you can tolerate longer outputs.

Q: Roger has 5 balls. He buys 2 cans of 3 balls each. How many balls?
A: Roger starts with 5. 2 cans of 3 is 6. 5 + 6 = 11. Answer: 11.
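Since the reasoning precedes the answer, downstream code needs a way to pull the final value out of the completion. A minimal sketch, assuming the model follows the exemplar and ends with `Answer: <number>`; the helper names and the regular expression are illustrative.

```python
import re
from typing import Optional

COT_EXEMPLAR = (
    "Q: Roger has 5 balls. He buys 2 cans of 3 balls each. How many balls?\n"
    "A: Roger starts with 5. 2 cans of 3 is 6. 5 + 6 = 11. Answer: 11.\n"
)


def build_cot_prompt(question: str) -> str:
    # Show one worked exemplar with visible steps, then pose the new question.
    return f"{COT_EXEMPLAR}\nQ: {question}\nA:"


def extract_answer(completion: str) -> Optional[str]:
    # Use the last "Answer: <number>" so numbers inside the reasoning are ignored.
    matches = re.findall(r"Answer:\s*(-?[\d.,]*\d)", completion)
    return matches[-1] if matches else None


completion = "Roger starts with 5. 2 cans of 3 is 6. 5 + 6 = 11. Answer: 11."
print(extract_answer(completion))
```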

Chain-of-Thought Prompting Elicits Reasoning ↗

Self-Consistency

Wang et al. · 2022

Sample multiple chain-of-thought reasoning paths at non-zero temperature, then take a majority vote over the final answers instead of using a single greedy decode. Improves reasoning accuracy but multiplies inference cost by the sample count.

When to use: Reasoning tasks where the answer is a short discrete value and you can afford N parallel completions.

# Run the same CoT prompt N times at temperature=0.7
final_answer = majority_vote([extract(c) for c in completions])
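A runnable version of the voting step might look like the sketch below. The sampler is a stand-in that replays canned completions; a real one would call the model at non-zero temperature. The `majority_vote` and `extract` bodies are assumptions about what those helpers do.

```python
from collections import Counter
from typing import Callable, List


def extract(completion: str) -> str:
    # Assumes each completion ends with "Answer: <value>".
    return completion.rsplit("Answer:", 1)[-1].strip()


def majority_vote(answers: List[str]) -> str:
    # The most frequent answer wins; ties go to the answer seen first.
    return Counter(answers).most_common(1)[0][0]


def self_consistent_answer(sample: Callable[[], str], n: int = 5) -> str:
    # Draw n independent reasoning paths, then vote over the final answers only.
    completions = [sample() for _ in range(n)]
    return majority_vote([extract(c) for c in completions])


# Stand-in sampler replaying canned completions in place of real model calls.
canned = iter([
    "5 + 6 = 11. Answer: 11",
    "5 + 2 * 3 = 21. Answer: 21",
    "2 cans of 3 is 6, plus 5. Answer: 11",
    "Answer: 11",
    "5 + 3 = 8. Answer: 8",
])
final_answer = self_consistent_answer(lambda: next(canned), n=5)
print(final_answer)
```

Voting over the extracted answers, not the full reasoning text, is what lets differently worded paths agree.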

Self-Consistency Improves CoT Reasoning ↗

Tree-of-Thoughts (ToT)

Yao et al. · 2023

Generalizes chain-of-thought into a search tree where the model proposes multiple candidate “thoughts” at each step, evaluates them, and expands or backtracks. Requires an explicit controller (BFS/DFS plus a value function) wrapped around the model.

When to use: Problems where solutions need exploration and the model can reasonably evaluate its own partial progress.

Step:  "Propose 3 next moves toward the goal. One-line rationale each."
Eval:  "Rate each candidate sure / maybe / impossible toward solving {task}."
Loop:  expand top-k, prune the rest, repeat until depth D.
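The controller loop can be sketched as a breadth-first beam search. In the sketch below, `propose` and `evaluate` are plain functions on a toy number problem; in practice each would wrap a model call like the Step and Eval prompts above. All names and the toy task are illustrative assumptions.

```python
from typing import Callable, List, Tuple

State = Tuple[int, ...]


def tree_of_thoughts(
    root: State,
    propose: Callable[[State], List[State]],
    evaluate: Callable[[State], float],
    top_k: int = 2,
    depth: int = 3,
) -> State:
    frontier = [root]
    for _ in range(depth):
        # Expand every surviving state into its candidate next thoughts.
        candidates = [child for state in frontier for child in propose(state)]
        if not candidates:
            break
        # Keep the top-k candidates by score and prune the rest.
        candidates.sort(key=evaluate, reverse=True)
        frontier = candidates[:top_k]
    return max(frontier, key=evaluate)


TARGET = 7  # toy goal: pick three steps from {1, 2, 3} that sum to TARGET


def propose(state: State) -> List[State]:
    return [state + (step,) for step in (1, 2, 3)]


def evaluate(state: State) -> float:
    # Higher is better: the negative distance from the target sum.
    return -abs(TARGET - sum(state))


best = tree_of_thoughts((), propose, evaluate, top_k=2, depth=3)
print(best, sum(best))
```

This variant prunes by score alone and never backtracks; a depth-first controller would keep a stack and return to earlier states when a branch is rated impossible.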

Tree of Thoughts: Deliberate Problem Solving ↗

ReAct

Yao et al. · 2022

Interleaves reasoning traces (Thought:) with tool-use actions (Action:) and their results (Observation:) in a single loop, so the model can ground reasoning in retrieved facts.

When to use: Tasks where the model must look things up or operate on an environment rather than answer from parametric knowledge alone.

Thought: I need the population of Lyon.
Action: search("Lyon population")
Observation: ~522,000 (2023)
Answer: ~522,000
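The loop around the model can be sketched as follows. The model here is a scripted stand-in and `search` is a fake tool that returns a fixed string; the `Action: name("arg")` syntax and all function names are assumptions made for illustration.

```python
import re
from typing import Callable, Dict, Optional, Tuple

ACTION_RE = re.compile(r'Action:\s*(\w+)\("(.*?)"\)')
ANSWER_RE = re.compile(r"^Answer:\s*(.+)$", re.MULTILINE)


def react_loop(
    model: Callable[[str], str],
    tools: Dict[str, Callable[[str], str]],
    question: str,
    max_steps: int = 5,
) -> Tuple[Optional[str], str]:
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        step = model(transcript)
        transcript += step + "\n"
        answer = ANSWER_RE.search(step)
        if answer:
            return answer.group(1).strip(), transcript
        action = ACTION_RE.search(step)
        if action is None:
            break  # neither an action nor an answer: stop instead of looping
        name, arg = action.groups()
        tool = tools.get(name)
        observation = tool(arg) if tool else f"unknown tool {name!r}"
        # Feed the tool result back so the next thought can build on it.
        transcript += f"Observation: {observation}\n"
    return None, transcript


def scripted_model(transcript: str) -> str:
    # Stand-in for a real model: act first, then answer from the observation.
    if "Observation:" not in transcript:
        return 'Thought: I need the population of Lyon.\nAction: search("Lyon population")'
    return "Answer: " + transcript.rsplit("Observation: ", 1)[1].strip()


tools = {"search": lambda query: "~522,000 (2023)"}
answer, transcript = react_loop(scripted_model, tools, "What is the population of Lyon?")
print(transcript)
```

The `max_steps` cap matters in practice: without it, a model that never emits `Answer:` would keep calling tools indefinitely.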

ReAct: Synergizing Reasoning and Acting ↗

Reflexion

Shinn et al. · 2023

After an agent attempt fails (per an external evaluator or unit test), the model writes a short natural-language critique of what went wrong and stores it in an episodic memory that is prepended to the next attempt.

When to use: Agentic loops with a checkable verdict per attempt (passing tests, reaching a goal state) and budget for multiple tries.

attempt_1 -> fail
reflect:   "I assumed the index was 1-based; it is 0-based."
attempt_2: <prior reflections + original task>
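A minimal sketch of the retry loop, under the assumption that the evaluator returns a pass/fail flag plus feedback. The three callables are toy stand-ins: a real `attempt` and `reflect` would be model calls, and `evaluate` would be a test suite or goal check.

```python
from typing import Callable, List, Optional, Tuple


def reflexion_loop(
    attempt: Callable[[str, List[str]], str],
    evaluate: Callable[[str], Tuple[bool, str]],
    reflect: Callable[[str, str], str],
    task: str,
    max_tries: int = 3,
) -> Tuple[Optional[str], List[str]]:
    memory: List[str] = []  # episodic memory of past reflections
    for _ in range(max_tries):
        output = attempt(task, memory)  # earlier reflections travel with the task
        passed, feedback = evaluate(output)
        if passed:
            return output, memory
        memory.append(reflect(output, feedback))
    return None, memory


ITEMS = ["a", "b", "c"]


def attempt(task: str, memory: List[str]) -> str:
    # Toy agent: wrongly uses index 1 until a reflection corrects it.
    index = 0 if any("0-based" in note for note in memory) else 1
    return ITEMS[index]


def evaluate(output: str) -> Tuple[bool, str]:
    return output == "a", f"expected 'a', got {output!r}"


def reflect(output: str, feedback: str) -> str:
    return "I assumed the index was 1-based; it is 0-based."


result, memory = reflexion_loop(attempt, evaluate, reflect, "Return the first item.")
print(result, memory)
```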

Reflexion: Verbal Reinforcement Learning ↗

Role / persona prompting

Anthropic docs

Set a role for the model in the system prompt to focus its tone, vocabulary, and default behaviors. Even a single sentence shifts behavior measurably — but role prompting does not unlock new capabilities.

When to use: You want a consistent voice or domain framing across many turns without restating it each message.

system: You are a senior SRE running a calm, blameless post-mortem.
        Prefer concrete timelines and falsifiable claims.
user:   Here is the incident transcript: ...

Anthropic — Prompting best practices ↗

Chain-of-Density

Adams et al. · 2023

Iteratively rewrites a summary at fixed length, each pass adding 1-3 previously missing salient entities without growing the word count, so the final summary is denser and more abstractive.

When to use: Producing a short, information-dense summary of a long article when entity coverage matters more than narrative flow.

Summarize the article in 5 sentences. Then repeat 4 more times.
Each iteration: keep length identical, add 1-3 missing salient
entities, remove filler. Output all 5 versions as a JSON list.
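Because the prompt asks for all five versions as a JSON list, the fixed-length constraint can be checked mechanically before the final summary is used. A minimal sketch, assuming each list item is a plain string; the 15% word-count tolerance and the function name are arbitrary illustrative choices.

```python
import json


def check_density_passes(raw: str, passes: int = 5, tolerance: float = 0.15) -> str:
    summaries = json.loads(raw)
    if not isinstance(summaries, list) or len(summaries) != passes:
        raise ValueError(f"expected a JSON list of {passes} summaries")
    base = len(summaries[0].split())
    for number, summary in enumerate(summaries[1:], start=2):
        words = len(summary.split())
        # Each rewrite should stay close to the first version's length.
        if abs(words - base) > tolerance * base:
            raise ValueError(f"pass {number} has {words} words, expected about {base}")
    return summaries[-1]  # the last pass is the densest version


raw = json.dumps([
    "The city council met on Tuesday to discuss several items of business.",
    "The city council met Tuesday to discuss the transit budget and zoning.",
    "Council met Tuesday on the transit budget, zoning, and Mayor Lee's veto.",
    "Council met Tuesday: transit budget, zoning, Lee's veto, and a $2M shortfall.",
    "Council Tuesday: transit budget, zoning, Lee's veto, $2M shortfall, June vote.",
])
print(check_density_passes(raw))
```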

From Sparse to Dense: Chain of Density ↗

Structured output · JSON schema mode

OpenAI · 2024

Constrains decoding so the response is guaranteed to be valid JSON matching a supplied JSON Schema. Enforces adherence at the decoding layer — not by prompt instruction.

When to use: Any downstream code path that needs to JSON.parse the response or hand it to a typed function. Load-bearing for multi-agent systems.

response_format = { "type": "json_schema",
  "json_schema": { "name": "Ticket", "strict": true, ... } }
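For illustration, a complete `response_format` value might look like the sketch below. The `title` and `priority` fields are invented for this example and do not reconstruct the schema elided above. As far as the OpenAI documentation describes strict mode, every property must be listed in `required` and `additionalProperties` must be false. The local `parse_ticket` check is a hypothetical belt-and-braces step, not part of the API.

```python
import json

# Hypothetical Ticket schema, invented for this example.
ticket_schema = {
    "type": "object",
    "properties": {
        "title": {"type": "string"},
        "priority": {"type": "string", "enum": ["low", "medium", "high"]},
    },
    "required": ["title", "priority"],
    "additionalProperties": False,
}

response_format = {
    "type": "json_schema",
    "json_schema": {"name": "Ticket", "strict": True, "schema": ticket_schema},
}


def parse_ticket(raw: str) -> dict:
    # Parse the reply and confirm it has exactly the declared keys.
    ticket = json.loads(raw)
    missing = [key for key in ticket_schema["required"] if key not in ticket]
    extra = [key for key in ticket if key not in ticket_schema["properties"]]
    if missing or extra:
        raise ValueError(f"missing={missing} extra={extra}")
    return ticket


ticket = parse_ticket('{"title": "Login fails on Safari", "priority": "high"}')
print(ticket["priority"])
```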

OpenAI — Structured Outputs in the API ↗

Prompt chaining

Anthropic docs

Split a task into a sequence of separate model calls, where each call's output feeds the next. The most common pattern is self-correction — draft → critique → revise.

When to use: The pipeline has distinct stages with different instructions, you want to evaluate or gate between them, or a single prompt is hitting context/quality limits.

call_1: extract claims from {doc} -> claims.json
call_2: for each claim, verify against {source} -> verdicts.json
call_3: write report using verdicts.json
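The three calls can be wired together as ordinary functions with a gate between stages. In the sketch below each stage is a toy stand-in (sentence splitting, substring matching, string formatting) where a real pipeline would make a separate model call; all names are illustrative assumptions.

```python
from typing import Callable, Dict, List


def run_chain(
    doc: str,
    source: str,
    extract: Callable[[str], List[str]],
    verify: Callable[[str, str], bool],
    report: Callable[[List[Dict]], str],
) -> str:
    claims = extract(doc)  # call_1
    if not claims:
        # Gate between stages: stop early instead of verifying nothing.
        raise ValueError("no claims extracted")
    verdicts = [{"claim": c, "supported": verify(c, source)} for c in claims]  # call_2
    return report(verdicts)  # call_3


def extract(doc: str) -> List[str]:
    return [s.strip() for s in doc.split(".") if s.strip()]


def verify(claim: str, source: str) -> bool:
    return claim.lower() in source.lower()


def report(verdicts: List[Dict]) -> str:
    return "\n".join(
        f"[{'ok' if v['supported'] else 'unsupported'}] {v['claim']}" for v in verdicts
    )


result = run_chain(
    "The sky is blue. Water is dry.",
    "Reference: the sky is blue; water is wet.",
    extract, verify, report,
)
print(result)
```

Because each stage is its own function, any one of them can be evaluated, logged, or swapped without touching the others.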

Anthropic — Chain complex prompts ↗

§3-§8 · pro

Orchestration · prompts-as-code · escalation · context pipeline · agent tools · sources

Seven orchestration patterns (orchestrator/worker, sequential, parallel, debate, evaluator-optimizer, hierarchical, reflection), provider-specific system-prompt conventions across Anthropic / OpenAI / Gemini SDKs, the single→multi-agent escalation checklist with Anthropic's ~15× cost data, how to budget the context window like RAM (the model behind the Context tool), how to write function-calling tools the model will actually use (the model behind the Tool Builder), and curated primary-source references.

Sections §3-§8 are part of the Pro plan ($9/month).

Free composer is open-source at github.com/deokman420/prompt-composer ↗.
