Prompt Engineering: The Complete Guide (Tree of Thoughts, Chain of Thought, and Beyond)
Prompt engineering is the practice of designing inputs that reliably steer LLMs toward useful outputs. The All About AI archive covers nearly every prompt engineering technique that mattered from GPT-3 to Claude Opus 4.7 — Chain of Thought, Tree of Thoughts, system prompts, reverse prompt engineering, the Ultimate Solver Prompt, and dozens of practical patterns. This guide is the index.
If you are new, start at the top and work down. If you are experienced, jump to the section matching your current problem.
What Is Prompt Engineering?
Prompt engineering is the discipline of writing instructions for LLMs that produce reliable, useful, repeatable outputs. It covers:
- System prompts — the instructions that set the model's role and behavior across a conversation
- Few-shot prompting — including example input/output pairs that teach the pattern by demonstration
- Reasoning techniques — Chain of Thought, Tree of Thoughts, Self-consistency, and others
- Decomposition — breaking complex tasks into smaller well-defined steps
- Reverse prompt engineering — extracting the prompt that produced a given output
The full beginner-friendly intro is in 5 best prompt engineering tips for beginners.
Core Reasoning Techniques
Chain of Thought (CoT)
The simplest reasoning technique: ask the model to "think step by step" before answering. CoT measurably improves accuracy on math, logic, and multi-step problems. Detailed walkthrough with code examples in ChatGPT prompt engineering: Chain of Thought.
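As a minimal sketch (the client setup and model name are placeholder assumptions, not from the linked post), the whole technique is one extra instruction appended to the task:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

question = "A train leaves at 14:05 and arrives at 16:47. How long is the trip?"

# The only change from a plain prompt is the explicit step-by-step instruction.
response = client.chat.completions.create(
    model="gpt-4o",  # placeholder model name
    messages=[{
        "role": "user",
        "content": f"{question}\n\nThink step by step, then give the final answer on its own line.",
    }],
)
print(response.choices[0].message.content)
```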
Tree of Thoughts (ToT)
An evolution of CoT that explores multiple reasoning paths in parallel, evaluates each, and selects the best. This is the technique that turned GPT-4 from "good" to "scary good" on hard reasoning tasks. Two posts cover it, and a compressed sketch of the search loop follows the list:
- ChatGPT-4: How to Use the Tree of Thoughts Method — the technique explained with examples
- The Tree of Thoughts Prompt Template — the actual template you can copy and use
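As a rough sketch of the control loop (the `llm` helper, breadth/depth values, and 1-10 scoring scale are placeholder assumptions; the copyable template lives in the second post):

```python
# Generate several candidate "thoughts" per step, score each partial path,
# keep the best few, repeat. llm() is a stand-in for any completion call.

def llm(prompt: str) -> str:
    raise NotImplementedError("wire this to your model of choice")

def tree_of_thoughts(problem: str, breadth: int = 3, depth: int = 3, keep: int = 2) -> str:
    frontier = [""]  # partial reasoning paths kept after pruning
    for _ in range(depth):
        candidates = []
        for path in frontier:
            for _ in range(breadth):
                step = llm(
                    f"Problem: {problem}\nReasoning so far:\n{path}\n"
                    "Propose the next reasoning step."
                )
                new_path = f"{path}\n{step}".strip()
                # Ask the model to judge the partial path on a 1-10 scale.
                score_text = llm(
                    f"Problem: {problem}\nPartial reasoning:\n{new_path}\n"
                    "Rate this path 1-10. Reply with the number only."
                )
                try:
                    score = float(score_text.strip())
                except ValueError:
                    score = 0.0
                candidates.append((score, new_path))
        # Prune: keep only the highest-scoring paths for the next round.
        frontier = [p for _, p in sorted(candidates, reverse=True)[:keep]]
    return llm(f"Problem: {problem}\nBest reasoning path:\n{frontier[0]}\nGive the final answer.")
```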
The Ultimate Problem Solver Prompt
A composite prompt I built that combines CoT, decomposition, and self-evaluation. It is genuinely the prompt I reach for first on hard problems. See the ultimate problem solver prompt and the broader collection at All About AI ultimate solver prompt.
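The exact prompt is in the linked post; as an illustrative sketch of the shape only (the wording below is not the original), a composite solver prompt layers the three pieces like this:

```python
# Decompose -> reason step by step -> self-evaluate -> answer.
# Wording is illustrative, not the original prompt from the post.
SOLVER_TEMPLATE = """You are a careful problem solver.

Problem: {problem}

1. Decompose: break the problem into the smallest independent sub-problems.
2. Solve: work through each sub-problem step by step (Chain of Thought).
3. Self-evaluate: list anything that could be wrong with your solution and fix it.
4. Final answer: state the answer clearly on its own line.
"""

prompt = SOLVER_TEMPLATE.format(problem="How many trailing zeros does 100! have?")
```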
The "Let's think about this" Prompt
A simple but high-leverage variation that often outperforms more complex CoT setups. Walkthrough here.
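Illustratively (the exact wording is in the walkthrough), the entire variation is the opening phrase:

```python
task = "Why might a REST API return 200 but an empty body?"

# The whole technique is the opening phrase; the walkthrough's exact
# wording may differ, so treat this as illustrative.
prompt = f"Let's think about this. {task}"
```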
System Prompts
System prompts are the most under-used tool in most people's prompt engineering toolkit. The full guide is ChatGPT / GPT-4 system prompt engineering — the ultimate guide. It covers role-setting, constraint specification, output format enforcement, and the persistent-context patterns that make agents reliable.
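A minimal sketch of the pattern using the OpenAI Python SDK (the model name and reviewer persona are illustrative assumptions):

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set

# The system message persists across the conversation: role, constraints,
# and output format all live here rather than in each user turn.
messages = [
    {
        "role": "system",
        "content": (
            "You are a senior code reviewer. Respond only in Markdown with two "
            "sections: 'Issues' (a numbered list) and 'Verdict' (one sentence). "
            "Never rewrite the user's code unless asked."
        ),
    },
    {"role": "user", "content": "Review: def add(a, b): return a - b"},
]

response = client.chat.completions.create(model="gpt-4o", messages=messages)  # placeholder model
print(response.choices[0].message.content)
```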
Few-Shot and Zero-Shot Prompting
The classic "show vs tell" tradeoff. Zero-shot is fastest but least reliable; few-shot is more verbose but dramatically better on edge cases. Detailed comparison with examples: prompt engineering tips: zero, one, and few-shot prompting.
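A minimal few-shot sketch (the reviews and labels are invented for illustration): the demonstrations go in-line before the real input and teach the pattern by example.

```python
# Three labeled demonstrations, then the real input left unlabeled.
few_shot_prompt = """Classify the sentiment of each review as positive, negative, or mixed.

Review: "Arrived fast, works perfectly." -> positive
Review: "Broke after two days." -> negative
Review: "Great screen, terrible battery." -> mixed

Review: "Does the job, but the manual is useless." ->"""
```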
Specialized Techniques
The "AI Critic" Prompt
A two-step pattern: have the model generate a draft, then a separate "critic" prompt evaluates and improves it. Full walkthrough here.
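A minimal sketch of the two-step loop (the `llm` helper and critic wording are assumptions, not the post's exact prompt):

```python
# Two calls: one to draft, one to critique and revise.
# llm() is a stand-in for whatever completion call you use.

def llm(prompt: str) -> str:
    raise NotImplementedError("wire this to your model of choice")

def draft_and_critique(task: str) -> str:
    draft = llm(task)
    critic_prompt = (
        f"Task: {task}\n\nDraft:\n{draft}\n\n"
        "Act as a strict critic. List the draft's three biggest weaknesses, "
        "then produce an improved version that fixes all of them."
    )
    return llm(critic_prompt)
```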
Reverse Prompt Engineering
Given an output, work backward to discover what prompt produced it. Useful for stealing patterns from competitor outputs. See master reverse prompt engineering with ChatGPT.
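A minimal sketch (the target text and wording are invented for illustration):

```python
# Reverse prompt engineering in one call: show the output, ask for the prompt.
target_output = "Unlock effortless mornings with our 3-minute cold brew kit."

prompt = (
    "Here is a piece of text produced by an LLM:\n\n"
    f"{target_output}\n\n"
    "Infer the prompt that most likely produced it. Reply with the "
    "reconstructed prompt only, then list the stylistic cues you used."
)
```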
The "Sequence Prompt"
Explicitly numbered, ordered instructions that map to a workflow. Often the difference between "Claude tries" and "Claude reliably ships." Sequence prompt walkthrough.
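A minimal sketch (the steps are illustrative, not from the linked walkthrough):

```python
# Explicitly ordered, numbered steps that mirror the workflow.
# Fill {report} with .format() before sending.
sequence_prompt = """Follow these steps in order. Do not skip or merge steps.

1. Read the bug report below and restate it in one sentence.
2. List the files most likely involved.
3. Propose a minimal fix as a unified diff.
4. Write one regression test for the fix.

Bug report: {report}"""
```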
The "Rate This" Prompt
A self-evaluation pattern where the model rates its own output, often surfacing flaws it would otherwise leave in. The Rate This prompt details.
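A minimal sketch (the criteria and the revision threshold are illustrative assumptions):

```python
# Self-evaluation appended to the task: the model scores its own output
# against named criteria before finalizing. Fill {product} before sending.
rate_this_prompt = """Write a 100-word product description for {product}.

Then rate your draft 1-10 on: accuracy, clarity, and persuasiveness.
If any score is below 8, revise the draft and rate again."""
```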
The Jug Problem Prompt
A canonical hard reasoning test for LLMs. Full walkthrough here.
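For reference, the canonical form of the puzzle looks like this (the volumes used in the post may differ):

```python
# The classic water-jug puzzle, phrased as an LLM test prompt.
jug_prompt = (
    "You have a 3-liter jug, a 5-liter jug, and unlimited water. "
    "Measure out exactly 4 liters. Think step by step and list each "
    "move as 'fill', 'empty', or 'pour A into B'."
)
```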
The "Make Strange Money" Prompt
A creative-thinking pattern that pushes the model toward unconventional ideas. Walkthrough.
Why Larger Context Windows Matter
Context window size changes what prompt patterns are even possible. Full analysis.
Practical Prompt Engineering for Agents
For autonomous agents (Claude Code, AutoGPT, custom systems), prompt engineering looks different: agent system prompts are denser, tool descriptions matter as much as the main prompt, and decomposition is critical. A minimal function-calling sketch follows the list:
- OpenAI function calling and AI agents
- Auto-GPT — how to use this mini AGI system
- AutoGPT and autonomous AI agents
- What are autonomous AI agents?
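A minimal function-calling sketch using the OpenAI Python SDK (the `get_weather` tool and model name are invented for illustration):

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set

# The tool description is part of the prompt surface: the model decides
# whether and how to call the tool based on this text alone.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city. Use only when the user asks about weather.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name, e.g. 'Oslo'"},
            },
            "required": ["city"],
        },
    },
}]

response = client.chat.completions.create(
    model="gpt-4o",  # placeholder model name
    messages=[{"role": "user", "content": "What's the weather in Oslo?"}],
    tools=tools,
)
print(response.choices[0].message.tool_calls)  # the model's structured call, if any
```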
Generative Image Prompts
Prompt engineering applies to image models too — Midjourney, Stable Diffusion, DALL·E 3. The patterns are different (visual descriptors, style anchors, negative prompts) but the discipline is the same; a structured example follows the list:
- DALL·E 3 with Chain of Thought prompting
- DALL·E 3 prompts
- Midjourney prompt photographer
- Midjourney prompt generator
- Text-free images in Midjourney V4
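A sketch of one common structure for a Stable Diffusion-style prompt (all values invented for illustration):

```python
# Subject, visual descriptors, style anchor, and a negative prompt
# (the negative prompt applies to Stable Diffusion, not Midjourney's syntax).
image_prompt = {
    "prompt": (
        "portrait of an elderly fisherman, weathered skin, golden hour light, "
        "85mm lens, shallow depth of field, documentary photography style"
    ),
    "negative_prompt": "text, watermark, extra fingers, blurry",
}
```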
Career: Becoming a Prompt Engineer
Prompt engineering as a job category is fading — modern roles are "AI engineer" or "AI researcher" — but the skills are central. Two posts in the archive cover the career path.
Quick Reference: Which Technique When?
| Problem type | Best technique |
|---|---|
| Math, logic, multi-step | Chain of Thought |
| Hard reasoning with multiple paths | Tree of Thoughts |
| Pattern matching from examples | Few-shot prompting |
| Persistent role across long conversation | System prompt |
| Quality matters more than speed | Ultimate Solver + AI Critic |
| Self-evaluation needed | Rate This pattern |
| Stealing patterns from a target output | Reverse prompt engineering |
| Agent tool calling | Sequence prompt + clear tool descriptions |
Resources
- Claude Code guide — modern agentic prompting in the wild
- AI agents complete guide — agent-side prompting
- My GitHub — code samples
- All About AI YouTube channel
FAQ
When should you use Chain of Thought vs Tree of Thoughts?
Use Chain of Thought for any multi-step problem — it's cheap and effective. Switch to Tree of Thoughts only when the problem has multiple plausible solution paths and you need to compare them, since ToT uses dramatically more tokens.
What's the most common prompt engineering mistake?
Vague success criteria. Most prompts that fail aren't poorly worded — they don't specify what 'good output' looks like in concrete terms (length, format, style, what to avoid). Include explicit success rules in your prompt.
Do prompts that work in GPT-4 also work in Claude?
Often, but not always. Claude tends to follow constraints more literally and benefits from XML-style structured prompts, while GPT-4 is more forgiving of natural language. Test both rather than assuming portability.
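An illustrative contrast (the tag names are arbitrary conventions, not a required schema):

```python
# XML-tagged structure of the kind Claude tends to follow closely.
# Fill {text} with .format() before sending.
claude_style_prompt = """<task>Summarize the document in three bullet points.</task>
<document>{text}</document>
<rules>Use plain language. No preamble.</rules>"""
```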
How long should a system prompt be?
Whatever it takes to constrain behavior reliably — there is no fixed length rule. Claude and GPT-4 both handle 2-5 KB system prompts well. The cost is per-call latency and token usage, not quality degradation.
Should you include examples in every prompt?
Include examples whenever the task has a specific output format, an unusual style, or edge cases the model might miss. Skip examples for simple summarization or general Q&A where zero-shot performance is reliable.
What's the difference between a prompt and a system prompt?
A user prompt is your specific request for one turn; a system prompt persists across the entire conversation, setting the model's role, constraints, and behavior. System prompts have stronger steering power and are harder for users to override.