
AI Reasoning Techniques Explained: Chain-of-Thought, ReAct, and Tree of Thoughts

How did AI learn to 'think'? From Chain-of-Thought to Tree of Thoughts, discover the techniques that boost AI reasoning by 5x, and how OpenClaw uses them.

📝 Created: February 27, 2026 ✅ Last verified: February 27, 2026

AI Can “Think Wrong” Too

Have you ever tried asking AI to solve a math problem?

You: Roger has 5 tennis balls. He buys 2 more cans, each containing 3 balls.
     How many tennis balls does he have now?
AI:  11

The answer is correct… but try a slightly more complex problem:

You: A classroom has 23 students. 5 students leave,
     then 3 new groups arrive, with 4 students in each group. How many students are there now?
AI:  26

Wrong. The answer should be 23 - 5 + (3×4) = 30 students.

When AI jumps straight to the answer, it tends to “skip the thinking” — just like a student who skips showing their work on an exam and writes an answer by gut feeling, often getting it wrong.

The solution: Force AI to show its work.


Chain-of-Thought: Making AI Show Its Reasoning

The Core Concept

In 2022, Google’s research team discovered that if you ask AI to “think step by step” in the prompt, its reasoning accuracy can improve by 3-5x.

This technique is called Chain-of-Thought (CoT).

Side-by-Side Comparison

❌ Standard prompt:

A classroom has 23 students. 5 leave,
then 3 new groups of 4 students each arrive. How many students are there now?

AI answers: 26 (Wrong)

✅ CoT prompt:

A classroom has 23 students. 5 leave,
then 3 new groups of 4 students each arrive. How many students are there now?
Let's work through this step by step.

AI answers:

Let me calculate step by step:
1. Originally 23 students
2. 5 leave → 23 - 5 = 18 students
3. 3 groups arrive, 4 per group → 3 × 4 = 12 students
4. 18 + 12 = 30 students
The answer is 30 students.

✅ Correct.

Why Does It Work?

Duck Editor Analogy: You ask a student to solve a math problem. They calculate in their head and blurt out “26.” You ask them to write down their work, and at step two they catch their own mistake. The written process itself is a self-correction mechanism.

Technically, CoT works because:

  • Breaks down complex problems: One big problem becomes multiple small ones, each step is a simple problem
  • Intermediate steps are checkable: Errors can be caught at a specific step
  • Leverages the model’s sequential generation: Each generated token influences the next token

Three Ways to Use CoT

1. Zero-shot CoT: The Simplest Approach

No examples needed — just add one sentence:

[Your question]
Let's think step by step.

That’s it. Adding this one sentence boosted accuracy on a math benchmark (MultiArith) from 17.7% to 78.7% — that figure comes from the 2022 zero-shot CoT paper, not made up.
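In code, Zero-shot CoT is just string concatenation. A minimal sketch — `add_zero_shot_cot` is a name made up here, and you would pass the resulting prompt to whatever LLM client you use:

```python
def add_zero_shot_cot(question: str) -> str:
    """Append the zero-shot CoT trigger phrase to any question."""
    return f"{question}\nLet's think step by step."

prompt = add_zero_shot_cot(
    "A classroom has 23 students. 5 leave, then 3 new groups of "
    "4 students each arrive. How many students are there now?"
)
# The prompt now ends with the trigger phrase; send it to your LLM client.
```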

2. Few-shot CoT: With Examples

Give AI 2-3 examples that include the reasoning process, so it learns the format:

Question: Mike has 3 apples. Jane gives him 5 more, and he eats 2. How many are left?
Reasoning:
- Started with 3
- Plus 5 → 3 + 5 = 8
- Ate 2 → 8 - 2 = 6
Answer: 6

Question: [Your new question]
Reasoning:

More stable than Zero-shot, especially when you know the correct reasoning approach.
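Assembling a few-shot prompt is also straightforward. A sketch with one hypothetical worked example (in practice you’d use 2-3 from your own domain); `build_few_shot_prompt` is an illustrative name, not a library function:

```python
# Hypothetical worked example: (question, reasoning, answer).
EXAMPLES = [
    (
        "Mike has 3 apples. Jane gives him 5 more, and he eats 2. How many are left?",
        "- Started with 3\n- Plus 5 -> 3 + 5 = 8\n- Ate 2 -> 8 - 2 = 6",
        "6",
    ),
]

def build_few_shot_prompt(question: str) -> str:
    """Prepend worked examples so the model imitates the reasoning format."""
    blocks = [
        f"Question: {q}\nReasoning:\n{r}\nAnswer: {a}" for q, r, a in EXAMPLES
    ]
    # End with the new question and a dangling "Reasoning:" for the model to continue.
    blocks.append(f"Question: {question}\nReasoning:")
    return "\n\n".join(blocks)

prompt = build_few_shot_prompt("What is 23 - 5 + 3 * 4?")
```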

3. Self-Consistency: Majority Vote

Ask the same question and have AI answer 5 times using CoT, then see which answer appears most often — majority rules.

Answer 1: 300 (reasoning path A)
Answer 2: 360 (reasoning path B)
Answer 3: 360 (reasoning path C)
Answer 4: 360 (reasoning path D)
Answer 5: 300 (reasoning path E)

Vote result: 360 (3/5 votes) → Use 360

Duck Editor Analogy: When you’re unsure of the answer, ask 5 people and go with the majority. It’s not guaranteed to be right, but chances are better than asking just 1 person.
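The voting step itself is a few lines of code. A sketch assuming an `ask` callable that returns one sampled answer per call (a real one would hit your LLM with temperature > 0; here it is stubbed with canned answers matching the example above):

```python
from collections import Counter
from itertools import cycle

def self_consistency(ask, question, n=5):
    """Sample n CoT answers to the same question and take the majority vote."""
    answers = [ask(question) for _ in range(n)]
    winner, votes = Counter(answers).most_common(1)[0]
    return winner, votes

# Stand-in for a sampled LLM call; each call returns the next canned answer.
canned = cycle(["300", "360", "360", "360", "300"])
answer, votes = self_consistency(lambda q: next(canned), "How many units?")
# answer == "360" with 3 of 5 votes
```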


ReAct: Making AI Think and Act Simultaneously

CoT’s Limitation

CoT teaches AI to reason, but it can only think “in its head” — when it needs to look up data or use tools, it falls short.

“Will it rain in Taipei tomorrow?” AI can reason with CoT, but without weather data, it can only guess.

ReAct = Reasoning + Acting

The 2022 ReAct paper merged reasoning and acting:

Thought → Action → Observation → Thought...

Full workflow:

User: "Will it rain in Taipei tomorrow? Help me decide whether to bring an umbrella."

[Thought] User is asking about Taipei weather. I need to check the forecast for an accurate answer.
[Action]  Call weather.get_forecast(city="Taipei", day="tomorrow")
[Observation] API returns: 80% chance of rain, 18-22°C, northeast wind
[Thought] 80% chance of rain is high — I should recommend an umbrella.
          Temperature is on the cool side, I can suggest a jacket too.
[Answer]  Tomorrow in Taipei there's an 80% chance of rain — bring an umbrella!
          Temperature 18-22°C with northeast winds, so a light jacket would help too.

This Is the Core of OpenClaw’s Agent

OpenClaw’s Agent mode is an implementation of the ReAct loop:

  1. Perceive (receive your instruction)
  2. Think (analyze which Skills are needed)
  3. Act (call a Skill / tool)
  4. Observe (receive the tool’s result)
  5. Back to Think (determine if the task is complete)

You: "Organize my to-do list for today and send it to Telegram"

Agent thinks: Needs two steps — get to-dos + send message
Agent acts: Calls Todoist Skill → Gets today's to-dos
Agent observes: Received 5 to-do items
Agent thinks: Format and send to Telegram
Agent acts: Calls Telegram Skill → Sends formatted list
Agent observes: Sent successfully
Agent replies: "I've sent your 5 to-do items to your Telegram!"

See Agent Complete Guide for details.
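The Thought → Action → Observation loop above can be sketched in a few lines. This is a toy version, not OpenClaw’s actual implementation: `think` stands in for an LLM call, and the tool names are invented to mirror the to-do example:

```python
def react_loop(think, tools, instruction, max_steps=10):
    """Thought -> Action -> Observation loop until `think` returns an answer."""
    observation = instruction
    for _ in range(max_steps):
        decision = think(observation)           # Thought
        if decision[0] == "answer":
            return decision[1]                  # final reply to the user
        _, tool_name, arg = decision            # Action
        observation = tools[tool_name](arg)     # Observation feeds next Thought
    return "Stopped: step budget exhausted."

# Invented stand-in tools mirroring the Todoist/Telegram flow.
tools = {
    "todoist.get_today": lambda _: ["buy milk", "write report"],
    "telegram.send": lambda items: f"sent {len(items)} items",
}

def scripted_think(obs):
    # A real agent would call the LLM here; this is a fixed script.
    if obs == "send my to-dos":
        return ("act", "todoist.get_today", None)
    if isinstance(obs, list):
        return ("act", "telegram.send", obs)
    return ("answer", f"Done: {obs}")

result = react_loop(scripted_think, tools, "send my to-dos")
```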


Advanced: Tree of Thoughts

From Chain to Tree

CoT is a linear thinking path — from A → B → C → answer.

But some problems don’t have a clear next step and require exploring multiple paths:

Chain-of-Thought (chain):
A → B → C → answer

Tree of Thoughts (tree):
         A
       / | \
      B  C  D
     /|    / \
    E  F  G   H
    ✅     ✅
  (Found two viable paths, pick the best one)
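The tree search can be sketched as a breadth-first loop with pruning. In this toy version, `expand` (propose next thoughts) and `score` (rate a partial path) would both be LLM calls in a real system; here they are trivial stand-ins:

```python
def tree_of_thoughts(root, expand, score, beam=2, depth=2):
    """Breadth-first search over thoughts, keeping only the top `beam` paths."""
    frontier = [root]
    for _ in range(depth):
        candidates = [child for path in frontier for child in expand(path)]
        candidates.sort(key=score, reverse=True)
        frontier = candidates[:beam]            # prune to most promising paths
    return max(frontier, key=score)

# Toy problem: grow the string containing the most 'a' characters.
best = tree_of_thoughts(
    root="",
    expand=lambda s: [s + c for c in "abc"],
    score=lambda s: s.count("a"),
)
```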

When Do You Need Tree of Thoughts?

| Problem Type | CoT Is Enough | Needs Tree of Thoughts |
|---|---|---|
| Math calculation | ✅ | |
| Translation | ✅ | |
| Creative writing | | ✅ (need to try different styles) |
| Strategic planning | | ✅ (need to compare multiple plans) |
| Debugging code | | ✅ (need to hypothesize multiple causes) |

Duck Editor: OpenClaw’s swarm intelligence mode is like an automated Tree of Thoughts — multiple Agents each explore different paths, then vote on the best result. See Multi-Agent Collaboration for details.


Reasoning Technique Comparison Table

| Technique | Core Concept | Best For | OpenClaw Application |
|---|---|---|---|
| Zero-shot CoT | Add “think step by step” | Everyday questions | Prompts in Skills |
| Few-shot CoT | Provide 2-3 examples | Specific domains | SOUL.md examples |
| Self-Consistency | Multiple answers, majority vote | Important decisions | Multiple runs + comparison |
| ReAct | Reasoning + action loop | Tasks requiring tools | Agent core loop |
| Tree of Thoughts | Explore multiple paths | Creative / strategy | Swarm intelligence |

Practical Recommendations: When to Use What

Daily Use (80% of Cases)

Just add “Let’s think step by step” at the end of your prompt. That’s usually enough. Don’t overthink it.

Complex Tasks

Let OpenClaw’s Agent mode handle it automatically — it has a built-in ReAct loop, so you don’t have to manage it manually.

Important Decisions

Design workflows in your Skills that run multiple times and compare results — that’s the Self-Consistency approach in action.

