About OpenClaw ✅ Beginner

Token Economics: Understanding How AI Bills You (So Your Invoice Doesn't Explode)

What are tokens? How do different models charge? Learn to precisely control your AI usage costs and get the most value for your money.

📝 Created: February 24, 2026 ✅ Last verified: February 24, 2026

Why Do You Need to Understand Tokens?

You’ve already applied for an API Key and started using OpenClaw. Then one day, you receive a bill:

“This month’s API usage fee: $47.82”

Wait, I only used it for a few days! How is it already almost $50?

The problem is: you didn’t know what tokens are, and you had no idea how much each API call was costing you.


What Is a Token?

A token is AI’s “unit of measurement,” like kilowatt-hours for electricity.

But tokens ≠ words. AI breaks text into small pieces (called tokens), and each piece might be a word, half a word, or even a punctuation mark.

English vs Chinese

English: "Hello world" → 2 tokens (each word = 1 token)
Chinese: "你好嗎?"    → 4-6 tokens (each character may be 1-2 tokens)

⚠️ Important: The same content typically costs about 1.5-2x as much in Chinese as in English, because Chinese text breaks into more tokens.

Duck Editor: In other words, tokens are like a taxi meter. You’re not charged per trip; the meter ticks with every small increment. The more you say and the longer the AI responds, the more ticks on the meter.

How Tokenization Actually Works

Using OpenAI’s tokenizer as an example:

| Text | Token Count | Explanation |
|---|---|---|
| Hello | 1 | Common English word = 1 token |
| 你好 | 2-3 | Each Chinese character ≈ 1-2 tokens |
| OpenClaw | 2-3 | Uncommon compound words get split |
| 2024年2月24日 | 5-7 | Mixed numbers + Chinese |

Duck Editor: Try it yourself. Go to the OpenAI Tokenizer, paste in your text, and see the actual token count.
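For a quick back-of-the-envelope estimate before you reach for a real tokenizer, the rules of thumb above can be turned into a few lines of Python. This is only a rough sketch: the function name `estimate_tokens`, the 1.5 tokens per Chinese character (the midpoint of the 1-2 range above), and the roughly 4 characters per token for other text are all assumptions, and a real tokenizer will give different counts.

```python
def is_cjk(ch: str) -> bool:
    """True for characters in the main CJK Unified Ideographs block."""
    return "\u4e00" <= ch <= "\u9fff"


def estimate_tokens(text: str) -> int:
    """Very rough token estimate; not a substitute for a real tokenizer."""
    cjk_chars = sum(1 for ch in text if is_cjk(ch))
    other_chars = sum(1 for ch in text if not ch.isspace() and not is_cjk(ch))
    if cjk_chars + other_chars == 0:
        return 0
    # Assumption: ~1.5 tokens per Chinese character, ~4 other characters per token.
    estimate = cjk_chars * 1.5 + other_chars / 4
    return max(1, round(estimate))


for sample in ["Hello world", "你好嗎?"]:
    print(sample, "->", estimate_tokens(sample), "tokens (estimated)")
```

Treat the result as a sanity check for budgeting, and use the tokenizer tool for exact counts.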


Input Tokens vs Output Tokens

Every AI call has two cost components:

Your question    → Input Tokens
AI's response    → Output Tokens

Output tokens are typically several times more expensive than input tokens: roughly 4-5x for the models in the price tables below.

Duck Editor says: Why are outputs more expensive than inputs? Imagine going to a restaurant: ordering a dish (input) only takes a few words, but the chef has to buy ingredients, prep, cook, and plate it (output). Of course the chef’s side costs more.

Cost Structure

Total Cost = (Input Tokens × Input Price) + (Output Tokens × Output Price)
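The formula above translates directly into a small helper. This is a minimal sketch: the function name `call_cost` is made up for illustration, and it assumes prices are quoted in USD per million tokens, as in the pricing tables in this article. The example call uses the GPT-4o mini prices listed there.

```python
def call_cost(input_tokens: int, output_tokens: int,
              input_price_per_m: float, output_price_per_m: float) -> float:
    """Return the cost in USD of one call; prices are per million tokens."""
    return (input_tokens * input_price_per_m
            + output_tokens * output_price_per_m) / 1_000_000


# 2,000 input tokens and 1,500 output tokens at $0.15 / $0.60 per million
cost = call_cost(2_000, 1_500, 0.15, 0.60)
print(f"${cost:.4f}")  # prints $0.0012
```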

Pricing by Model

Major model prices as of early 2026 (per million tokens):

OpenAI

| Model | Input Price | Output Price | Best For |
|---|---|---|---|
| GPT-4o | $2.50 | $10.00 | Daily conversations, medium-complexity tasks |
| GPT-4o mini | $0.15 | $0.60 | Simple tasks, high-volume calls |

Google

| Model | Input Price | Output Price | Best For |
|---|---|---|---|
| Gemini 2.0 Flash | Free (with quota) | Free (with quota) | Best for beginners! |
| Gemini 1.5 Pro | $1.25 | $5.00 | Long documents, multimodal |

Anthropic Claude

| Model | Input Price | Output Price | Best For |
|---|---|---|---|
| Claude 3.7 Sonnet | $3.00 | $15.00 | Writing, analysis, coding |
| Claude 3.5 Haiku | $0.80 | $4.00 | Fast responses, low cost |
| Claude 3 Opus | $15.00 | $75.00 | Highest quality, complex reasoning |

Duck Editor’s beginner recommendation: Start with Gemini 2.0 Flash (free quota). Once you’re comfortable, consider paid models. That’s also why we first teach you to apply for a Google API Key.


Real-World Cost Estimation

Scenario 1: Using OpenClaw Daily to Summarize News

Suppose you have your Agent run a “news summary” once per day:

  • Search + input: ~2,000 tokens
  • Summary + output: ~1,500 tokens
  • Model: GPT-5 Nano
Daily cost = (2,000 × $0.05 / 1M) + (1,500 × $0.40 / 1M)
           = $0.0001 + $0.0006
           = $0.0007

Monthly cost = $0.0007 × 30 = $0.021 (less than 1 TWD)

Less than 1 TWD per month. Using Nano-tier models for simple tasks is incredibly cost-effective.

Scenario 2: Writing 5 Long Articles Per Day

Suppose each article requires substantial context:

  • Input: ~10,000 tokens × 5
  • Output: ~3,000 tokens × 5
  • Model: Claude Sonnet 4.5
Daily cost = (50,000 × $3.00 / 1M) + (15,000 × $15.00 / 1M)
           = $0.15 + $0.225
           = $0.375

Monthly cost = $0.375 × 30 = $11.25 (about 370 TWD)

Upgrading to Claude Opus 4.6 would cost ~$45/month. Downgrading to Gemini 2.5 Flash would bring it down to ~$1.5/month.
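To see how much the model choice matters for this workload, the same arithmetic can be run across several models at once. The sketch below uses only prices quoted elsewhere in this article (Claude Sonnet 4.5 from the calculation above, GPT-5 Nano and DeepSeek V3 from the FAQ); the function name `monthly_cost` and the 30-day month are assumptions for illustration.

```python
# Prices in USD per million tokens (input, output), as quoted in this article.
PRICES = {
    "Claude Sonnet 4.5": (3.00, 15.00),
    "DeepSeek V3": (0.27, 1.10),
    "GPT-5 Nano": (0.05, 0.40),
}


def monthly_cost(input_per_day: int, output_per_day: int,
                 prices: tuple, days: int = 30) -> float:
    """Monthly cost in USD for a fixed daily token volume."""
    input_price, output_price = prices
    daily = (input_per_day * input_price
             + output_per_day * output_price) / 1_000_000
    return daily * days


# Scenario 2: 5 articles/day -> 50,000 input and 15,000 output tokens per day
for model, prices in PRICES.items():
    print(f"{model}: ${monthly_cost(50_000, 15_000, prices):.2f}/month")
```

Swapping in your own daily token counts shows quickly whether a cheaper tier is worth testing for a given task.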

Scenario 3: Why Did My Bill Explode?

Common culprits:

| Cause | Solution |
|---|---|
| Conversations too long, sending full history every time | Limit conversation memory length |
| Using flagship models for simple tasks | Downgrade to Flash / Nano |
| A Skill bug causing infinite loops | Set token limits |
| Forgetting System Prompt also counts as tokens | Streamline System Prompt |
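The first culprit is the least intuitive: when the full history is resent on every call, total input tokens grow roughly with the square of the conversation length. The sketch below illustrates this under simplifying assumptions: every turn adds the same number of tokens, and `total_input_tokens` and `max_history_turns` are hypothetical names, not OpenClaw settings.

```python
from typing import Optional


def total_input_tokens(turns: int, tokens_per_turn: int,
                       max_history_turns: Optional[int] = None) -> int:
    """Sum the input tokens billed across a whole conversation.

    Each call sends the new message plus the earlier turns that are kept.
    """
    total = 0
    for turn in range(1, turns + 1):
        history = turn - 1  # earlier turns available to resend
        if max_history_turns is not None:
            history = min(history, max_history_turns)
        total += (history + 1) * tokens_per_turn
    return total


full = total_input_tokens(50, 200)                         # resend everything
capped = total_input_tokens(50, 200, max_history_turns=5)  # keep last 5 turns
print(f"full history: {full:,} tokens, capped: {capped:,} tokens")
# prints: full history: 255,000 tokens, capped: 57,000 tokens
```

In this example, capping memory at 5 turns cuts input tokens by more than three quarters over a 50-turn conversation.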

Money-Saving Tips

1. Choose the Right Model (Most Important)

80% of tasks work fine with the cheapest model.

Simple Q&A, classification, summaries  → GPT-5 Nano / DeepSeek V3
Writing articles, analysis reports     → Gemini 2.5 Flash / Claude Haiku 4.5
Code, complex reasoning                → Claude Sonnet 4.5 / GPT-4.1
Very long documents (>200K tokens)     → Gemini 2.5 Pro / Gemini 3.1 Pro
Best possible quality                  → Claude Opus 4.6

In OpenClaw, you can assign different models to different Skills.
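The idea behind the mapping above can be sketched as a simple lookup with a cheap default. This is not OpenClaw's configuration syntax: the task category names, `MODEL_BY_TASK`, and `pick_model` are hypothetical, and each tier lists just one of the models suggested above.

```python
# Hypothetical task categories mapped to one suggested model per tier.
MODEL_BY_TASK = {
    "simple": "GPT-5 Nano",
    "writing": "Gemini 2.5 Flash",
    "coding": "Claude Sonnet 4.5",
    "long_document": "Gemini 2.5 Pro",
    "best_quality": "Claude Opus 4.6",
}
DEFAULT_MODEL = "GPT-5 Nano"  # unknown tasks fall back to the cheapest tier


def pick_model(task_type: str) -> str:
    """Return the model for a task type, defaulting to the cheapest one."""
    return MODEL_BY_TASK.get(task_type, DEFAULT_MODEL)


print(pick_model("coding"))      # prints Claude Sonnet 4.5
print(pick_model("small_talk"))  # prints GPT-5 Nano
```

Defaulting to the cheapest model means a forgotten or misspelled category costs you quality on one task rather than money on every call.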

2. Streamline Your System Prompt

The System Prompt is sent with every call — it’s a “hidden cost.”

❌ 500-token System Prompt × 100 calls/day = 50,000 tokens/day
✅ 100-token System Prompt × 100 calls/day = 10,000 tokens/day

That’s an 80% reduction! Check the Soul Configuration Guide to learn how to write concise, effective personas.
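To put a dollar figure on that hidden cost, multiply the prompt size by the call volume and the input price. A minimal sketch, assuming a 30-day month and the $3.00 per million input price used for Claude Sonnet in this article; `monthly_prompt_cost` is an illustrative name.

```python
def monthly_prompt_cost(prompt_tokens: int, calls_per_day: int,
                        input_price_per_m: float, days: int = 30) -> float:
    """USD spent per month just on resending the System Prompt."""
    return prompt_tokens * calls_per_day * days * input_price_per_m / 1_000_000


before = monthly_prompt_cost(500, 100, 3.00)
after = monthly_prompt_cost(100, 100, 3.00)
print(f"${before:.2f} -> ${after:.2f} per month")  # prints $4.50 -> $0.90 per month
```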

3. Use Caching

Both OpenAI and Anthropic support Prompt Caching:

  • Repeated prefix portions are billed at a discounted rate on later calls instead of full price
  • Can save 50-80% on input token costs
  • OpenClaw has built-in support — no extra configuration needed
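A rough way to estimate what caching might save is to model the shared prefix as discounted on every call after the first. The sketch below is an approximation under stated assumptions: `cached_discount` is a parameter you supply (check your provider's pricing page for the real rate), every call after the first is treated as a cache hit, and cache-write surcharges and cache expiry are ignored.

```python
def input_cost(prefix_tokens: int, fresh_tokens: int, calls: int,
               price_per_m: float, cached_discount: float = 0.0) -> float:
    """Input cost in USD for `calls` requests that share the same prefix.

    cached_discount is the fraction taken off the prefix price on a cache
    hit (0.0 means no caching).
    """
    if calls <= 0:
        return 0.0
    first_call = prefix_tokens + fresh_tokens  # cache miss: full price
    later_call = prefix_tokens * (1 - cached_discount) + fresh_tokens
    billable = first_call + later_call * (calls - 1)
    return billable * price_per_m / 1_000_000


# 2,000-token shared prefix, 200 new tokens per call, 100 calls, $3.00 per million
without_cache = input_cost(2_000, 200, 100, 3.00)
with_cache = input_cost(2_000, 200, 100, 3.00, cached_discount=0.5)
print(f"${without_cache:.3f} -> ${with_cache:.3f}")  # prints $0.660 -> $0.363
```

The larger the shared prefix is relative to the new text in each call, the closer the savings get to the discount rate itself.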

4. Limit Output Length

Add max_tokens limits in your Skills:

# In Skill configuration
config:
  max_tokens: 500  # Generate at most 500 tokens in the response

5. Set Budget Alerts

In the OpenAI Dashboard:

  1. Settings → Billing → Usage limits
  2. Set a Hard limit and a Soft limit (alert threshold)

Recommended settings:

  • Soft limit: $5 (alerts you)
  • Hard limit: $20 (forces a stop)
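The same soft/hard pattern can also be mirrored in your own code as a second line of defense. This is a hypothetical sketch, not a feature of the OpenAI Dashboard or OpenClaw: `BudgetTracker` and `BudgetExceeded` are made-up names, and the defaults reuse the $5 and $20 limits recommended above.

```python
class BudgetExceeded(Exception):
    """Raised when a call would push spending past the hard limit."""


class BudgetTracker:
    def __init__(self, soft_limit: float = 5.00, hard_limit: float = 20.00):
        self.soft_limit = soft_limit
        self.hard_limit = hard_limit
        self.spent = 0.0

    def record(self, cost: float) -> bool:
        """Add one call's cost; return True once the soft limit is reached."""
        if self.spent + cost > self.hard_limit:
            raise BudgetExceeded(
                f"${self.spent + cost:.2f} would exceed ${self.hard_limit:.2f}")
        self.spent += cost
        return self.spent >= self.soft_limit


tracker = BudgetTracker()
if tracker.record(5.25):
    print(f"Warning: ${tracker.spent:.2f} spent, soft limit reached")
```

Calling `record` before each request (with an estimated cost) lets a runaway loop fail fast instead of quietly draining the balance.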

How OpenClaw Saves You Money

OpenClaw has several built-in cost-saving mechanisms:

  1. Model routing: Automatically selects models based on task complexity (configured in Model Settings)
  2. Agent memory compression: Doesn’t send the entire history — only retains the important parts
  3. Skill caching: Identical inputs won’t trigger duplicate calls
  4. Token budgets: Each Skill can have its own limit
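To make the third mechanism concrete, here is one way caching of identical inputs could work in principle. It is an illustrative sketch only, not OpenClaw's actual implementation: `CachedSkill`, `call_model`, and `fake_model` are hypothetical names, and the stand-in model function just echoes its input.

```python
import hashlib
from typing import Callable, Dict


class CachedSkill:
    """Wraps a model-calling function so identical prompts are paid for once."""

    def __init__(self, call_model: Callable[[str], str]):
        self._call_model = call_model
        self._cache: Dict[str, str] = {}
        self.api_calls = 0  # how many times the (billed) model was really called

    def run(self, prompt: str) -> str:
        key = hashlib.sha256(prompt.encode("utf-8")).hexdigest()
        if key not in self._cache:
            self.api_calls += 1
            self._cache[key] = self._call_model(prompt)
        return self._cache[key]


def fake_model(prompt: str) -> str:
    """Stand-in for a real API call."""
    return f"summary of: {prompt}"


skill = CachedSkill(fake_model)
skill.run("today's headlines")
skill.run("today's headlines")  # served from the cache, no second call
print(skill.api_calls)  # prints 1
```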

FAQ

What if my free credits run out?

  • Google Gemini Flash has a daily free quota (usually enough for personal use)
  • New OpenAI accounts get $5 in free credits (good for trying things out)
  • After that, you’ll need to top up — we recommend starting with $10

My monthly budget is only about $3 USD

Totally doable! With GPT-5 Nano ($0.05/$0.40) or DeepSeek V3 ($0.27/$1.10), typical usage won’t even reach $1/month. If you use OpenRouter’s free models (200 requests/day), it costs almost nothing.

Do tokens expire?

Tokens aren’t prepaid credits — you pay for what you use. Your API balance doesn’t expire (unless it’s a time-limited promotion).


Next Steps

Now that you understand tokens, you can:
