Claude on WholeTech network
Deep reasoning

Extended Thinking (shipping)

Give Claude an explicit thinking budget on hard problems. Better answers on reasoning, math, and code.

01 — What it is

Slow Claude down on hard things.

On hard problems (multi-step reasoning, tricky math, gnarly code), extended thinking gives Claude an explicit budget of thinking tokens to spend before it produces the user-visible answer. The thinking trace is returned alongside the answer so you can inspect or hide it. The trade-off: more latency and more tokens in exchange for better answers.

02 — When it pays off

The decision rule.

Worth turning on

Multi-step reasoning. Large refactors. Hard math. Ambiguous specs you want Claude to resolve carefully. Anything where a careless answer is worse than a slow one.

Skip it

Classification. Simple lookups. UI generation. Anything where Claude already nails the answer in a single pass.
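
One way to apply the rule per request is a small helper that maps a task label to extra request arguments. This is only a sketch: the task labels and the `thinking_params` name are hypothetical, chosen to mirror the two lists above, and the shape of the `thinking` argument follows the request shown in section 03.

```python
# Hypothetical task labels mirroring the "worth turning on" list above.
# Everything else (classification, simple lookups, UI generation, unknown
# labels) is treated as single-pass, which is an assumed default.
THINKING_TASKS = {
    "multi_step_reasoning",
    "large_refactor",
    "hard_math",
    "ambiguous_spec",
}


def thinking_params(task: str, budget_tokens: int = 2000) -> dict:
    """Return extra keyword arguments for client.messages.create()."""
    if task in THINKING_TASKS:
        return {"thinking": {"type": "enabled", "budget_tokens": budget_tokens}}
    return {}
```

The result can be splatted into the call, for example `client.messages.create(..., **thinking_params("hard_math"))`, so single-pass tasks pay no extra latency or tokens.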

03 — The shape

Pass a budget.

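import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
HARD_QUESTION = "Your hard prompt goes here."  # placeholder

msg = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=4096,  # total output cap; thinking tokens count against it
    thinking={"type": "enabled", "budget_tokens": 2000},
    messages=[{"role": "user", "content": HARD_QUESTION}],
)
# msg.content has both `thinking` blocks and `text` blocks.
# `thinking` is for inspection; show only `text` to end users.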

The model decides how much of the budget to actually use. A 2,000-token budget is an upper bound, not a guarantee of 2,000 thinking tokens.
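
Because the budget sits inside the overall output cap, it can help to sanity-check the two numbers before sending a request. The helper below is a sketch under two assumptions that should be verified against the current API reference: that the budget must be smaller than `max_tokens`, and that there is a minimum budget of 1,024 tokens. The name `clamp_budget` is hypothetical.

```python
MIN_BUDGET = 1024  # assumed minimum thinking budget; check the API reference


def clamp_budget(requested: int, max_tokens: int) -> int:
    """Fit a requested thinking budget between MIN_BUDGET and max_tokens - 1."""
    if max_tokens <= MIN_BUDGET:
        raise ValueError("max_tokens is too small to leave room for thinking")
    # Keep at least one token of headroom below max_tokens for the answer cap.
    return max(MIN_BUDGET, min(requested, max_tokens - 1))
```

In practice you would leave far more than one token of headroom, since the visible answer has to fit in whatever remains after thinking.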

04 — UX

Show or hide the trace.
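
A minimal sketch of the toggle, assuming each content block exposes a `type` attribute plus `thinking` or `text` for its body. The `render` function and the `[thinking]` label are hypothetical, and the demo uses stand-in objects so it runs without an API call.

```python
from types import SimpleNamespace


def render(content, show_trace: bool = False) -> str:
    """Join text blocks; include thinking blocks only when show_trace is set."""
    parts = []
    for block in content:
        if block.type == "thinking" and show_trace:
            parts.append(f"[thinking]\n{block.thinking}")
        elif block.type == "text":
            parts.append(block.text)
    return "\n\n".join(parts)


# Stand-in blocks shaped like msg.content (an assumption for illustration).
demo = [
    SimpleNamespace(type="thinking", thinking="Try small cases first."),
    SimpleNamespace(type="text", text="The answer is 42."),
]
print(render(demo))  # The answer is 42.
print(render(demo, show_trace=True))
```

Defaulting `show_trace` to `False` keeps the trace out of end-user views unless a caller opts in, for example behind a debug flag or an expandable panel.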

05 — Pairs well with

Combine.