Claude on WholeTech network
Async & cheaper

Batch API shipping

Submit large jobs asynchronously, pick them up later, pay roughly half the price of synchronous calls.

01 — Why it exists

Half-price, no rush.

Some workloads don't need answers in real time — overnight backfills, classification of months of records, summary generation across an archive. The Batch API lets you submit a list of message-create requests, walk away, and pick them up when the batch finishes. Cost is about 50% of synchronous calls.

02 — When to use it

The decision rule.

Reach for batch when…

Latency doesn't matter. The job is > 100 calls. You can wait minutes-to-hours for results. You care about cost.

Skip it when…

The user is waiting. Total volume is tiny. You need to chain results together (one call's output feeds the next) — that's a synchronous loop.

03 — The shape

Requests in, results out.

  1. Build a JSONL file, one request per line, each with a custom_id you'll use to match results back.
  2. Submit the batch. The API returns a batch_id.
  3. Poll status until it's ended.
  4. Download results. A JSONL file in the same order, each row keyed by your custom_id.
04 — Patterns

What batch is great for.

05 — Operational habits

Don't lose a batch.