Cloud setup — analysis, alternatives & what we left out

Part 1

The blocks, re-examined

The guide is three self-contained blocks, not a six-step spine. Each made one choice for clarity. Here's the reasoning and the roads not taken.

A cloud VM is just a Linux server

We chose: create an always-on Ubuntu VM in a provider's console, then treat it exactly like the Linux guide.

Why: sameness is the whole point. Once you SSH in, there's no special "cloud" version of these tools, so the Linux guide is your real manual and the cloud is just the host.

Alternatives worth knowing

Spot / preemptible instances — the same VM at a steep discount (often 60–90% off) in exchange for the provider being able to reclaim it. Great for batch jobs and GPU experiments where an interruption is fine; bad for anything that must stay up. The single biggest lever on cost.
Managed container services instead of a raw VM: Cloud Run (GCP), Fargate (AWS), Container Apps (Azure). You hand them a container and they run it, with no OS to patch. Cloud Run and Container Apps can scale to zero when idle; Fargate bills for as long as a task is running. Less "feels like a Linux box," more "deploy and forget."
Which provider you pick does matter: DigitalOcean is the simplest and has the most predictable pricing; GCP, AWS, and Azure give you the managed AI backends (Block 2) in the same account, so if you'll use Vertex AI or Bedrock anyway, start there.
Because the VM is Linux, everything in the Linux guide applies verbatim — Claude Code, Ollama, Tailscale. The cloud part ends the moment you connect.

Managed model backends

We chose: the big-three managed backends (Vertex AI, Bedrock, Azure), plus OpenRouter as the easy on-ramp, and showed how to point Claude Code at Bedrock/Vertex.

Why: these are the paths a business uses to route AI billing and data through its own cloud account.

Alternatives & notes

Direct provider APIs (Anthropic, OpenAI, Google) vs. OpenRouter — direct gives you the cleanest billing and earliest model access; OpenRouter gives you one key that reaches hundreds of models, ideal for comparison-shopping without a signup per vendor. Different tools for "I know what I want" vs. "let me try everything."
Claude Code at a cloud backend is the real tie-in: set CLAUDE_CODE_USE_BEDROCK=1 (plus AWS region + credentials) or CLAUDE_CODE_USE_VERTEX=1 (plus GCP project + region) before launching. That routes the agent's engine through your own cloud account instead of a personal subscription.
Model-agnostic agents like Hermes (Nous Research) are built to point at whichever backend you pick — Bedrock, Vertex AI, OpenRouter, or a direct API — so they pair naturally with this block (see Left out, below).
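The backend switch above can be sketched as two small shell helpers. This is a minimal sketch, not the official setup: the helper names, regions, profile, and project ID are placeholders, and AWS_REGION, AWS_PROFILE, CLOUD_ML_REGION, and ANTHROPIC_VERTEX_PROJECT_ID are the variable names as we understand Claude Code's Bedrock and Vertex documentation, so check them against the current docs.

```shell
# Sketch: route Claude Code through your own cloud account.
# Helper names and all default values are placeholders, not recommendations.

use_bedrock() {
  # Clear the Vertex switch so only one backend is selected.
  unset CLAUDE_CODE_USE_VERTEX
  export CLAUDE_CODE_USE_BEDROCK=1
  export AWS_REGION="${1:-us-east-1}"      # placeholder region
  export AWS_PROFILE="${2:-default}"       # credentials come from this AWS profile
}

use_vertex() {
  # Clear the Bedrock switch so only one backend is selected.
  unset CLAUDE_CODE_USE_BEDROCK
  export CLAUDE_CODE_USE_VERTEX=1
  export ANTHROPIC_VERTEX_PROJECT_ID="$1"  # your GCP project ID
  export CLOUD_ML_REGION="${2:-us-east5}"  # placeholder region
}

# Usage (in the shell you launch the agent from):
#   use_bedrock us-west-2 work-profile && claude
#   use_vertex my-gcp-project us-east5 && claude
```

Keeping the two switches mutually exclusive avoids the confusing case where both are set and it is unclear which account gets billed.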

GPU clouds for big models

We chose: rent a GPU by the hour (RunPod / Lambda / Vast), install Ollama + Tailscale, pull a big model.

Why: occasional serious horsepower without buying a $5,000 card you'd rarely use.

Alternatives & trade-offs

RunPod vs. Lambda vs. Vast — they're not interchangeable. Vast.ai is a marketplace (cheapest, most variable, host machines you don't control); RunPod is the friendly middle with templates and persistent volumes; Lambda leans toward bigger reserved/cluster jobs. For a casual "run a model for an afternoon," RunPod is the gentlest start.
Destroy vs. stop is the trade-off that costs people money: a stopped GPU instance on many clouds still keeps (and bills for) its storage and can be hard to restart on the same hardware. Destroying it returns you to zero but means re-installing next time. For occasional use, destroy — the re-install is a one-liner.
The same VM in Block 1, ordered with a GPU, is a fourth option — heavier to set up, but it lives in your main cloud account alongside the managed backends.
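Since "destroy, then re-install" is the recommended habit, it helps to keep the re-install in one place. A rough sketch, assuming a Debian/Ubuntu-style GPU box with curl available; the helper names are ours, and the model tag is whatever you choose from the Ollama library. On container-based hosts you may already be root (drop sudo) and Tailscale may need extra setup.

```shell
# Sketch: bootstrap a freshly rented GPU box (hypothetical helper names).
# DRY_RUN=1 prints each command instead of executing it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

bootstrap_gpu_box() {
  model="${1:?usage: bootstrap_gpu_box <ollama-model-tag>}"
  # Download each installer to a file first so it can be inspected before it runs.
  run curl -fsSL https://tailscale.com/install.sh -o /tmp/tailscale-install.sh
  run sh /tmp/tailscale-install.sh
  run sudo tailscale up   # prints a login URL to join your tailnet
  run curl -fsSL https://ollama.com/install.sh -o /tmp/ollama-install.sh
  run sh /tmp/ollama-install.sh
  run ollama pull "$model"
}

# Usage: preview first, then run for real by dropping DRY_RUN.
#   DRY_RUN=1 bootstrap_gpu_box <model-tag>
#   bootstrap_gpu_box <model-tag>
```

Saving the installers to a file instead of piping them straight into sh is a deliberate choice here: it matches the supply-chain advice in Part 5.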

Optional · cloud-native editors

The editor route — cloud-native dev environments

In the cloud the editor question splits in a way it doesn't anywhere else: you can bring the editor (your laptop's VS Code talks to the cloud VM over Remote-SSH), or you can buy the editor as a service (GitHub Codespaces, GitPod, etc. — full IDE in a browser tab, billed by the minute). Both work; they trade ownership for convenience and add a new failure mode the other OS guides don't have: the bill. Here's the honest breakdown of every path that's still alive in May 2026.

The four real paths in the cloud

Path A — VS Code on your laptop + Remote-SSH into the cloud VM. The Linux-guide pattern, lifted to the cloud. Install VS Code locally, install the Remote Development extension pack, Connect to Host… the cloud VM's IP (or Tailscale name). The cloud bill stays at "one VM-hour"; you don't pay extra for the editor.
Path B — GitHub Codespaces. The cloud-native fully-managed answer: GitHub spins up a Linux box, runs full VS Code in your browser against it, persists it between sessions. Real terminal, real extensions, real Copilot, no local install. Bills per active minute; auto-stops when idle. Locks you to GitHub.
Path C — code-server inside the cloud VM you already pay for. Run code-server on the same VM as your agents, reach it in a browser over Tailscale. You're already paying for the VM-hour; the editor adds no new bill. You own everything.
Path D — GitPod, AWS / Google / JetBrains cloud editors. Same idea as Codespaces, different vendor. GitPod is the strongest non-GitHub option; the major-cloud editors come and go (AWS Cloud9 was deprecated in 2024; Google Cloud Workstations and JetBrains Space are still alive). Niche unless your shop standardizes on one.

Set up Path A — laptop VS Code + Remote-SSH

Open SSH on the VM — already done if you followed Block 1. Use SSH key auth and a non-root user (see Part 2 and Part 5).
Install VS Code on your laptop + the official Remote Development extension pack. Add the VM to your ~/.ssh/config for one-click connect.
If the VM is behind Tailscale (recommended), connect by tailnet hostname — no public SSH needed.
Open a real project folder on the VM. Every AI extension scopes context to the open workspace.
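The ~/.ssh/config step can be sketched like this. The function name, host alias, hostname, user, and key path are all placeholders for illustration; adjust them to your VM.

```shell
# Sketch: append a Host block for the cloud VM to an SSH config file.
# The alias, hostname, user, and key path are placeholders you supply.
add_ssh_host() {
  alias_name="$1"; host_name="$2"; user_name="$3"
  config_file="${4:-$HOME/.ssh/config}"
  mkdir -p "$(dirname "$config_file")"
  cat >> "$config_file" <<EOF

Host $alias_name
    HostName $host_name
    User $user_name
    IdentityFile ~/.ssh/id_ed25519
    IdentitiesOnly yes
EOF
  # ssh refuses config files that other users can write to.
  chmod 600 "$config_file"
}

# Usage:
#   add_ssh_host cloud-vm my-vm.example-tailnet.ts.net deploy
# then pick "cloud-vm" in VS Code's Remote-SSH host list, or run: ssh cloud-vm
```

Using the tailnet hostname as HostName means the same entry keeps working after you close public SSH.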

Set up Path B — GitHub Codespaces

Open any GitHub repo → Code → Codespaces → Create codespace on main. A full VS Code in a tab boots in 30–60 seconds.
Set the machine size in your account's Codespaces settings — 2-core is plenty for most work; bigger costs more per minute. Cap your monthly spend in Billing → Spending limits before you start.
Check that "Default idle timeout" is 30 minutes (the default) or shorter in Codespaces settings. This is the single most important step to prevent runaway billing.
Install Copilot, Cline, or Claude Code for VS Code from the extensions panel; they work normally because Codespaces is a real VS Code.
Add .devcontainer/devcontainer.json to your repo to pin the toolchain, extensions, and post-create scripts — your next codespace boots into a configured environment, not a blank Linux.
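For the devcontainer step, a starter file might look like the sketch below. The function name, the "project-dev" name, the base image, the single extension ID, and the post-create command are illustrative assumptions; swap in your own toolchain.

```shell
# Sketch: write a starter .devcontainer/devcontainer.json into a repo.
# Image, extension list, and postCreateCommand are placeholders.
write_devcontainer() {
  repo_dir="${1:-.}"
  mkdir -p "$repo_dir/.devcontainer"
  cat > "$repo_dir/.devcontainer/devcontainer.json" <<'EOF'
{
  "name": "project-dev",
  "image": "mcr.microsoft.com/devcontainers/base:ubuntu",
  "customizations": {
    "vscode": {
      "extensions": ["GitHub.copilot"]
    }
  },
  "postCreateCommand": "echo 'install your toolchain here'"
}
EOF
}

# Usage, from the repo root:
#   write_devcontainer . && git add .devcontainer && git commit -m "Add dev container"
```

Committing the file is what makes the next codespace boot into the same environment for every collaborator.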

Set up Path C — code-server in your cloud VM

SSH into the VM, install code-server: curl -fsSL https://code-server.dev/install.sh | sh.
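Bind to a private address, never 0.0.0.0: edit ~/.config/code-server/config.yaml and set bind-addr. The default 127.0.0.1:8080 is reachable only from the VM itself; to reach the editor from other tailnet devices, bind to the VM's Tailscale IP instead (tailscale ip -4 prints it).
Reach it over Tailscale only. Open http://your-vm-tailnet-name:8080 from a tailnet device. Never publish 8080 with a public name; never put it behind a cloud reverse proxy with a public address.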
Run it as a non-root user. Enable as a user systemd service (systemctl --user enable --now code-server); never as root.
Install Cline, Continue, or Claude Code for VS Code from the extensions panel — code-server is a full VS Code, so they work.
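The bind-address step is easy to get wrong by hand, so here is a small sketch that rewrites it and refuses the all-interfaces case. The function name is ours. Note that a loopback bind is reachable only from the VM itself, so the second usage line binds to the VM's Tailscale IP, which is what makes the tailnet URL work.

```shell
# Sketch: set code-server's bind address in its YAML config.
# Refuses 0.0.0.0 and [::], which would listen on every interface.
set_bind_addr() {
  addr="$1"
  config_file="${2:-$HOME/.config/code-server/config.yaml}"
  case "$addr" in
    0.0.0.0:*|"[::]":*)
      echo "refusing to bind to all interfaces: $addr" >&2
      return 1 ;;
  esac
  mkdir -p "$(dirname "$config_file")"
  touch "$config_file"
  # Drop any existing bind-addr line, then append the new one
  # (key order does not matter in this YAML file).
  grep -v '^bind-addr:' "$config_file" > "$config_file.tmp" || true
  echo "bind-addr: $addr" >> "$config_file.tmp"
  mv "$config_file.tmp" "$config_file"
}

# Usage on the VM:
#   set_bind_addr 127.0.0.1:8080              # loopback only
#   set_bind_addr "$(tailscale ip -4):8080"   # reachable on the tailnet only
#   systemctl --user restart code-server
```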

AI in the cloud editors

Codespaces + Copilot is the most polished combo, but Copilot is a separate subscription on top of the Codespaces minutes — read your bill carefully.
Codespaces + Claude Code for VS Code works fine; the extension talks to Anthropic's API the same way your laptop would. Your Anthropic bill stays separate from the GitHub bill.
code-server + Cline pointed at Bedrock is the cleanest "your cloud, your bill" setup: Cline uses the cloud account you set up in Block 2, so models, compute, and storage all hit one cloud invoice. Pointing it at OpenRouter works too, but that is a separate bill.
Don't run the heavy local-model loop inside Codespaces. A 2-core codespace is not where you want a 70B-parameter model to live; put Ollama on a GPU instance from Block 3 instead.

Pros — what the cloud editor route gives you

Zero local install. Codespaces and code-server let you start a new project from any browser anywhere.
Per-project environments. Each repo can have its own pinned toolchain (devcontainer.json) — no more "works on my laptop."
Spin up and tear down at will. A bug-bash session on a heavy VM can cost as little as $0.50 if you remember to stop it; Codespaces works the same way, billed by the minute.
Real AI inside, not a sandboxed web build. Codespaces and code-server both run full VS Code extensions; Copilot, Cline, Claude Code all work properly.
Tailscale + code-server = a private cloud editor your team can share without ever exposing a port to the open internet.
Remote-SSH is the lowest-risk on-ramp — same editor as your laptop, just pointed at a cloud box.

Cons — what the cloud editor route costs you

The bill is a new failure mode. Every other OS analysis on this site has "RAM" as the worst-case cost; in the cloud it's dollars. A forgotten Codespace at $0.18/hour costs about $130/month if left running 24/7 (0.18 × 24 × 30 = $129.60). Set idle timeouts and spending caps before doing anything else.
Vendor lock-in by design. Codespaces ties to GitHub; GitPod ties to GitPod; AWS / GCP cloud editors tie to those clouds. Path A and Path C are the portable answers.
Latency depends on region. A Codespace in us-east from Hot Springs is fine; from Lisbon it's noticeable. code-server in your own VM lets you pick the region.
Data residency. Your code is in someone else's data center. For private repos that's fine; for regulated work it's a real conversation with legal.
Extension supply-chain risk is bigger in the cloud — a malicious extension in a Codespace can read everything the Codespace can read, including any cloud credentials baked into the dev container.
Cold starts. Codespaces takes 30–60 seconds to boot from cold; code-server is instant once the VM is running but you still pay for VM uptime.
It's not designed AI-first. Cursor and Windsurf don't have a "Cursor Codespaces" offering yet — if you want editor-first AI in the cloud, you mostly run Cursor on your laptop and Remote-SSH into the cloud VM (Path A).
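The runaway-bill figure in the first item is just hourly rate × hours × days. A small sketch of that arithmetic (the function name and the 30-day month are our assumptions, not anything a provider defines):

```shell
# Sketch: what an hourly-billed resource costs per month.
monthly_cost() {
  # $1 = hourly rate in dollars, $2 = hours per day (default 24),
  # $3 = days per month (default 30)
  awk -v rate="$1" -v hours="${2:-24}" -v days="${3:-30}" \
    'BEGIN { printf "%.2f\n", rate * hours * days }'
}

monthly_cost 0.18        # forgotten Codespace, 24/7: prints 129.60
monthly_cost 0.18 2 22   # two hours on each of 22 working days: prints 7.92
```

The gap between the two numbers is the whole argument for idle timeouts: the same machine costs roughly sixteen times more when nobody remembers to stop it.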

When to pick which

Stay terminal-only inside the VM if: the work is mostly autonomous agent runs and you want the cheapest, simplest, most defensible cloud bill. Most agent-heavy workflows don't need an editor at all.
Use Path A (Remote-SSH from your laptop) if: you already pay for a VM and want one editor that follows you between local and cloud projects. The most portable, lowest-bill option.
Use Codespaces if: the project is already on GitHub, you want zero-setup onboarding for collaborators, or you're working from a Chromebook / iPad / locked-down work machine. Watch the bill.
Use code-server in your VM if: you want Codespaces' UX without GitHub lock-in, or your work already runs in your own cloud account (Bedrock, Vertex), where Codespaces would add another invoice.
Use GitPod / cloud-vendor editors if: your team standardizes on one. Otherwise the smaller they are, the higher the risk of being deprecated like Cloud9.
Don't: install a desktop GUI on a cloud VM just to run VS Code there. Use Remote-SSH or code-server instead — never X11 over the internet.

☁ The honest take: the cloud is the one platform where "editor" and "bill" are the same conversation. Codespaces is delightful and dangerous in equal measure, so set the idle timeout and the spending cap before the first session. Path A (laptop + Remote-SSH) and Path C (code-server in your VM) are the calm, portable, predictable answers; Codespaces is the answer when convenience genuinely beats ownership. Pick one for a quarter, watch the bill, then decide.

Part 2

What we left out — and why

The guide is deliberately three clean blocks. That clarity has a cost: real omissions — and in the cloud, the omissions are mostly the things that protect your wallet and your data. Here they are, honestly, with the reason each was cut.

Left out	What it is	Why it was cut
Hermes Agent	Nous Research's model-agnostic coding agent — points at Bedrock, Vertex, OpenRouter, or a direct API	Was an oversight in v1. Now noted on the cloud page as the natural pairing for managed backends. A good reminder the big-three SDKs aren't the whole field.
Cost controls / budget alerts	Spending caps and email alerts you set in the provider console	The #1 real cloud mistake is a runaway bill. The guide warns about it in prose; it deserved to be its own step. This is the single most important thing on the whole topic; see Part 5 below.
IAM least-privilege roles	Giving each user/tool only the permissions it needs, instead of full owner access	Cut to keep first-login simple. But daily work on root/owner credentials is how one leaked key becomes a total account takeover.
Security groups / firewall rules	The cloud's per-VM firewall that decides which ports are reachable from the internet	The provider's defaults are often more open than you'd want. Skipped for clarity, but leaving SSH or Ollama open to the world is a real, common exposure.
Cloud secret managers	Secret Manager (GCP), Secrets Manager / Parameter Store (AWS), Key Vault (Azure)	The guide puts keys in plaintext env vars to stay simple. A managed secret store is the grown-up answer — encrypted, audited, rotatable.
The privacy / data-residency tradeoff	What it means that your code and prompts leave your premises when you call a managed backend	Glossed over. Routing through Bedrock/Vertex is great for billing — but your data now travels to and is processed in someone else's data center. A real tradeoff, not a free lunch.
Remembering to DESTROY GPU instances	Tearing down (not just stopping) the hourly GPU when you're done	Warned in Block 3, but worth repeating as its own line: a "stopped" instance can keep charging. Destroy it to stop paying and shrink your attack surface.

🧭 The pattern: we cut anything that wasn't needed to get a working cloud setup off the ground. The price is that almost every omission is a guardrail: budgets, IAM, firewalls, secret managers. In the cloud, "production-ready" is mostly about putting those guardrails back. This analysis page is where that depth lives.

Part 3 · what the AI crowd is actually using

Trending & cool — tools worth a look

Scanning the developer conversation on X and GitHub in May 2026, here's what's hot that the guide doesn't yet mention. All of them run on Linux, which is what the cloud VM from Block 1 is.

OpenCode

The open-source CLI agent everyone's talking about — 150K+ stars, ~6.5M monthly devs. LSP integration, multiple parallel sessions, shareable session links. The strongest "free, bring-your-own-model" alternative to Claude Code.

Warp 2.0

A terminal that's also an agent cockpit — runs Claude Code, Codex, and others in one windowed UI with panes. Nice if the bare terminal feels stark.

Goose · OpenHands

Goose (from Block) and OpenHands are open-source autonomous agents that take a goal and run a long multi-step job. The frontier of "set it and walk away."

GitHub Spec Kit

93K+ stars. A "spec-driven development" workflow that teaches any agent (Claude Code, Copilot, Gemini, etc.) to plan before it codes. Tessl and Kiro play in the same space.

MCP servers

The plug-ins that matter: chrome-devtools-mcp (let an agent drive Chrome), filesystem, GitHub, database connectors. This is the fastest-moving, highest-leverage area right now.

Qwen 3.6-Plus (local)

An agentic open model with a 1M-token context and MCP-native tool use — a serious local option for Ollama if your hardware can handle it.

⚖️ Why these aren't in the main guide: they're powerful but move fast, need more setup, or assume comfort with the basics. The guide's job is to get you to "it works." This list is your "now go further." Honorable mentions from the same conversation: Amazon Q Developer CLI, Sourcegraph Amp, Qwen Code, Crush, Plandex, Kimi CLI, Aider, Cline.

Part 4 · the next wave

Getting ready for Mythos

Mythos is Anthropic's first model specialized for one domain: defensive cybersecurity. Announced April 7, 2026 as the engine of Project Glasswing, it has already found a 27-year-old vulnerability in OpenBSD and bugs in FFmpeg. It is invitation-only ($25 / $125 per million tokens) and has shipped to 12 founding orgs and 40+ critical-infrastructure partners; it is not a download.

So "getting ready" isn't an install — it's preparing your environment so that when domain-specialized models (Mythos and the wave behind it) open up, you can point them at something useful:

The cloud is where these models actually live. Specialized models like Mythos are hosted on managed backends — Vertex AI and Bedrock — and heavy security analysis runs at scale on cloud compute, not a laptop. So cloud readiness (IAM, budgets, MCP wired to your repos) is the most direct path to using Mythos-class models the moment access opens. The blocks on this very page are the on-ramp.
Get on MCP now. Specialized models reach your code and infrastructure through MCP. A working MCP setup today is the plug Mythos-class tools will use tomorrow.
Have your code in Git, reachable over Tailscale. A security model is only useful pointed at your systems. Repos in Git + machines on your tailnet = ready to analyze.
Tighten your own security first (Part 5). The irony: you can't safely run a vulnerability-finding model on a sloppy machine. Hardening is the prerequisite, not the reward.
Keep a clean Anthropic account & current Claude Code. Access to new models lands through the same account and tooling you already use.
Watch Glasswing. If you run anything critical (a business, rentals, client sites), the initiative is where defensive tooling will surface first.

🔭 Honest take: as an individual you won't get Mythos itself soon. What you can do is build the habits (MCP, Git, a private network, a hardened cloud account) that make any future specialized model immediately useful. Because these models run in the cloud, the cloud setup is the most direct preparation of all (see the tool map).

Part 5 · don't skip this

Securing cloud compute — the part most guides skip

In the cloud you're renting machines that bill by the hour, holding credentials that can spend money, and exposing services to the public internet by default. That's a lot of ways to get hurt. Here's how to keep it from biting you — cloud specifics first, then the universal rules that apply everywhere.

⚠ Real incident (Feb 2026): Check Point Research disclosed that a malicious config could redirect Claude Code's traffic via the ANTHROPIC_BASE_URL setting and exfiltrate your API key in plaintext. Anthropic patched it before disclosure, but the lesson stands: keep Claude Code updated, install only from official sources, and be suspicious of any config that reroutes where a tool "phones home." This matters double in the cloud, where that key may be a cloud credential that can spin up paid resources.

#1: bill shock — set budget alerts FIRST

The most common cloud "incident" is a runaway bill, not a breach. Before you launch anything, go into the provider console and set a budget alert and, where available, a spending cap — so you're emailed (or cut off) long before a forgotten GPU or a misconfigured loop becomes a four-figure surprise.
This is the single highest-value habit on the whole topic. Do it on day one, on every account, before the first VM.
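As one concrete instance of "set a budget alert first," here is a sketch for GCP using the gcloud CLI. The function name, display name, amount, and thresholds are placeholders, and the flags are as we understand the gcloud billing budgets reference; verify them against your installed version. Keep in mind that a GCP budget sends alerts; it does not cut off spending by itself.

```shell
# Sketch: create a monthly GCP budget that alerts at 50% and 90% of the limit.
# Display name, amount, and thresholds are placeholders.
create_gcp_budget() {
  billing_account="$1"   # ID as listed by: gcloud billing accounts list
  amount="${2:-50USD}"   # placeholder monthly limit
  gcloud billing budgets create \
    --billing-account="$billing_account" \
    --display-name="monthly-guardrail" \
    --budget-amount="$amount" \
    --threshold-rule=percent=0.50 \
    --threshold-rule=percent=0.90
}

# Usage:
#   create_gcp_budget <billing-account-id> 25USD
```

AWS and Azure have their own equivalents in the billing console; the habit matters more than the provider.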

IAM — least privilege

Don't use root / owner credentials for daily work. Create a normal user (or service account) with only the permissions the job needs. Owner-level keys turn one leak into a total account takeover — including the ability to rack up bills.
Give helpers and tools their own scoped identities, never your master credentials.

Security groups & firewall — close the public ports

Never open SSH (port 22) or Ollama (port 11434) to 0.0.0.0/0 (that's the whole internet). The provider's default firewall is often more open than you'd want; tighten it.
Put the VM on Tailscale and reach it over the tailnet, then close the public ports entirely. The machine becomes invisible to internet scanners while staying fully reachable to you. There's no password on Ollama by default — exposing 11434 publicly is handing your model server to anyone.
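A quick way to see what the VM itself is exposing is to look at its listening sockets. The sketch below (function name ours) reads the output of ss -tlnH and flags SSH or Ollama listeners bound to every interface. It is only half the picture: whether those ports are reachable from the internet is decided by the cloud firewall, so treat a hit as a prompt to check the security group, not as proof of exposure.

```shell
# Sketch: flag listeners on SSH (22) or Ollama (11434) bound to every interface.
# Reads `ss -tlnH` style lines on stdin, prints the offending local addresses,
# and returns non-zero if any were found.
flag_public_listeners() {
  awk '$4 ~ /^(0\.0\.0\.0|\[::\]|\*):(22|11434)$/ { print $4; found = 1 }
       END { exit found ? 1 : 0 }'
}

# Usage on the VM:
#   ss -tlnH | flag_public_listeners || echo "bound to all interfaces: check the firewall"
```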

Use the cloud's secret manager, not plaintext env

The guide's export … env vars are fine to learn with, but a key sitting in plaintext on the box leaks the moment that box is compromised or the shell history is read.
Store keys in the cloud's secret manager — Secret Manager (GCP), Secrets Manager / Parameter Store (AWS), Key Vault (Azure). Encrypted, access-audited, rotatable without redeploying.
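In practice, "use the secret manager" means fetching the key at launch instead of keeping it in a dotfile. A minimal sketch for GCP and AWS follows; the function names and secret names are placeholders, and ANTHROPIC_API_KEY is used here as an example of the variable a tool might read. The key still lives in the process environment while the tool runs, so this narrows the exposure rather than removing it.

```shell
# Sketch: load an API key from a cloud secret manager at launch time,
# so it never sits in a dotfile or in shell history.

load_key_from_gcp() {
  # $1 = secret name in GCP Secret Manager (placeholder)
  ANTHROPIC_API_KEY="$(gcloud secrets versions access latest --secret="$1")" || return 1
  export ANTHROPIC_API_KEY
}

load_key_from_aws() {
  # $1 = secret ID in AWS Secrets Manager (placeholder)
  ANTHROPIC_API_KEY="$(aws secretsmanager get-secret-value \
    --secret-id "$1" --query SecretString --output text)" || return 1
  export ANTHROPIC_API_KEY
}

# Usage:
#   load_key_from_gcp <secret-name> && claude
#   load_key_from_aws <secret-id> && claude
```

Rotating the key then means updating one secret, not hunting through every box that has it exported.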

DESTROY GPU instances — don't just stop them

When a GPU job is done, destroy the instance, not merely stop it. On many GPU clouds a "stopped" instance keeps its storage and can keep charging — and a parked machine is still attack surface.
Destroying it both stops the meter and removes one more thing that can be broken into. Re-installing next time is a one-liner.

Encryption & the privacy tradeoff

Turn on encryption at rest for your disks/volumes (most providers offer it as a checkbox) so data on a reclaimed or lost disk isn't readable.
Understand the real tradeoff of managed backends: routing Claude Code or any agent through Bedrock / Vertex / OpenRouter means your code and prompts leave your premises and are processed in someone else's data center. Great for billing and scale — but a genuine privacy and data-residency decision, especially for client or regulated data. Decide it on purpose.

Universal rules — keys, leash, supply chain

Never paste API keys into files you might share or commit. Anthropic + GitHub run secret-scanning that auto-deactivates leaked Claude keys — helpful, but don't rely on it. Rotate a key the moment you suspect it leaked; give helpers their own keys, never yours.
Keep the agents on a leash. Configure Claude Code to deny reading .env files, SSH keys, .secrets, and certificates — and not to read its own config. Start in ask-before-acting mode on a new box; don't run "auto/yolo" modes in a folder full of irreplaceable files. Work inside a project folder, not the whole disk.
Supply chain: npm install -g and curl … | sh run other people's code. Use only the exact official sources; don't paste install one-liners from random blogs or X replies. Keep everything updated.
Ollama behind Tailscale: setting OLLAMA_HOST=0.0.0.0:11434 exposes your model server to the network — only do it behind Tailscale, never a public IP.
Tailscale ACLs: in the admin console restrict who can reach port 11434 and who can SSH where. Turn on key expiry so a lost device drops off. Tailscale's new Aperture (alpha) can keep API keys off your devices entirely, behind your tailnet identity.
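The "keep the agents on a leash" rule can be written down as a project-level settings file. This is a sketch under assumptions: the function name is ours, and the .claude/settings.json path and the Read(...) deny-rule syntax are as we understand Claude Code's settings documentation, so confirm both against the current docs. The paths in the list are examples to adapt.

```shell
# Sketch: write project-level deny rules for Claude Code.
# Refuses to overwrite an existing settings file.
write_claude_deny_rules() {
  project_dir="${1:-.}"
  settings="$project_dir/.claude/settings.json"
  if [ -e "$settings" ]; then
    echo "not overwriting existing $settings" >&2
    return 1
  fi
  mkdir -p "$project_dir/.claude"
  cat > "$settings" <<'EOF'
{
  "permissions": {
    "deny": [
      "Read(./.env)",
      "Read(./.env.*)",
      "Read(./.secrets/**)",
      "Read(~/.ssh/**)"
    ]
  }
}
EOF
}

# Usage, from the project folder:
#   write_claude_deny_rules .
```

If a settings file already exists, merge the deny entries by hand instead of replacing it.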

✓ Good shape when: a budget alert is set before anything launches, daily work runs on a least-privilege identity (not owner), no public SSH or 11434 ports, the VM is Tailscale-only with ACLs + key expiry, keys live in a secret manager (not plaintext), disks are encrypted, GPU instances get destroyed when done, and you've made the data-residency call on purpose. That's a cloud account you can hand a vulnerability-finding model without flinching.
