Here is the problem Guppy exists to solve. I start a Claude Code session working through a real task — a migration, a refactor, a long test run — and then I close the laptop, or the wifi drops, or I switch from the desk to the couch to the train. On a laptop, that session is gone, or paused, or wedged. The work an agent does is long-running and mostly unattended; a laptop is the worst possible place to run it.

Guppy is my answer: one small, always-on cloud box that hosts the agent, so the agent’s life is decoupled from any client I happen to be looking at it through. I can start a session, walk away, and it keeps grinding. I reattach from a different device an hour later and it’s exactly where I left it, still running. Durability isn’t a feature of Guppy — it’s the entire reason it exists.

What Guppy actually is

Guppy is a t4g.medium EC2 instance in Sydney that never turns off. It is cheap to keep running (~$24/mo), and it holds the things that need to be always-on:

  • Durable terminal sessions. Every Claude Code session runs inside a tmux session on Guppy. tmux is the thing that makes “walk away and it keeps running” true — the session is a server-side process, and every client is just a viewport onto it.
  • The agent harness itself. Claude Code runs on Guppy, not on my Mac. This is the inversion that matters: the machine I’m typing on is a thin client; the machine doing the work is the one that’s always up.
  • A Tailscale node. Guppy joins one private tailnet with all my devices. Access is Tailscale SSH — no public SSH port, no keys to manage, authorised by tailnet identity. Guppy is never exposed to the public internet.
  • An orchestration surface. A set of verbs, written as recipes for the just command runner, that summon and destroy the other, more expensive boxes on demand (more on that below).
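
The attach-or-create behaviour behind the first bullet can be sketched in a few lines of Python. This is a minimal illustration, not Guppy’s actual wrapper: the function name, the session name, and the start directory are all hypothetical. It relies on tmux’s new-session -A flag, which attaches to the named session if it already exists and creates it otherwise.

```python
import shlex


def attach_or_create_argv(session: str, start_dir: str) -> list[str]:
    """Build the tmux command that attaches to `session`, creating it if needed."""
    if not session or any(ch in session for ch in ".:"):
        # tmux uses ':' and '.' as separators in target specs (session:window.pane),
        # so this sketch rejects them in session names up front.
        raise ValueError(f"unsuitable tmux session name: {session!r}")
    # -A: attach if the session exists, otherwise create it.
    # -c: working directory used when the session has to be created.
    return ["tmux", "new-session", "-A", "-s", session, "-c", start_dir]


if __name__ == "__main__":
    # Hypothetical session name and checkout path; printed rather than executed.
    print(shlex.join(attach_or_create_argv("api-refactor", "/home/me/code/api")))
```

Because the same command both creates and reattaches, every client can run it blindly; the session lives on the box whether or not anything is attached.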

The primitive I got wrong first

The first version of this platform used code-server — browser-based VS Code — as the editing surface, with the workspace mounted over sshfs. It worked. It was also wrong, and I scrapped the whole thing.

The realisation: for an agent harness, an IDE is the wrong primitive. I wasn’t editing files by hand and occasionally running a build; the agent was. What I actually needed was a durable terminal multiplexer and a way to reattach to it — not a rendered editor, not a filesystem mount. Everything downstream got simpler the moment I stopped trying to make it feel like an IDE and let it be what it was: a long-lived terminal that survives disconnection. The data-plane design, the networking, and the guardrails all carried straight over; only the surface I’d been staring at changed.

I mention this because it’s the kind of mistake that’s only obvious in retrospect. “Development in the cloud” pattern-matches to “VS Code in a browser,” and that instinct is exactly wrong when the developer is an agent.

The launcher: a browser door with no database

Reattaching to a named tmux session over SSH works, but it’s fiddly, and it doesn’t give you an overview. So Guppy runs a small Node service — the launcher — that renders the whole thing as a web page, fronted by Tailscale Serve so it’s reachable at a tailnet-only HTTPS URL from any of my devices, including the iPad.

The launcher shows a project → worktree → session tree. Each session is one durable tmux session; each project is a git checkout under ~/code; worktrees are the isolated branches I’m working on. You click a session and you’re in a live terminal, in the browser, attached to the real tmux process on the box.

The design decision I’m happiest with: there is no database. The launcher’s entire source of truth is reconciled live from git worktree list and tmux ls. It doesn’t maintain its own model of what exists and try to keep it in sync with reality — it derives the model from reality on every read. The only things it persists are the handful of facts those two commands can’t tell it: which non-~/code directories to also show, which scanned repos to hide, and which projects are pinned. Everything else is a projection of the actual git and tmux state. State you don’t store can’t drift.
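
To make “derive the model from reality” concrete, here is a rough sketch of the reconciliation step. It is written in Python for brevity even though the launcher itself is a Node service, and it is not the launcher’s code. Two things are assumptions of mine: the tmux listing is taken to come from tmux list-sessions with a format of session name, a tab, then session path, and a session is attached to a worktree when its working directory equals the worktree path. The real matching rule may well differ.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Worktree:
    path: str
    branch: Optional[str] = None            # None for a detached HEAD
    sessions: list[str] = field(default_factory=list)


def parse_worktrees(porcelain: str) -> list[Worktree]:
    """Parse the output of `git worktree list --porcelain`."""
    trees: list[Worktree] = []
    for line in porcelain.splitlines():
        if line.startswith("worktree "):
            trees.append(Worktree(path=line[len("worktree "):]))
        elif line.startswith("branch ") and trees:
            ref = line[len("branch "):]
            trees[-1].branch = ref.removeprefix("refs/heads/")
    return trees


def reconcile(porcelain: str, tmux_listing: str) -> list[Worktree]:
    """Build the worktree -> sessions view from live command output only.

    `tmux_listing` is assumed to hold one "<session_name>\\t<session_path>"
    pair per line. Sessions outside any known worktree are ignored here.
    """
    trees = parse_worktrees(porcelain)
    by_path = {tree.path: tree for tree in trees}
    for line in tmux_listing.splitlines():
        name, _, path = line.partition("\t")
        if path in by_path:
            by_path[path].sessions.append(name)
    return trees
```

Nothing in this sketch is cached between calls: every page load would rerun the two commands and rebuild the tree, which is what keeps the view from drifting.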

Control plane and data plane

Guppy is small on purpose. A t4g.medium is fine for running an agent, editing files, and orchestrating — but it would fall over under a Docker build or a 32-GB TypeScript compile. So the platform splits in two.

Guppy is the control plane. Always-on, cheap, runs the agent and the light edits.

The heavy boxes are the data plane — cattle, created and destroyed on demand:

  • A fat Linux box (r7g.xlarge, 32 GB) that is the Docker host and build runner. Guppy carries the Docker CLI but no daemon; a plain docker build from a checkout on Guppy runs on the fat box over a remote Docker context, with the checkout streamed as build context. No repo is ever cloned or synced onto the fat box. The box stops and starts around a persistent data volume, and stops itself after 30 minutes idle.
  • A Windows + IIS box for a legacy Kentico compile-and-test gate — the one workload that genuinely can’t build on Linux.
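
The remote Docker context in the first bullet comes down to two commands, sketched here as small Python helpers that only build the argument lists. The context name, SSH target, and image tag are placeholders I made up; the docker context create and docker --context invocations are standard Docker CLI usage, but this is not necessarily how Guppy wires them up.

```python
import shlex


def context_create_argv(name: str, ssh_target: str) -> list[str]:
    """One-time setup: define a context whose daemon is reached over SSH."""
    return ["docker", "context", "create", name,
            "--docker", f"host=ssh://{ssh_target}"]


def remote_build_argv(context: str, tag: str, build_dir: str = ".") -> list[str]:
    """Per build: the local CLI sends `build_dir` as context to the remote daemon."""
    return ["docker", "--context", context, "build", "-t", tag, build_dir]


if __name__ == "__main__":
    # "fatbox", "me@fatbox" and "api:dev" are placeholders, not real names.
    print(shlex.join(context_create_argv("fatbox", "me@fatbox")))
    print(shlex.join(remote_build_argv("fatbox", "api:dev")))
```

Making the context the default would let a bare docker build behave the same way, which is presumably what makes the build feel local from a checkout on Guppy.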

The rule is: run the agent on the control plane; dispatch heavy execution to the data plane. Claude Code and small edits live on Guppy. docker build, big compiles, and the Windows gate get thrown at boxes that are summoned for the job and torn down after. My actual laptop — an M2 with 16 GB — never has to touch any of it.

The agent triggers; deterministic primitives execute

This is the safety principle the whole orchestration model is built on, and I think it’s the right one for agent-driven infrastructure generally.

The agent decides when. The primitives decide how. Claude never issues a raw destructive cloud command. It calls a just verb — box-up, box-down, vpn-up — and that verb is a deterministic, health-gated recipe that a human could run identically by hand. box-up is idempotent (calling it when the box is already up is a no-op that just returns the hostname). --wait-healthy polls real readiness — is Docker actually reachable — not a fixed sleep. The agent is a convenience layer over the primitives, never a single point of failure and never holding the sharp end of the knife.
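
As an illustration of what “idempotent and health-gated” means in practice, here is a minimal Python sketch of the logic a verb like box-up could follow. The real verbs are just recipes, so this is not their implementation; the callables get_state, start, and is_healthy are hypothetical hooks standing in for the cloud API and the readiness probe, and the timeout and poll interval are arbitrary defaults.

```python
import time
from typing import Callable


def box_up(
    get_state: Callable[[], str],
    start: Callable[[], None],
    is_healthy: Callable[[], bool],
    hostname: str,
    timeout_s: float = 300.0,
    poll_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Bring a box up if needed and return its hostname once it is ready."""
    if get_state() != "running":
        start()                      # only issued when the box is not already up
    deadline = clock() + timeout_s
    while not is_healthy():          # real readiness probe, not a fixed sleep
        if clock() >= deadline:
            raise TimeoutError(f"{hostname} not healthy after {timeout_s}s")
        sleep(poll_s)
    return hostname
```

Calling it against a box that is already running and healthy starts nothing and returns immediately, so the agent can call it as often as it likes without side effects.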

The credentials follow the same discipline. The orchestrator’s IAM role can start, stop, create, and destroy resources tagged role=devorch — and nothing else. Not account-wide admin, just the tagged cattle. Least privilege isn’t a prompt instruction to the agent (“please be careful”); it’s a permission boundary the agent physically cannot cross.
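
A tag-scoped permission boundary of this kind might look roughly like the policy below, expressed as a Python dict for readability. This is a sketch, not the orchestrator’s actual policy: it covers only start, stop, and terminate on EC2 instances, the Sid is my own label, and the create path is deliberately left out because constraining it needs separate statements on the tags supplied at request time.

```python
import json


def tagged_cattle_policy(tag_key: str = "role", tag_value: str = "devorch") -> dict:
    """IAM policy allowing lifecycle actions only on instances carrying the tag."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "LifecycleOnTaggedInstancesOnly",   # illustrative label
                "Effect": "Allow",
                "Action": [
                    "ec2:StartInstances",
                    "ec2:StopInstances",
                    "ec2:TerminateInstances",
                ],
                "Resource": "arn:aws:ec2:*:*:instance/*",
                # The request is allowed only if the target already has the tag.
                "Condition": {
                    "StringEquals": {f"aws:ResourceTag/{tag_key}": tag_value}
                },
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(tagged_cattle_policy(), indent=2))
```

With a policy shaped like this, a request against an untagged instance fails at the API, whatever the agent was told or decided.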

Cost guardrails that don’t depend on the agent behaving

An always-on box plus summonable expensive boxes is a great way to wake up to a surprise bill if you’re not careful. The guardrails are deliberately independent of agent behaviour, because “the agent will remember to shut things down” is not a cost control.

The primary one is idle auto-stop, and it runs on each data-plane box, not in the orchestrator. A systemd timer on the fat box watches connection presence and CPU and self-stops after 30 minutes idle — but keys off real activity, so a two-hour build is not mistaken for “idle.” Even if the orchestrator forgets a box, even if I forget a box, the box stops itself. On top of that sits a budget-triggered stop: an AWS Budget action that fires a Lambda to stop the role=devorch resources at a hard monthly cap. The backstop lives below the layer that might fail.
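
The distinction between “30 minutes since boot” and “30 minutes since anything happened” is the whole point, so here is a small Python sketch of the decision a timer-driven check could make. The Sample type, the 10% CPU threshold, and the function names are assumptions for illustration; only the 30-minute limit and the connection-plus-CPU signal come from the description above.

```python
from dataclasses import dataclass

IDLE_LIMIT_S = 30 * 60      # from the text: stop after 30 minutes idle
CPU_BUSY_PCT = 10.0         # assumed threshold, not taken from the real setup


@dataclass
class Sample:
    at: float               # timestamp in seconds
    connections: int        # e.g. open SSH or Docker API connections
    cpu_pct: float          # CPU utilisation at sampling time


def is_active(sample: Sample) -> bool:
    """A box counts as active if anyone is connected or the CPU is busy."""
    return sample.connections > 0 or sample.cpu_pct >= CPU_BUSY_PCT


def should_stop(samples: list[Sample], now: float, boot_at: float) -> bool:
    """True once nothing has been active for IDLE_LIMIT_S.

    If no sample was ever active, idleness is measured from boot time.
    """
    last_active = max((s.at for s in samples if is_active(s)), default=boot_at)
    return now - last_active >= IDLE_LIMIT_S
```

A two-hour build with nobody attached keeps producing busy CPU samples, so the last-active timestamp keeps moving forward and the box stays up until 30 minutes after the build goes quiet.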

The exit-node trick for customer networks

One genuinely fiddly requirement: some work needs to reach a customer’s network through a VPN whose egress IP the customer has whitelisted. A full-tunnel VPN can’t coexist with Tailscale on macOS — but Linux can policy-route, so this is a job for Guppy.

The customer VPN client runs on Guppy. Guppy keeps its own traffic on its normal interface, but policy-routes traffic it’s forwarding on behalf of tailnet peers out through the VPN tunnel — acting as a Tailscale exit node. Any of my devices opts in with a single tailscale set --exit-node=guppy while it needs customer access, and opts back out afterward. The customer sees one stable whitelisted egress; I get customer-network access from any device on the tailnet without a VPN client on that device and without a dependency on my Mac being awake. The sensitive VPN credentials live on Guppy only, and the tunnel comes up only when it’s needed.
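
The policy-routing idea can be sketched as a short list of Linux commands, generated here by a Python helper so the moving parts are named. Treat this as an outline under assumptions rather than Guppy’s real configuration: the interface names tailscale0 and tun0, routing table 100, and rule priority 5000 are all guesses, the priority is assumed to sort ahead of any rules Tailscale installs itself (worth checking with ip rule show), and NAT and firewall handling are left out entirely.

```python
def exit_node_route_commands(
    tailscale_if: str = "tailscale0",   # assumed Tailscale interface name
    vpn_if: str = "tun0",               # assumed customer VPN interface name
    table: int = 100,                   # arbitrary dedicated routing table
    priority: int = 5000,               # assumed to precede Tailscale's own rules
) -> list[list[str]]:
    """Commands that send forwarded tailnet traffic, and only that, via the VPN."""
    table_s, priority_s = str(table), str(priority)
    return [
        # Let the kernel forward packets between interfaces.
        ["sysctl", "-w", "net.ipv4.ip_forward=1"],
        # The dedicated table has a single route: everything leaves via the tunnel.
        ["ip", "route", "add", "default", "dev", vpn_if, "table", table_s],
        # Only packets that arrived from tailnet peers consult that table;
        # traffic generated on the box itself does not match and uses the main table.
        ["ip", "rule", "add", "iif", tailscale_if,
         "lookup", table_s, "priority", priority_s],
        # Offer this machine to the tailnet as an exit node.
        ["tailscale", "set", "--advertise-exit-node"],
    ]


if __name__ == "__main__":
    for argv in exit_node_route_commands():
        print(" ".join(argv))
```

The key line is the ip rule: matching on the incoming interface is what lets the box keep its own traffic on the normal route while peers’ forwarded traffic takes the tunnel.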

What it grew into

Guppy started as “somewhere durable to run Claude Code.” It turned out that an always-on, agent-hosting, tailnet-connected box is a useful place to hang other always-on helpers, and it accreted them: a browser-automation agent the other agents can drive, an inbound-message listener, an on-demand public publishing tool, a shelf for clean copyable snippets. The control plane became a small personal-automation platform almost by accident, because the hard part — a trustworthy always-on box with a clean way to reach it — was already solved.

That’s the real lesson of the project. The valuable thing wasn’t any one feature. It was getting the foundation right: durable sessions decoupled from the client, a cheap always-on control plane that dispatches expensive work to summonable cattle, and an agent that triggers deterministic primitives rather than wielding raw infrastructure. Everything else is a tool hung on that frame.

If you’re running agents somewhere other than the laptop you’re staring at, I’d be glad to compare notes — j@jaym.cc.