Sandboxing Claude Code on macOS: Stop Yolo Mode Without Losing Power
Wrap every Claude Code spawn in sandbox-exec -f <profile> with a Seatbelt profile generated from capability tokens. Default policy: read everything, write only inside the workspace (plus /tmp), network on, subprocess on. Grant extra capabilities (Screen Recording, Accessibility, Full Disk Access) per agent. Every block decision lands in the audit log.
The first time you let Claude Code run git reset --hard on the wrong branch, you become permission-curious. The second time, you become sandbox-curious. The Anthropic permission prompt is great, but it has two failure modes: you click "yes" without reading because you're tired, or you pass --dangerously-skip-permissions because the prompts annoyed you that day.
macOS has a kernel-enforced sandbox available to every binary. It's called Seatbelt and it's been there since 2007. Apple's own App Sandbox is built on it. You can use it directly with sandbox-exec (the man page marks the tool as deprecated, but it still ships with macOS). This post is the production version of "I read the man page and built something useful."
Why people skip permissions (and what it costs)
--dangerously-skip-permissions exists because the prompt-on-every-tool-call gets old. The cost is real: an agent decides it needs to clean up node_modules across the whole filesystem, types the wrong path, and you have a bad evening. A sandbox prevents the worst outcome regardless of what the agent decides.
Seatbelt 101
A Seatbelt profile is an S-expression. You declare a default (allow or deny), then add specific allow/deny rules per operation. sandbox-exec -f profile.sb claude runs Claude Code under those rules. The kernel enforces them.
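(version 1)
(deny default)
; Let the agent fork and exec child processes (git, npm, ...)
(allow process-fork)
(allow process-exec*)
; Read anywhere; write only in the workspace and the temp directory
(allow file-read* (subpath "/"))
(allow file-write* (subpath "/Users/me/CelistraAgents"))
; /tmp is a symlink to /private/tmp, and Seatbelt matches the resolved path
(allow file-write* (subpath "/private/tmp"))
(allow network*)
(allow mach-lookup
  (global-name "com.apple.system.notification_center"))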
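That profile reads anywhere, writes only in the workspace and /tmp, runs subprocesses (so git, npm, etc. work), and has network access. Drop the network line and you've got an offline agent. Drop the process-fork and process-exec* lines and Claude can read but can't launch anything; in that case keep a process-exec rule scoped to the Claude binary itself, because sandbox-exec applies the profile before it execs the command.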
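A minimal Python sketch of the wrapping step, assuming the profile above has been saved to a file. The helper names sandboxed_argv and run_sandboxed are hypothetical; the only part taken from the post is the sandbox-exec -f <profile> invocation. run_sandboxed only works on macOS, where sandbox-exec exists.

```python
from __future__ import annotations

import subprocess


def sandboxed_argv(profile_path: str, command: list[str]) -> list[str]:
    """Prefix a command with sandbox-exec so it runs under the given profile."""
    if not command:
        raise ValueError("command must not be empty")
    return ["sandbox-exec", "-f", profile_path, *command]


def run_sandboxed(profile_path: str, command: list[str], cwd: str | None = None) -> int:
    """Run the command under Seatbelt and return its exit code (macOS only)."""
    result = subprocess.run(sandboxed_argv(profile_path, command), cwd=cwd)
    return result.returncode
```

Passing the command as a list (rather than a shell string) keeps agent-supplied arguments from being reinterpreted by a shell before the sandbox is applied.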
The capability-token model
Hand-writing Seatbelt profiles per agent is brittle. Capability tokens make it composable. Each token is a string with a meaning:
workspace_rw - read+write within the chosen workspace
read_anywhere - read everywhere on the filesystem
network - outbound network
subprocess - fork/exec child processes
screen_capture - Mach service for CG screen APIs
accessibility - AX events (drive other apps via AppleScript / control)
full_disk_access - bypass TCC fences (Mail, Messages, Safari history)
read:<path> · write:<path> · exec:<path> - granular path grants
privileged - escape hatch (no profile)
The default bundle is {workspace_rw, read_anywhere, network, subprocess}. That's the right policy 95% of the time: the agent can read your code, run tools, talk to the internet, and not destroy anything outside the workspace.
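One way to turn tokens into a profile can be sketched in Python. This is an illustration under assumptions, not Celistra's generator: render_profile, STATIC_RULES and quote are hypothetical names, and the token-to-rule mapping only covers tokens whose Seatbelt rules appear in the profile shown earlier. The screen_capture, accessibility and full_disk_access tokens are deliberately rejected here because the post does not say which rules they expand to.

```python
from __future__ import annotations

DEFAULT_BUNDLE = frozenset({"workspace_rw", "read_anywhere", "network", "subprocess"})

# Tokens that expand to fixed rules (taken from the example profile).
STATIC_RULES = {
    "read_anywhere": ['(allow file-read* (subpath "/"))'],
    "network": ["(allow network*)"],
    "subprocess": ["(allow process-fork)", "(allow process-exec*)"],
}

# Assumed mapping from granular grant prefixes to Seatbelt operations.
PATH_OPS = {"read": "file-read*", "write": "file-write*", "exec": "process-exec*"}


def quote(path: str) -> str:
    """Escape a path for use inside a Seatbelt string literal."""
    return '"' + path.replace("\\", "\\\\").replace('"', '\\"') + '"'


def render_profile(capabilities: set[str], workspace: str) -> str | None:
    """Return profile text, or None when `privileged` means "no profile"."""
    if "privileged" in capabilities:
        return None
    lines = ["(version 1)", "(deny default)"]
    for token in sorted(capabilities):  # sorted so output is deterministic
        if token == "workspace_rw":
            lines.append(f"(allow file-read* (subpath {quote(workspace)}))")
            lines.append(f"(allow file-write* (subpath {quote(workspace)}))")
        elif token in STATIC_RULES:
            lines.extend(STATIC_RULES[token])
        elif token.split(":", 1)[0] in PATH_OPS and ":" in token:
            kind, path = token.split(":", 1)
            lines.append(f"(allow {PATH_OPS[kind]} (subpath {quote(path)}))")
        else:
            # Fail closed: unknown tokens are an error, not a silent no-op.
            raise ValueError(f"no rule mapping for capability {token!r}")
    return "\n".join(lines) + "\n"
```

Calling render_profile(set(DEFAULT_BUNDLE), "/Users/me/CelistraAgents") yields a deny-default profile with the read, workspace-write, network and process rules; removing "network" from the set simply omits the (allow network*) line.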
Per-agent grants
A single config file (~/.celistra_capabilities.json in our daemon's case, but the pattern is generic) lists grants:
{
  "grants": [
    {
      "match": { "agentNamePattern": "screenshot-*" },
      "capabilities": ["screen_capture"]
    },
    {
      "match": { "command": "claude --dangerously-skip-permissions" },
      "capabilities": ["privileged"]
    }
  ]
}
Within a grant, every key in the match block must match (AND); across grants, every grant that matches contributes its capabilities (OR). An agent named screenshot-friday gets the default bundle plus screen_capture. An agent invoking --dangerously-skip-permissions still drops the sandbox, but you've consented to that explicitly, in version-controlled config, instead of by mashing y/y/y at 11pm.
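The matching rule can be sketched in Python as follows. The function names are hypothetical, and two details are assumptions the post does not confirm: agentNamePattern is treated as a shell-style glob, and command is compared as an exact string.

```python
from __future__ import annotations

from fnmatch import fnmatchcase

DEFAULT_BUNDLE = frozenset({"workspace_rw", "read_anywhere", "network", "subprocess"})


def grant_matches(match: dict, agent_name: str, command: str) -> bool:
    """AND within a grant: every key in the match block must match."""
    if not match:
        return False  # an empty match block grants nothing (fail closed)
    for key, expected in match.items():
        if key == "agentNamePattern":
            if not fnmatchcase(agent_name, expected):
                return False
        elif key == "command":
            if command != expected:
                return False
        else:
            return False  # unknown keys never match, so typos fail closed
    return True


def resolve_capabilities(config: dict, agent_name: str, command: str) -> set[str]:
    """OR across grants: start from the default bundle, add every matching grant."""
    caps = set(DEFAULT_BUNDLE)
    for grant in config.get("grants", []):
        if grant_matches(grant.get("match", {}), agent_name, command):
            caps.update(grant.get("capabilities", []))
    return caps
```

With the config above loaded via json.load, resolve_capabilities(config, "screenshot-friday", "claude") returns the default bundle plus screen_capture, while an agent with any other name gets only the defaults.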
The audit log
Every spawn writes a row: agent, command, capabilities resolved, profile path, exit code. Every kernel block decision (for example, the agent trying to write outside the workspace) writes a row too. The log is hash-chained, with each row's hash computed as sha256(prev_hash || row), and the chain is verified hourly.
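The sha256(prev_hash || row) construction can be sketched in a few lines of Python. This is not the API of the audit-chain library; the function names, the all-zero genesis hash, and the canonical JSON serialization of each row are assumptions made for illustration.

```python
from __future__ import annotations

import hashlib
import json

GENESIS = "0" * 64  # assumed prev_hash for the very first row


def row_hash(prev_hash: str, row: dict) -> str:
    """sha256(prev_hash || row), with the row serialized deterministically."""
    payload = json.dumps(row, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(prev_hash.encode("ascii") + payload).hexdigest()


def append_row(log: list[dict], row: dict) -> dict:
    """Append a row, linking it to the hash of the previous entry."""
    prev = log[-1]["hash"] if log else GENESIS
    entry = {"row": row, "prev_hash": prev, "hash": row_hash(prev, row)}
    log.append(entry)
    return entry


def verify_chain(log: list[dict]) -> bool:
    """Recompute every hash; any edited, removed or reordered row breaks the chain."""
    prev = GENESIS
    for entry in log:
        if entry["prev_hash"] != prev or entry["hash"] != row_hash(prev, entry["row"]):
            return False
        prev = entry["hash"]
    return True
```

Because each hash covers the previous one, changing an old row invalidates every later entry, which is what makes a periodic verify_chain pass meaningful. Truncating the tail of the log is not detected by this scheme alone; that needs the latest hash stored somewhere else.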
Practically, this means: when something weird happens, you can answer "did the sandbox engage?" with a query, not a guess.
Restart-to-apply for TCC permissions
macOS caches some TCC grants per process. If the user grants Screen Recording or Full Disk Access mid-session, a daemon that is already running keeps seeing "not granted" until it is restarted. The daemon should detect this and offer a restart button (we do), and it should persist the user's wizard step so the wizard resumes on the right screen after re-exec.
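A rough Python sketch of the persist-then-re-exec pattern, assuming a daemon that keeps its wizard position in a small JSON state file. The file layout, the wizard_step key, the default step name and the function names are all hypothetical. Whether an in-place exec is enough for macOS to pick up the new grant, or whether a full relaunch is needed, is not something this sketch establishes.

```python
from __future__ import annotations

import json
import os
import sys
from pathlib import Path


def save_wizard_step(state_file: Path, step: str) -> None:
    """Write the step to a temp file and rename it, so a crash can't leave half a file."""
    tmp = state_file.with_suffix(".tmp")
    tmp.write_text(json.dumps({"wizard_step": step}))
    os.replace(tmp, state_file)


def load_wizard_step(state_file: Path, default: str = "welcome") -> str:
    """Return the saved step, or the default when no usable state exists."""
    try:
        return json.loads(state_file.read_text()).get("wizard_step", default)
    except (FileNotFoundError, json.JSONDecodeError):
        return default


def restart_daemon(state_file: Path, step: str) -> None:
    """Persist the wizard position, then replace this process with a fresh copy."""
    save_wizard_step(state_file, step)
    os.execv(sys.executable, [sys.executable, *sys.argv])
```

On startup the daemon would call load_wizard_step and jump straight to that screen instead of starting the wizard from the beginning.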
What this doesn't stop
The sandbox is a hard barrier on filesystem and network access. It does not stop the agent from making bad commits, leaking secrets in git diff, or being prompt-injected into doing something silly with the capabilities you did grant. The sandbox is a containment story, not a behavior story.
Linux equivalent
For Linux, the equivalent is bubblewrap + namespaces, or firejail. The capability-token concept maps cleanly: same JSON, different profile generator. macOS is where the daily-driver dev box lives so we put the polish there first.
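To illustrate the "same tokens, different generator" idea, here is a hedged Python sketch that maps part of the default bundle onto a bubblewrap command line. The function name bwrap_argv is hypothetical and the mapping is an assumption, not Celistra's Linux backend. It only handles bundles that include read_anywhere, and it does not model the subprocess token, since bubblewrap's mount and namespace flags alone don't restrict fork/exec.

```python
from __future__ import annotations


def bwrap_argv(capabilities: set[str], workspace: str, command: list[str]) -> list[str]:
    """Build a bwrap invocation: read-only root, writable workspace, optional network."""
    if "read_anywhere" not in capabilities:
        raise ValueError("this sketch only handles bundles that include read_anywhere")
    argv = [
        "bwrap",
        "--die-with-parent",     # kill the sandbox if the launcher dies
        "--ro-bind", "/", "/",   # whole filesystem visible, read-only
        "--dev", "/dev",
        "--proc", "/proc",
        "--tmpfs", "/tmp",       # private, writable scratch space
    ]
    if "workspace_rw" in capabilities:
        # Later mounts win, so this makes just the workspace writable.
        argv += ["--bind", workspace, workspace]
    if "network" not in capabilities:
        argv.append("--unshare-net")  # new, empty network namespace
    return argv + list(command)
```

Dropping the network token adds --unshare-net, which is the Linux counterpart of removing the (allow network*) line from the Seatbelt profile.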
What we ship
This is exactly how Celistra spawns every agent. The default-on sandbox is the v2 daemon's posture. The user can flip the master switch off via the tray ("Sandbox: OFF") for an unrestricted run; the audit log shows when that happened and which agents ran while it was off. Per-agent grants live in ~/.celistra_capabilities.json.
FAQ
Does this work for Cursor too?
Yes — Cursor's terminal-spawned tooling runs under the same sandbox if you launch Cursor's tools through the daemon. Cursor's IDE itself isn't sandboxed (it needs broad access for editor features), but the agent processes it spawns can be.
Does sandbox-exec slow Claude Code down?
Negligibly. Seatbelt is kernel-implemented and the per-syscall check is sub-microsecond. The startup cost is one fork + profile compile (~5ms).
What about network in sub-shells?
Sub-shells inherit the sandbox profile. If the agent runs npm install, that npm gets the same sandbox. Drop the network capability and the install fails — same as if you'd unplugged the cable.
Can I exfiltrate data through DNS?
Yes, if network is allowed. The sandbox doesn't inspect packet contents. Pair it with an egress firewall (Little Snitch, pf rules) if exfiltration is in your threat model.
Where's the hash-chained audit code?
Open source as part of Ujex's audit subsystem — github.com/axysar/audit-chain in TS and Python. Same code Celistra uses.