Hacker News
Ask HN: How are you sandboxing coding agents?
For folks actually using these tools day-to-day:
What’s your default setup?
Have you had any "learned the hard way" moments?
What tradeoff (safety vs convenience vs parallelism) has mattered most in practice?
I'm less interested in theoretical best practices than what's actually holding up under real use.
netcoyote
- SandVault (https://github.com/webcoyote/sandvault) runs the AI agent in a low-privilege account
- ClodPod (https://github.com/webcoyote/clodpod) runs the AI agent inside a macOS VM
In both cases I map my code directories using shares/mounts.
I find that I use the low-privilege account solution more because it's easier to set up and doesn't require the overhead of a full VM.
sixhobbits
The most common problem is deleted files, but if you're using git and have backups it's barely noticeable.
estimator7292
I have more important things to waste my time on than writing absurd sandboxes to run AI agents without guardrails in. What even?
OJFord
Backups are great when you know you need to restore.
Wowfunhappy
Of course, AI is not a real person, and it does make mistakes that you or I probably would not. However, this class of mistake—deleting completely unrelated directories—does not appear to be a common failure mode. (Something like deleting all of ~ doesn’t count here—that would be immediately noticeable and could be restored from a backup.)
(Disclaimer: I’m not OP, and I wouldn’t run Claude with --dangerously-skip-permissions on my own system.)
gspetr
If it is a directory that gets deleted, then you can diff it with a previous state. If you don't control the state and don't know the surface area that you should observe, then yes, you're inviting trouble if agents run amok.
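A minimal sketch of the "diff it with a previous state" idea, assuming you can take a snapshot before the agent runs. The function names `snapshot` and `diff_snapshots` are made up for illustration; this only hashes regular files and ignores permissions and empty directories.

```python
import hashlib
from pathlib import Path


def snapshot(root: str) -> dict[str, str]:
    """Map each file path (relative to root) to the SHA-256 of its contents."""
    base = Path(root)
    return {
        str(p.relative_to(base)): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(base.rglob("*"))
        if p.is_file()
    }


def diff_snapshots(before: dict[str, str], after: dict[str, str]) -> dict[str, list[str]]:
    """Report files deleted, added, or modified between two snapshots."""
    return {
        "deleted": sorted(before.keys() - after.keys()),
        "added": sorted(after.keys() - before.keys()),
        "modified": sorted(
            k for k in before.keys() & after.keys() if before[k] != after[k]
        ),
    }
```

Take `snapshot(dir)` before the session, take it again afterwards, and pass both to `diff_snapshots`. This only helps for directories you thought to snapshot, which is the caveat about not knowing the surface area.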
jq-r
languid-photic
A big lesson for us is that you still need to be careful even in a sandbox.
We've been running Claude/Codex/Gemini in sandboxed YOLO mode and have seen some interesting bypass attempts. [1]
A few examples:
- created fake npm tarballs and forged SHA‑512s in our package‑lock.json
- masked failures with `|| true`, making blocked operations look successful
- cloned a workspace, edited the clone, then replaced the workspace with the clone to bypass file‑path deny rules
So, we’ve learned to default to verbose logging, patch bypasses as we see them, and try to keep iteration loops short.
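One way to catch the forged-hash case, sketched under assumptions: checking the lockfile against the local tarball is not enough if the agent forged both, so this compares `integrity` values against a trusted copy of package-lock.json (for example, the last committed version). It assumes the lockfile has a top-level `packages` map, and the function names are hypothetical.

```python
import base64
import hashlib
import json


def integrity_of(data: bytes) -> str:
    """Return an SRI-style string: "sha512-" plus the base64 SHA-512 digest."""
    digest = hashlib.sha512(data).digest()
    return "sha512-" + base64.b64encode(digest).decode("ascii")


def matches_integrity(data: bytes, expected: str) -> bool:
    """True if data hashes to the expected sha512 integrity string."""
    return expected.startswith("sha512-") and integrity_of(data) == expected


def changed_integrity(trusted_lock: str, current_lock: str) -> list[str]:
    """List packages present in both lockfiles whose integrity value differs."""
    trusted = json.loads(trusted_lock).get("packages", {})
    current = json.loads(current_lock).get("packages", {})
    return sorted(
        name
        for name, entry in current.items()
        if name in trusted
        and entry.get("integrity") != trusted[name].get("integrity")
    )
```

A changed hash for a package that is still in the trusted lockfile is not always malicious (a version bump changes it too), so treat the output as a list of entries to review rather than a verdict.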
kasey_junk
languid-photic
And actually, one way we've hardened our sandbox is by tasking agents with impossible tasks (within the sandbox), then analyzing and patching each workaround.
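For the "analyzing each workaround" step, here is a rough heuristic sketch (not anyone's actual tooling) that scans logged shell commands for failure masking such as the `|| true` pattern mentioned earlier in the thread. The regex and the name `masked_commands` are assumptions. It will miss other masking tricks and can false-positive on quoted strings.

```python
import re

# Matches "|| true", "|| :" and "|| exit 0", which all force a zero exit status.
MASKING = re.compile(r"\|\|\s*(?:true\b|:(?=\s|;|$)|exit\s+0\b)")


def masked_commands(commands: list[str]) -> list[str]:
    """Return the logged commands that appear to swallow a failure."""
    return [c for c in commands if MASKING.search(c)]
```

Feeding it the verbose command log after each run gives a short list of commands worth reading by hand.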
scuff3d
subsection1h
For anyone out there who thinks that containers are a sandbox...
There's a reason why gVisor exists:
https://github.com/google/gvisor#why-does-gvisor-exist
There's a reason why secureblue doesn't use containers:
https://news.ycombinator.com/item?id=45045190
There's a reason why Qubes OS doesn't use containers.
foreigner
solresol
jacob019
Check it out: https://github.com/jacobsparts/agentlib/blob/main/src/agentl...
The framework is all Python, but I used C for this helper. It uses unprivileged user namespaces to mount an overlay and run an arbitrary command. When the command finishes, it writes a tarball of the edits, which I use to create a unified diff. The framework orchestrates it all transparently, but the helper itself could be used standalone. Here's a short document about the sandbox in the context of its use in my project:
https://github.com/jacobsparts/agentlib/blob/main/docs/sandb...
I also have a version that uses SUID instead of unprivileged user namespaces, available by request.
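To illustrate the tarball-to-unified-diff step described above, here is a sketch using only the Python standard library. It is not the agentlib implementation: `diff_from_tarball` is a hypothetical name, it assumes the tarball holds full copies of edited text files at paths relative to the project root, and it does not handle deletions (overlay whiteouts) or binary files.

```python
import difflib
import io
import tarfile
from pathlib import Path


def diff_from_tarball(tar_bytes: bytes, original_root: str) -> str:
    """Build a unified diff between files under original_root and the
    edited copies stored in a tarball (regular text files only)."""
    chunks: list[str] = []
    with tarfile.open(fileobj=io.BytesIO(tar_bytes)) as tar:
        for member in tar.getmembers():
            if not member.isfile():
                continue
            new = tar.extractfile(member).read().decode("utf-8", errors="replace")
            old_path = Path(original_root) / member.name
            # A file missing from the original tree is treated as newly created.
            old = (
                old_path.read_text(encoding="utf-8", errors="replace")
                if old_path.is_file()
                else ""
            )
            chunks.extend(
                difflib.unified_diff(
                    old.splitlines(keepends=True),
                    new.splitlines(keepends=True),
                    fromfile=f"a/{member.name}",
                    tofile=f"b/{member.name}",
                )
            )
    return "".join(chunks)
```

Nothing is extracted to disk here: the edited contents are read from the archive in memory, so the original tree stays untouched until you decide to apply the diff.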
I often use Claude Code with --dangerously-skip-permissions, but every once in a while it bites me. I've learned to use git for everything and to put instructions in CLAUDE.md to always commit BEFORE writes. Claude can go off the rails on harder bug fixes; especially after multiple rounds of context compacting, it can really screw things up. It usually honors guidance not to modify anything outside the project, but a simple sandbox adds so much: after the session is over you can see what changed and decide what to do with it. That really helps with unexpected changes to the codebase, which you might not otherwise notice and which can introduce serious bugs. The permission models of all the coding agents are rough: either you can't get anything done, or you throw caution to the wind. Full sandboxes are quite restrictive, which is why I rolled my own. Honestly, your best option right now is just to have good version control and run coding agents in dedicated environments.
___timor___
throwayaw84330
It uses bubblewrap (no root needed) and only exposes ~/.cache stuff and the current folder (no git credentials, no ssh credentials, and as few permissions as feasible).
bubblewrap is a little bit more lightweight than Docker (afaiu no overlayfs, launches way faster), but has the same underlying mechanisms for security (Linux namespaces)
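A sketch of what such a bubblewrap invocation could look like, built as an argument list in Python. This is not the commenter's script: `bwrap_argv` is a hypothetical helper, and the choice to expose the rest of the filesystem read-only via `--ro-bind / /` is an assumption made to keep the example distro-independent. A tighter setup would bind only the system directories the agent needs.

```python
import os


def bwrap_argv(command: list[str], project_dir: str, home: str) -> list[str]:
    """Build a bwrap command line that hides the home directory except
    for ~/.cache and the project directory."""
    project = os.path.abspath(project_dir)
    cache = os.path.join(home, ".cache")
    return [
        "bwrap",
        "--unshare-all",        # fresh namespaces, including the network one
        "--share-net",          # keep the network so the agent can reach its API
        "--die-with-parent",    # kill the sandbox if the launcher exits
        "--ro-bind", "/", "/",  # assumption: rest of the system is read-only
        "--dev", "/dev",
        "--proc", "/proc",
        "--tmpfs", "/tmp",
        "--tmpfs", home,        # empty home: no ~/.ssh, no git credentials
        "--bind", cache, cache,      # writable cache (must already exist)
        "--bind", project, project,  # writable project directory
        "--chdir", project,
        *command,
    ]
```

The order matters: the tmpfs over the home directory comes first, and the cache and project binds are layered on top of it. The resulting list can be passed straight to `subprocess.run`.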
jomcgi
I wanted something like Claude code web with access to more models / local LLMs / my monorepo tooling, so far it's been great.
The output is a PR so it's hard for it to break anything.
The biggest benefit is probably that it makes it easier to start stuff when I'm out - feels like a much better use of downtime like I'm not waiting to get home to start a session after I have an idea.
The monorepo tooling is a big win too: for a bunch of things I have just one way to do it, plus clear instructions for the agents to use the binaries that get bundled into new sessions, so it gets things "right" more often.
yomismoaqui
I don't run Claude Code in YOLO mode, I just approve commands the first time I'm asked about them.
I've been using them since July and haven't had any problems with data loss, and the clanker has not tried to delete my $HOME.
notarobot123
Who'd have imagined remote code execution as a service would have caught on as much as it has!
stavros
Havoc
Keen to give firecracker another go though. Last I explored that it still felt pretty rough. (on UX not tech quality)
gl-prod
aussieguy1234
After a bit of tinkering I was able to get it to all run fine in Firejail, I wrote a guide here https://softwareengineeringstandard.com/2025/12/15/ai-agents...
Fairly basic: it limits the agent's write access to my projects, all of which are backed up in git.
techsystems
On step 2, it's only jailing VS Code. Shouldn't it also jail the Git repo you're working on (and disable `git push` somehow), as well as all the env libs?
Also, isn't the point of this to auto approve everything?