Hacker News
Coding Agent VMs on NixOS with Microvm.nix
the_harpia_io
The execution sandbox stops the agent from breaking out during development, but the real risk is what gets shipped downstream. Seeing more tools now that scan the generated code itself, not just contain the execution environment.
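As a toy illustration of what such code scanning might look for (the patterns and function name here are hypothetical, not any specific tool's API; real scanners do far richer semantic analysis than regexes over source text):

```python
import re

# Illustrative risky patterns only -- a real scanner would parse the
# AST and track dataflow, not just grep the source.
RISKY_PATTERNS = {
    "subprocess with shell=True": re.compile(r"shell\s*=\s*True"),
    "dynamic code execution": re.compile(r"\b(eval|exec)\s*\("),
    "outbound network call": re.compile(r"\b(urlopen|requests\.(get|post))\s*\("),
}

def scan_generated_code(source: str) -> list[str]:
    """Return the names of risky patterns found in generated source."""
    return [name for name, pat in RISKY_PATTERNS.items() if pat.search(source)]

findings = scan_generated_code("import os\nexec(payload)\n")
```

The point is that this kind of check inspects the artifact itself, complementing (not replacing) the execution sandbox.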
nh2
The goal of such sandboxing is that you can allow the agent to freely write/execute/test code during development, so that it can propose a solution/commit without the human having to approve every dangerous step ("write a Python file, then execute it" is already a dangerous step). As the post says: "To safely run a coding agent without review".
You would then review the code, and use it if it's good. Turning many small reviews where you need to be around and babysit every step into a single review at the end.
What you seem to be asking for (shipping the generated code to production without review) is a completely different goal and probably a bad idea.
If there really were a tool that can "scan the generated code" so reliably that it is safe to ship without human review, then that could just be part of the tool that generates the code in the first place so that no code scanning would be necessary. Sandboxing wouldn't be necessary either then. So then sandboxing wouldn't be "half the picture"; it would be unnecessary entirely, and your statement simplifies to "if we could auto-generate perfect code, we wouldn't need any of this".
ryanrasti
Sandboxes provide a default-deny policy, which is the right starting point. But current tools lack the primitives to make fine-grained data access and data policies a reality.
Object-capabilities provide the primitive for fine-grained access; IFC (information flow control) provides it for dataflow.
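A minimal sketch of the object-capability idea (the class and names are hypothetical, not from any particular framework): instead of giving the agent ambient filesystem access, you hand it an object that can do exactly one narrow thing, and its authority is exactly the reference it holds.

```python
import pathlib

class ReadOnlyDir:
    """A capability granting read access to one directory and nothing else.

    The holder cannot write, cannot escape the directory, and cannot
    forge broader access -- it only has the methods on this object.
    """
    def __init__(self, root: str):
        self._root = pathlib.Path(root).resolve()

    def read(self, name: str) -> str:
        path = (self._root / name).resolve()
        # Reject paths that resolve outside the granted directory.
        if self._root not in path.parents and path != self._root:
            raise PermissionError(f"{name} is outside the granted directory")
        return path.read_text()

# The agent receives only this object, never open() or os.* directly:
docs = ReadOnlyDir("/tmp")
```

In a capability discipline, "can the agent touch X?" is answered by "was it ever passed a reference to X?", which is what makes fine-grained policy auditable.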
mystifyingpoi
See, my typical execution environment is a Linux VM or laptop, with a wide variety of SSH and AWS keys configured and ready to be stolen (even if they are temporary, it's enough to infiltrate prod or do some sneaky lateral movement). A typical application execution environment, on the other hand, is an IAM user/role with strictly scoped permissions.
rootnod3
Is that really where we are at? Just outsource convenience to a few big players that can afford the hardware? Just to save on typing and god forbid…thinking?
“Sorry boss, I can’t write code because cloudflare is down.”
Cyph0n
Generally speaking, once you have a working NixOS config, incremental changes become trivial, safe, and easy to roll back.
0xcb0
But the one-time setup seems like a fair investment for more secure development. Of course, it won't help with the problem of malicious code reaching production. But with a little overhead, I think it really makes local development much more secure.
And you can automate a lot of it. And it will finally be my chance to get more into NixOS :D
NJL3000
https://github.com/5L-Labs/amp_in_a_box
I was going to add Gemini / OpenCode Kilo next.
There is some upfront cost to define what endpoints to map inside, but it definitely adds a veneer of preventing the crazy…
heliumtera
Without nix I mean
Cyph0n
An alternative is to “infect” a VM running in whatever cloud and convert it into a NixOS VM in-place: https://github.com/nix-community/nixos-anywhere
In fact, it is common practice to use nixos-anywhere to install NixOS on new machines: you boot into a live USB with SSH enabled, then use nixos-anywhere to install NixOS and partition the disks via disko. Here is an example I used recently to provision a new gaming desktop:
    nix run github:nix-community/nixos-anywhere -- \
      --flake .#myflake \
      --target-host user@192.168.0.100 \
      --generate-hardware-config nixos-generate-config ./hosts/kelibia/hardware-configuration.nix
At the end of this invocation, you end up with a NixOS machine running your config.
alexzenla
The performance of gVisor is often a big limiting factor in deployment.
secure
I’m curious what gVisor is getting you in your setup — of course gVisor is good for running untrusted code, but would you say that gVisor prevents issues that would otherwise let the agent break out of the Kubernetes pod? Do you have examples you’ve observed where gVisor saved the day?
zeroxfe
The huge gVisor drawback is that it *drastically* slows down applications (despite startup time being faster).
For agents, the startup time latency is less of an issue than the runtime cost, so microvms perform a lot better. If you're doing this in kube, then there's a bunch of other challenges to deal with if you want standard k8s features, but if you're just looking for isolated sandboxes for agents, microvms work really well.
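To make the runtime-cost point concrete, here is a hedged micro-sketch of a syscall-bound workload. It only measures native syscall cost; the point is that under a syscall-interposing runtime like gVisor every one of these calls would additionally cross into the Sentry, whereas in a microVM they hit a real guest kernel directly. Numbers will vary wildly by machine, so no figures are claimed here.

```python
import os
import time

def syscall_heavy(n: int) -> float:
    """Time n os.stat() calls -- roughly one syscall per iteration.

    Workloads dominated by this kind of loop are where syscall
    interposition hurts most; compute-bound loops are largely unaffected.
    """
    start = time.perf_counter()
    for _ in range(n):
        os.stat(".")  # stat(2) on the current directory
    return time.perf_counter() - start

elapsed = syscall_heavy(10_000)
print(f"{elapsed:.4f}s for 10000 stat calls")
```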
alexzenla
The middle ground we've built is that a real Linux kernel interfaces with your application in the VM (we call it a zone), but that kernel then can make specialized and specific interface calls to the host system.
For example, with NVIDIA on gVisor, the ioctl()s are passed through directly, so an NVIDIA driver vulnerability that causes memory corruption leads directly to corruption in the host kernel. With our platform at Edera (https://edera.dev), the NVIDIA driver runs in the VM itself, so a memory corruption bug doesn't percolate to other systems.