Hacker News
An Update on TinyKVM
swiftcoder
deivid
See: https://en.wikipedia.org/wiki/Kernel-based_Virtual_Machine
radeeyate
3eb7988a1663
Is there any way that TinyKVM + KVM Server could ever be made to work with a GUI program? The sandboxing performance seems free and possibly safer than other solutions.
Instead of firejail or bubblewrap, would it ever be possible for me to wrap, say, Firefox (or a much less complicated GUI program) inside of TinyKVM and restrict it to just network access and reading/writing to ~/Downloads? Likely a way more ambitious target than you had ever imagined, but I can dream.
I am wondering if I could default wrap every command on my terminal to run inside a TinyKVM, no network access, and only permissions to the current directory or below.
jchw
It might be easier to adapt gVisor to handle this sort of workload. Adjacent comment mentions Qubes which does the same thing but uses an entire guest kernel.
(If you are creative enough, you can probably come up with some solutions. Qt apps could be made to work with a custom QPA that can somehow funnel information in and out of the sandbox. You could definitely run something like Waypipe or Xpra in the sandbox too, but again I imagine those would wind up requiring a much greater degree of emulation. It's not like I've actually tried this, though, so I could be off.)
laurencerowe
Running syscalls on the host means there is approximately 1µs of overhead per syscall from exiting and re-entering KVM, so I'm not sure how well that would work for GUI applications.
And we currently only have very rudimentary support for threads, enough for a server program with ancillary threads to boot up but the expectation is currently that the call into TinyKVM only runs a single thread and we fork multiple copies of the VM to handle requests in parallel.
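To make the "one thread per call, fork copies for parallelism" idea concrete, here is a minimal sketch in Python. It uses plain `os.fork` on host processes as a stand-in for forking a VM, so it only illustrates the shape of the model and is not how TinyKVM itself is implemented. The names `fork_workers` and `warm_state` are made up for this example.

```python
import os

def fork_workers(warm_state: dict, requests: list[str]) -> list[str]:
    """Handle each request in its own forked copy of the warmed-up parent."""
    children = []
    for request in requests:
        read_end, write_end = os.pipe()
        pid = os.fork()
        if pid == 0:
            # Child: a single-threaded copy of the parent's state.
            try:
                os.close(read_end)
                warm_state["handled"] += 1  # visible only inside this copy
                reply = f"{request}: handled={warm_state['handled']}"
                os.write(write_end, reply.encode())
            finally:
                os._exit(0)
        # Parent: keep the read end and move on, so the copies run in parallel.
        os.close(write_end)
        children.append((pid, read_end))

    replies = []
    for pid, read_end in children:
        with os.fdopen(read_end, "rb") as pipe:
            replies.append(pipe.read().decode())
        os.waitpid(pid, 0)
    return replies

state = {"handled": 0}
replies = fork_workers(state, ["a", "b", "c"])
```

Every copy reports `handled=1` and the parent's counter stays at 0, because each fork starts from the same warmed-up state and its changes never flow back.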
jchw
That made me rather curious how many syscalls a complex GUI application might issue. I wanted to see how many syscalls were happening across my entire system. Thanks to StackOverflow I have a snippet that seems correct[1]:
> perf stat -e raw_syscalls:sys_enter -a -I 1000 sleep 5
Using this, it seems that most programs (as you would probably guess) don't execute a whole lot of syscalls when they're idle. However, starting a complex GUI program definitely causes a pretty massive flurry of syscalls. Starting winecfg without an already-existing wineserver spews a lot of syscalls, somewhere in the neighborhood of 500,000. If we assume that each syscall takes on average around 2µs including the overhead and that they're all serial, I guess that would add up to about 1 second spent on syscalls. That's probably making way too many assumptions, but it does make me feel like it's not completely infeasible to run GUI applications inside of a sandbox like this, though it may very well not be compelling when the overhead is factored in.
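As a sanity check on that back-of-envelope estimate, here is the arithmetic as a small Python sketch. The 500,000 count and the 2µs average are the rough figures from this comment, not measurements of TinyKVM, and the function name is made up for illustration.

```python
def syscall_seconds(n_syscalls: int, per_syscall_us: float) -> float:
    """Total time in seconds, assuming every syscall runs serially."""
    return n_syscalls * per_syscall_us / 1_000_000

# Rough figures from the comment: ~500,000 syscalls at ~2µs each.
startup_total = syscall_seconds(500_000, 2.0)

# Only the ~1µs KVM exit/entry overhead mentioned upthread.
startup_overhead = syscall_seconds(500_000, 1.0)
```

Under those assumptions the total comes to about 1 second, of which roughly half a second would be the KVM exit/entry overhead itself.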
And of course, just because it could be done does not mean it should, anyway. Even if this is a good idea, I doubt it makes any sense for TinyKVM to be attempting to do it. What TinyKVM does do is already very interesting and probably a lot more practical anyways. It'd probably be better to fork off or build an entire purpose-built sandbox for GUI software, realistically.
Still, pretty interesting stuff to think about.
> And we currently only have very rudimentary support for threads, enough for a server program with ancillary threads to boot up but the expectation is currently that the call into TinyKVM only runs a single thread and we fork multiple copies of the VM to handle requests in parallel.
BTW, I think this design is really cool. This is something I have wanted to exist for a while, even though I don't practically need it.
sheepscreek
munchlax
time nice distcc ccache gmake
I do this with other tools as well. bwrap, chroot, env, setpriv, xchpst, etc. They all stack.
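For anyone wondering why these stack: each wrapper changes something about its own process (priority, environment, namespaces) and then execs the rest of its argument list, so the next tool inherits the change. A minimal Python sketch using only `nice`, `env` and `sh`; the helper name `run_stacked` and the variable `SANDBOX_DEMO` are made up, and none of this is specific to TinyKVM.

```python
import subprocess

def run_stacked(*layers: list[str]) -> str:
    """Concatenate wrapper command lines into one argv and run it."""
    argv = [arg for layer in layers for arg in layer]
    done = subprocess.run(argv, capture_output=True, text=True, check=True)
    return done.stdout.strip()

# nice lowers priority, then execs env, which sets a variable and execs sh.
output = run_stacked(
    ["nice", "-n", "5"],
    ["env", "SANDBOX_DEMO=1"],
    ["sh", "-c", "echo $SANDBOX_DEMO"],
)
```

The innermost command sees the variable set by `env` while still running at the priority set by `nice`, which is the same composition that lets bwrap, setpriv and friends be chained.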
3eb7988a1663
I instead lean on heavyweight VMs, but would love something like this which should be a hard security boundary for little cost.
wmf
3eb7988a1663
A security/isolation layer like this I could use for free feels like it would get me so close to the Qubes ideal without having to completely change how I interface with my machine.
pgaddict
rolandog
[0]: https://www.futurile.net/2023/04/29/guix-shell-virtual-envir...
jchw
What I think we're really after though is something like gVisor, where the guest program is completely isolated from the host kernel, and the daemons that allow the guest program to reach the outside world are themselves highly locked down by the host kernel using technologies like seccomp-bpf and namespacing, on top of whatever constraints and validation they apply on their own. While nothing is foolproof, this feels like, if done carefully, it would give you a very good layer of isolation that would be extremely challenging to bypass. I reckon that the sandbox would cease to be the most interesting attack target in a system like gVisor, since in any complicated system, there will probably always be some lower-hanging fruit. (And of course, TinyKVM seems to be basically in the same wheelhouse. None of these solutions are designed to run GUI software, though I reckon it probably could be made to work.)
munchlax
I think it should be possible to pass /dev/kvm as an open fd to daemons like KVM Server and mark it as non-inheritable. As long as the VM is in a subprocess, it would be okay, I guess.
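A rough sketch of that fd-passing idea in Python, under some loud assumptions: `/dev/null` stands in for `/dev/kvm` so it runs on machines without KVM, the "daemon" is just a child Python process, and the helper names are made up. It only shows the mechanism (non-inheritable by default, handed over explicitly), not anything about how KVM Server actually starts.

```python
import os
import subprocess
import sys

# Child script: report whether the fd number given in argv is open.
CHECK_FD = """
import os, sys
try:
    os.fstat(int(sys.argv[1]))
    print("open")
except OSError:
    print("closed")
"""

def open_device(path: str = "/dev/null") -> int:
    """Open the device and mark the fd non-inheritable (close-on-exec)."""
    fd = os.open(path, os.O_RDWR)
    os.set_inheritable(fd, False)
    return fd

def child_view(fd: int, hand_over: bool) -> str:
    """Spawn a child and ask it whether it can see the fd."""
    done = subprocess.run(
        [sys.executable, "-c", CHECK_FD, str(fd)],
        pass_fds=(fd,) if hand_over else (),  # explicit hand-over only
        capture_output=True, text=True, check=True,
    )
    return done.stdout.strip()

device_fd = open_device()
with_fd = child_view(device_fd, hand_over=True)
without_fd = child_view(device_fd, hand_over=False)
os.close(device_fd)
```

The point of the design is that only the process that was explicitly handed the fd can use the device; anything it spawns later does not get it unless it is passed on deliberately.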
mattbee
laurencerowe
TinyKVM: Fast sandbox that runs on top of Varnish - https://news.ycombinator.com/item?id=43358980
Deno Under TinyKVM in Varnish - https://news.ycombinator.com/item?id=43650792
dinobones
I was confusing it with TinyPilot, a hardware KVM made by indie hacker Michael Lynch, which I think has since been acquired.
laurencerowe
Having each request start from the exact same program state should make reproducing and fixing production issues easier. In a way it combines the predictability of the CGI programming model with the speed of a warmed modern JIT runtime.
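The programming model can be sketched in a few lines of Python. This is only an analogy: `copy.deepcopy` of a dict stands in for resetting guest memory to a snapshot, and the class and field names are made up for the example.

```python
import copy

class ResettableWorker:
    """Warm up once, then roll back to that exact state after every request."""

    def __init__(self, warm_state: dict):
        self._pristine = copy.deepcopy(warm_state)  # stand-in for a VM snapshot
        self.state = copy.deepcopy(warm_state)

    def handle(self, request: str) -> str:
        # The handler is free to mutate state while serving the request...
        self.state["seen"].append(request)
        reply = f"{request} (requests seen: {len(self.state['seen'])})"
        # ...but nothing survives into the next request.
        self.reset()
        return reply

    def reset(self) -> None:
        self.state = copy.deepcopy(self._pristine)

worker = ResettableWorker({"seen": []})
first = worker.handle("GET /")
second = worker.handle("GET /")
```

Both calls return the same reply, since the second request cannot observe anything the first one did. That is what makes a production failure replayable from the request alone.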
skybrian
sterlinm
nl
laurencerowe
This minimises memory usage and lets us track file descriptors, which lets us very quickly reset the guest process (under 100µs for Deno).