Hacker News
Launch HN: Minicor (YC P26) – Windows desktop automations at scale
We were working on non-RPA integrations when a customer promised to sign a deal in 2 days if we could unblock one of their sales, which required integrating with a clinic’s Windows-based medical record system. We didn’t know it at the time, but building desktop RPAs at scale turns out to be extremely difficult: scripting is hard (learning the system, defining the automation, UIs that change constantly), orchestration is hard (is the VM up? queuing, parallelizing), and debugging is hard (zero observability, false positives, cascading failures). Failure rates of 30% or more are not uncommon. At scale, we’ve seen failed RPAs lead to thousands of support tickets a month.
To solve the problems we were facing, we built an MCP that Claude Code/Codex can use to navigate a virtual machine running desktop software and create RPA workflows in Python. The RPA workflows run as Python scripts for speed, cost, and determinism. They can be triggered via API with whatever input/output schema you specify, and video replays and logs are stored with each run. The MCP can debug RPAs and make changes to the underlying code, all of which are version controlled. We also built tools for cloning VMs to parallelize RPAs and for handling 2FA/OTP challenges. Plus, since workflows are code-based, we were also able to add Slack notification triggers, human-in-the-loop steps, and calls to an LLM that verifies the state of a VM from a screenshot.
Would love to hear your feedback and if you have any RPA horror stories! (:
polonbike
throw03172019
throw03172019
fchishtie
1. RPA code breaks (e.g. throws an exception if a window does not exist)
2. RPA reports success but was clicking / typing in the wrong place
3. Underlying system breaks (virtual machine / legacy software)
The skill in our MCP builds the RPA code to throw exceptions wherever possible, so an LLM can understand the context and recover.
To avoid false success states, we add LLM vision steps to the workflow itself that error out if the system is in the wrong state.
And when the underlying system breaks, the fix can be as simple as a cron job that checks the status of the process / the health of the VM and runs a script to reboot the system.
It depends on the system, but the pattern we've seen with RPAs is that you can catch maybe 80% of the edge cases in the first week after rollout.
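A minimal sketch of the first two ideas above (fail loudly, then verify the screen before reporting success), not the actual Minicor implementation. The names `StepError`, `Step`, and `run_workflow` are hypothetical, and `verify` stands in for whatever check you use, such as an LLM vision call on a screenshot:

```python
from dataclasses import dataclass
from typing import Callable, List


class StepError(Exception):
    """Raised when a step fails or cannot confirm the expected UI state."""

    def __init__(self, step: str, reason: str):
        super().__init__(f"{step}: {reason}")
        self.step = step
        self.reason = reason


@dataclass
class Step:
    name: str
    action: Callable[[], None]  # performs the click / keystroke (assumed helper)
    verify: Callable[[], bool]  # assumed check, e.g. an LLM vision call on a screenshot


def run_workflow(steps: List[Step]) -> List[str]:
    """Run steps in order and stop at the first failure or failed verification."""
    completed = []
    for step in steps:
        try:
            step.action()
        except Exception as exc:
            # Wrap low-level errors with the step name so an agent has context.
            raise StepError(step.name, f"action failed: {exc}") from exc
        if not step.verify():
            # Guards against "success" while clicking in the wrong place.
            raise StepError(step.name, "screen is not in the expected state")
        completed.append(step.name)
    return completed
```

Because every failure carries the step name and a reason, whoever picks up the error (a human or an agent) knows where the run stopped rather than just that it stopped.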
dragonsenseiguy
ilundin
fchishtie
ilundin
a-dub
fchishtie
Writing RPA code used to take a long time. Using AI (and its infinite patience), we can write more durable code that covers more edge cases.
And since they’re code-based, it’s pretty straightforward to have agents monitor them and update their code when the underlying system gets upgraded, etc.
For observability, we have workflow execution logs that store text, videos, and screenshots so an agent or a human can debug them, plus lots and lots of webhooks when things break! (:
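A rough sketch of what a per-run execution log and a failure webhook body could look like, assuming a simple in-memory record. `RunLog`, its fields, and the payload shape are made up for illustration and are not our actual schema:

```python
import json
import time
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class RunLog:
    run_id: str
    events: List[dict] = field(default_factory=list)

    def record(self, step: str, message: str, screenshot: Optional[str] = None) -> None:
        """Append one log entry; screenshot is a path or URL, if one was taken."""
        self.events.append(
            {"ts": time.time(), "step": step, "message": message, "screenshot": screenshot}
        )

    def failure_payload(self, error: Exception) -> str:
        """Serialize the run as a JSON body that a webhook could send on failure."""
        return json.dumps(
            {
                "run_id": self.run_id,
                "status": "failed",
                "error": str(error),
                "events": self.events,
            }
        )
```

Keeping the events in order with their screenshots means the same payload works for a human reading it and for an agent trying to debug the run.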
theaniketmaurya
fchishtie
throw03172019
mingabunga
snozolli
I think you meant premises.
throw03172019
snozolli
I'm not suggesting that you correct your customers, but there's no reason to sink to the lowest common denominator when writing.