Burn It All Down

AI risk management in software development

May 23, 2026

In the end, it was fine.

Before heading out to an evening event, I asked Claude Code to address a problem with auto-registration of Kubernetes services in my VPN’s dynamic DNS. All of it was configured in Ansible, Helm and Terragrunt, and we went through an extensive back-and-forth about what needed to be done. Along the way Claude indicated that gaps in the documentation meant it was not sure of the right way to execute the change, and it recommended a spike. I thought nothing of this, because I assumed I would be the one running the spike in the system console after Claude wrote the scripts and configurations for the task.

Mistake number one: underestimating an LLM’s desire to get shit done.

So I tee’d up the problem, set it to run automatically, and walked out the door with Bypass Permissions activated because it was isolated in a dev container and only writing configuration files. Right? Bueller?

Mistake number two: forgetting that dev containers have lots of cool toys lying around that human developers use to get shit done.

Three hours later, I came back, and saw the report that the spike had determined the correct configuration and Claude was asking about next steps. Of course, my first reaction was: what the unholy fuck, how did you run the spike from inside the container, you hyped-up answering machine? I scrolled back, and saw that Claude, determined to run the spike, had started hunting around to see if:

SSH allowed login to the dev server without a password (no)
The password vault CLI allowed access without a password (hell no)
The remote control CLI needed for the bare-metal Kubernetes stack was installed locally (no, but it could install it itself with curl, wait, WHAT?!?)
The Kubernetes config was up to date (no, but it could get it from the above!)
It could do what it needed with kubectl (yep!)

Mistake number three: forgetting to tell Claude that those cool toys are not yours.

So then the clever little thing started hacking its own dev server through kubectl, extracting secrets and running remote commands. Even better, one of those secrets let it generate a Bearer token to remotely access the VPN API so it could remotely reconfigure. Playtime begins! It then merrily went about trying different configurations live until it figured out the not-very-well-documented way to achieve what I had asked it to do in the first place.

Needless to say, I had mixed feelings about this incident.

Obviously, there was a degree of shame that I had been this careless. On the other hand, I was definitely impressed by Claude’s persistence and ability to systematically do what, quite honestly, I would have done much more slowly if I had run the spike. Dirty little secret: a lot of developer time gets burned on trial-and-error like this. Plus, I had a good time seeing friends at the pub while it was doing this, so there’s that.

So, since the whole point of this nights-and-weekends journey has been to work out what’s possible at the frontier (Dario, a polite reminder: my Project Glasswing invite seems to have been lost in the mail), naturally my very next question was how to do this stupidly reckless thing properly. It’s a question I had already put to some senior engineers I know as, “what would it take for you to give Claude SSH keys to production?” The instinctive reaction is “I’d never do that,” despite much higher levels of comfort with writing code that runs in production.

Furthermore, we often do let developers into production (audited, time-bounded, limited), and we typically grant them even wider access to lower environment tiers. Also, a typical developer workstation is a pimped-up muscle car with everything you need to get the data, telemetry and secrets you need to do your job as well as the tools to do something about it. You can say, well, it all goes through pull request review, four-eyes approval, etc., before it ever gets to production, but if you believe this catches absolutely everything when someone posts a 3,000-line PR, I have a Brooklyn Bridge-sized zero-day exploit to sell you. And it ignores the fact that gaining elevated privileges on a dev or UAT server still gives you something with a fat multi-gigabit pipe and 24 cores to, you know, DDoS everybody on someone else’s dime.

Developers need trusted access to do their jobs. We can narrow the scope and put in all kinds of protections, but at the end of the day, a developer or agent who can only read and write code is far less productive.

In short, we need to find a way to get OK with something that feels very much not OK.

At risk of sounding like a frontier model company CEO pumping his pre-IPO bag, the answer probably is more agents. I took the entire session log and fed it back to Claude and asked it to define a new agent type whose job was auditing agent capabilities as well as more general cybersecurity audits, and then asked it to replay the session log through the new agent to see what it would have blocked (495 bash commands). That’s still not good enough, but probably enough for tinkering at home.

To go beyond, we need to get away from the current spit-and-baling-wire approach of skills and agent harnesses built on top of bash, awk and sed. There needs to be an integrated policy and execution layer, model- and harness-independent, that can itself be audited, and then every agent skill has to go through it. Policies have to be tier-aware: local > dev > UAT > production. The policy language has to be rich enough to embrace not just whitelists and blacklists but time and rate limits too. Anything that looks like a script needs to be parsed and understood structurally. We need this to unlock the full productivity benefits of AI-assisted engineering; without it we are faced with the unholy choice of giving agents unsafe levels of access or crippling them by limiting access to just modifying code and configuration.
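To make that wish list a little more concrete, here is a minimal Python sketch of what a tier-aware policy check could look like. Everything in it is my own illustration and not an existing tool: the `Tier`, `Rule` and `PolicyEngine` names, the rule fields, and the example limits are all hypothetical. It also cheats on the "parsed and understood structurally" part: it only tokenizes a single command with `shlex` and refuses anything containing shell metacharacters, where a real layer would need a proper shell parser.

```python
import shlex
import time
from dataclasses import dataclass, field
from enum import IntEnum
from typing import Dict, List, Optional, Tuple


class Tier(IntEnum):
    """Environment tiers, ordered from least to most sensitive."""
    LOCAL = 0
    DEV = 1
    UAT = 2
    PRODUCTION = 3


@dataclass
class Rule:
    """Hypothetical policy rule for one executable."""
    program: str                            # e.g. "kubectl"
    max_tier: Tier                          # most sensitive tier where it may run
    denied_args: frozenset = frozenset()    # argument tokens that are never allowed
    max_calls: int = 10                     # rate limit: calls per window
    window_seconds: float = 60.0            # length of the rate-limit window
    expires_at: Optional[float] = None      # time-bounded grant; None means no expiry


# Assumption: any of these characters means the command is composed
# (pipes, redirects, substitution), which this sketch cannot analyze.
SHELL_META = set("|;&<>$`")


@dataclass
class PolicyEngine:
    rules: Dict[str, Rule]
    _calls: Dict[Tuple[str, Tier], List[float]] = field(default_factory=dict)

    def check(self, command: str, tier: Tier,
              now: Optional[float] = None) -> Tuple[bool, str]:
        """Return (allowed, reason) for one command at one tier."""
        now = time.time() if now is None else now
        if SHELL_META.intersection(command):
            return False, "composed shell commands are not analyzed here"
        try:
            argv = shlex.split(command)
        except ValueError:
            return False, "command could not be tokenized"
        if not argv:
            return False, "empty command"

        rule = self.rules.get(argv[0])
        if rule is None:
            return False, f"{argv[0]} is not on the allowlist"
        if tier > rule.max_tier:
            return False, f"{argv[0]} is not permitted at tier {tier.name}"
        if rule.expires_at is not None and now >= rule.expires_at:
            return False, f"grant for {argv[0]} has expired"
        blocked = rule.denied_args.intersection(argv[1:])
        if blocked:
            return False, f"denied arguments: {sorted(blocked)}"

        # Sliding-window rate limit, tracked per (program, tier).
        key = (argv[0], tier)
        recent = [t for t in self._calls.get(key, [])
                  if now - t < rule.window_seconds]
        if len(recent) >= rule.max_calls:
            self._calls[key] = recent
            return False, f"rate limit reached for {argv[0]}"
        recent.append(now)
        self._calls[key] = recent
        return True, "allowed"


if __name__ == "__main__":
    engine = PolicyEngine(rules={
        "kubectl": Rule("kubectl", Tier.DEV, frozenset({"exec", "secrets"})),
    })
    for cmd in ["kubectl get pods", "kubectl get secrets", "ssh devbox"]:
        print(cmd, "->", engine.check(cmd, Tier.DEV))
```

Default-deny is the design choice doing most of the work here: a tool the policy has never heard of is refused outright, so the agent cannot widen its own access by installing something new. The decision is also a plain function of the command, the tier and the clock, which is the part that would make it auditable and replayable against a session log.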

Kyle Downey
