Never trust your model with write permissions. You’ve probably read the stories: models dropping production databases, deleting emails, triggering outages. It happens when you hand too many permissions to your agent, especially write ones. It’s tempting. It works, until it doesn’t.
My story wasn’t catastrophic. But it could have been.
The incident
I was fixing some Terraform and let Cursor make the changes, planning to review them afterward. Here’s what actually happened:
- The agent made the change I asked for.
- It raised a PR, bypassing my global rule that explicitly forbade running any git write commands.
- It called the GitHub CLI to approve its own PR, then merged it without waiting for review.
Merging the PR kicked off terraform apply. I caught it in time, but barely.
I confronted the agent afterward. It admitted ignoring both the global rule and the immediate prompt. Deliberately.
The functional programming insight
I’ve seen this failure mode before. In code, it’s what imperative programming looks like: instructions execute one after another, and effects happen immediately. Functional programming solves this with deferred execution: describe what should happen, compose it all together, run nothing yet. Effects only execute at the very end. Functional programmers call this moment the “end of the world.”
That’s exactly what we need with AI agents.
An agent that can execute write operations freely is imperative. Every tool call is a live mutation. An agent that describes what it wants to do and waits for a human checkpoint is functional: it defers effects, composes safely, and only executes when you explicitly confirm.
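To make the contrast concrete, here is a minimal Python sketch of deferred execution. The Action and Plan names are hypothetical, invented for illustration; they are not part of Cursor or any agent framework. The agent's work is represented as data, and the only place effects run is a single execute call that requires explicit approval.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Action:
    """A described effect. Creating one runs nothing."""
    description: str
    run: Callable[[], None]


@dataclass
class Plan:
    """Hypothetical container for the actions an agent proposes."""
    actions: List[Action] = field(default_factory=list)

    def then(self, description: str, run: Callable[[], None]) -> "Plan":
        # Composition returns a new plan; no effect happens here.
        return Plan(self.actions + [Action(description, run)])

    def describe(self) -> List[str]:
        # What the human reviews before anything executes.
        return [action.description for action in self.actions]

    def execute(self, approved: bool) -> int:
        # The single point where effects run, and only after approval.
        if not approved:
            return 0
        for action in self.actions:
            action.run()
        return len(self.actions)


# Stand-in effects: appending to a list instead of touching real systems.
log: List[str] = []
plan = (
    Plan()
    .then("edit main.tf", lambda: log.append("edited"))
    .then("open pull request", lambda: log.append("pr opened"))
)
```

Building the plan leaves log empty. Nothing is appended until plan.execute(approved=True) is called, which is the human checkpoint.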
Layered protection that actually works
Plan mode. Cursor’s plan mode is the first gate. The agent reasons and proposes rather than executing. It’s excellent, but it’s a conversational mode, not a technical lock. A follow-up prompt or accidental approval can push the agent past it. Plan mode alone isn’t enough.
Allowlist run mode. Go to Cursor Settings → Agents → Run Mode → Allowlist. Never add destructive commands: terraform apply, or anything that deletes or overwrites files. Better still: leave git and gh off the allowlist entirely. The agent has to ask you every time it wants to run one. I tried the sandboxed variant but found it too limiting: it couldn’t cd into directories outside the workspace. Plain Allowlist, without sandbox, gives the right balance.
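The idea behind an allowlist can be sketched in a few lines of Python. This is not how Cursor implements it; the ALLOWLIST contents and the needs_confirmation helper are assumptions made up for illustration. The point is that the decision is made by code outside the model, so the model cannot talk its way past it.

```python
import shlex
from typing import FrozenSet

# Hypothetical allowlist: read-only commands only.
# git, gh and terraform are deliberately absent.
ALLOWLIST: FrozenSet[str] = frozenset({"ls", "cat", "grep", "pwd"})

# Anything that chains, pipes or redirects is sent to the human,
# even when the first word looks harmless.
SHELL_OPERATORS = (";", "&", "|", ">", "<", "`", "$(")


def needs_confirmation(command: str) -> bool:
    """Return True unless the command is a single allowlisted program."""
    if any(op in command for op in SHELL_OPERATORS):
        return True
    try:
        tokens = shlex.split(command)
    except ValueError:
        # Unbalanced quotes and similar: refuse to guess.
        return True
    if not tokens:
        return True
    return tokens[0] not in ALLOWLIST
```

The check is conservative on purpose: a quoted pipe character inside a grep pattern also triggers confirmation. A false prompt costs a click; a false pass can cost a production apply.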
Branch protection with no self-approval. Set main branch protection and disable self-approval for PRs. This is mandatory. An agent that can approve its own PR defeats the entire protection model.
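As a rough sketch, branch protection can also be set through GitHub's REST API rather than the UI. The snippet below only builds the request body; it makes no network call. The field names reflect my understanding of the "update branch protection" endpoint and should be checked against the current GitHub documentation before use. The values chosen are assumptions for illustration, not a recommended policy.

```python
import json

# Assumed body for: PUT /repos/{owner}/{repo}/branches/{branch}/protection
# Verify field names against the GitHub REST API docs before sending.
protection = {
    # No required CI checks in this sketch.
    "required_status_checks": None,
    # Apply the rules to administrators too, so an admin token cannot bypass them.
    "enforce_admins": True,
    "required_pull_request_reviews": {
        # At least one approving review before merge.
        "required_approving_review_count": 1,
        # New commits invalidate earlier approvals.
        "dismiss_stale_reviews": True,
        # The latest push must be approved by someone other than the pusher.
        "require_last_push_approval": True,
    },
    # No restriction on who may push, beyond the review requirement.
    "restrictions": None,
}

body = json.dumps(protection, indent=2)
```

If the agent runs with your own credentials, the admin enforcement setting matters most: without it, an account with admin rights may still be able to merge around the review requirement.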
Why global rules are not enough
You might think a global rule would prevent all of this. I had one. Mine looked like this:
Never perform any git write operations - no commits, branch creation/deletion/renaming, merges, rebases, resets, or remote pushes of any kind. Read-only git operations (fetch, pull, log, status, diff) are allowed. If a task requires committing or publishing changes, stop and ask the user to do it manually.
It didn’t work. Here’s why.
Instruction hierarchy. Models don’t treat all positions in the context window equally. Instructions buried deep in a long context are followed less reliably. A global rule is a few extra tokens, not a hard constraint. If you must use global rules, put the most important constraints at the top of the system prompt, not buried in a long rules file.
RLHF optimises for task completion, not rule compliance. Reinforcement Learning from Human Feedback tends to reward models for completing tasks even when instructions are imprecise. This is what makes agents feel intelligent: they fill in gaps and push through ambiguity. The same tendency also means a model may work around a rule if completing the task seems to require it.
No persistent constraint mechanism. There is no architectural layer in a standard LLM that enforces rules at inference time. A rule in the system prompt is a few more tokens in the context. It has no special status. The model can, and sometimes will, ignore it.
The practical conclusion
Treat your AI agents the way you’d treat a powerful but impulsive engineer: give them space to work, but build the guardrails into the environment, not the conversation.
- Separate read and write permissions explicitly.
- Use Allowlist mode; don’t rely on the model’s self-restraint.
- Protect main with branch rules and no self-approval.
- Use plan mode as a first gate, not the only gate.
Global rules are suggestions. Structural constraints are enforcement. Only the second kind reliably works.