Prayer-Driven Development
The term “Prayer-Driven Development” was coined by Davor Pavleković for deployments where hope is your only strategy. That was before AI agents made it an industry standard.
Everyone Did Their Part. Nobody Saw the Whole.
You’ve been here. The prompt that needs “MANDATORY” in all caps. The instruction that says “DO NOT SKIP” because the model skipped it three times already. The system prompt that reads like a legal contract because you can’t trust the other party.
That’s not engineering. That’s negotiation. And you’re losing.
I’ve seen this pattern before - not with AI, but with teams.
A few years back, mid-COVID, I joined a company that had gone from academic proof-of-concept shop to enterprise production overnight. The parts were all there: management in the UK, scientists in Switzerland, developers in the EU, QA in India. Fully remote, fully fragmented. The UK wrote book-length requirements explaining how a cryptographic algorithm works - no use cases. Switzerland collected papers. The EU wrapped an age-old library and shipped it. India QA tested the wrapper and never touched anything a user would actually experience.
Everyone did their part. Nobody saw the whole.
A PM and I - both from enterprise backgrounds, opposite sides of the world - introduced BDD. The specs became scenarios describing what a user does, not how an algorithm works. For the first time, London, Zurich, Ljubljana, and Pune were reading the same document and understanding the same thing.
The shift wasn’t technical. It was: the person who owns the outcome could finally see what the system was supposed to do.
The agent world hasn’t made that jump yet.
What Actually Worked
Several teams, independently, landed on the same fix in the last year. Instead of letting the model call tools one at a time - decide, call, read the result, decide again, burn tokens at every step - they had it write a small program in a DSL (a domain-specific language - a stripped-down language designed for one job). The program runs in a simple interpreter. Deterministic. Verifiable. One pass instead of many.
One team measured it in production: a third of the tokens, half the latency, one model call instead of three. Another built a Clojure-like interpreter on the BEAM specifically for agent-generated code - sandboxed, process-isolated, no filesystem or network access by design. A Hacker News thread on the topic surfaced more practitioners who’d arrived at the same conclusion independently. The sample is small. The direction is consistent.
Why DSLs and not Python or JavaScript? Because the simpler the grammar, the fewer ways a model can produce something that looks right but isn’t. The teams that went furthest reached for s-expressions - the parenthesized notation from Lisp, where the syntax is so regular that the program structure and the data structure are the same thing (a property called homoiconicity). For a token predictor, that regularity means fewer failure modes.
The pattern: stop using the LLM as the runtime. Let it plan. Let something predictable execute.
```mermaid
flowchart LR
    A[LLM plans] --> B[DSL program]
    B --> C[Interpreter executes]
    C --> D[Human reviews output]
```
This works. It’s also the wrong conversation.
The Problem They’re Not Solving
Every one of those projects solves the engineering problem. None of them solve the ownership problem.
Who reads the DSL output? Who verifies the program the model wrote? Who approves it before it touches production?
An engineer. Always an engineer.
That’s fine if your entire organization is engineers. It’s not fine if the person who owns the process - who knows what should happen, who carries the responsibility when it breaks - can’t read what the AI proposed.
The agent problem isn’t prompting. It isn’t tooling. It’s that the people who own the processes can’t see what the AI is doing.
The DSL You Can See
Here’s a simple task: a customer message comes in. Check if it’s urgent. If yes, notify the team on Slack. If not, add it to the backlog.
As an s-expression:
```clojure
(let [msg     (tool/get_message)
      urgent? (tool/classify_urgency msg)]
  (if urgent?
    (tool/slack_notify {:channel "#alerts" :text (:body msg)})
    (tool/add_to_backlog {:item msg :priority "low"})))
```
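To make "deterministic, verifiable" concrete, here is a minimal sketch of an interpreter for a pre-parsed version of that program. Everything in it is illustrative: the tool implementations are stubs, and a real interpreter would also parse the s-expression text and enforce timeouts and resource limits.

```python
# Minimal sketch of a sandboxed evaluator: the model writes the program,
# this deterministic interpreter runs it against a whitelist of tools.
# All tool implementations are illustrative stubs.

TOOLS = {
    "tool/get_message": lambda: {":body": "Checkout is DOWN", ":id": 42},
    "tool/classify_urgency": lambda msg: "down" in msg[":body"].lower(),
    "tool/slack_notify": lambda args: ("slack", args[":channel"], args[":text"]),
    "tool/add_to_backlog": lambda args: ("backlog", args[":priority"]),
}

def evaluate(expr, env=None):
    env = {} if env is None else env
    if isinstance(expr, dict):                  # map literal: evaluate values
        return {k: evaluate(v, env) for k, v in expr.items()}
    if isinstance(expr, str):                   # symbol if bound, else literal
        return env.get(expr, expr)
    if not isinstance(expr, list):              # numbers, booleans, etc.
        return expr
    head, *rest = expr
    if head == "let":                           # (let [name expr ...] body)
        bindings, body = rest
        local = dict(env)
        for name, sub in zip(bindings[::2], bindings[1::2]):
            local[name] = evaluate(sub, local)
        return evaluate(body, local)
    if head == "if":                            # (if cond then else)
        cond, then, other = rest
        return evaluate(then if evaluate(cond, env) else other, env)
    if head.startswith(":"):                    # (:key map) keyword lookup
        return evaluate(rest[0], env)[head]
    if head in TOOLS:                           # whitelisted tools only
        return TOOLS[head](*(evaluate(a, env) for a in rest))
    raise ValueError(f"rejected form: {head}")  # everything else fails loudly

# The triage program above, pre-parsed into lists and dicts:
program = ["let", ["msg", ["tool/get_message"],
                   "urgent?", ["tool/classify_urgency", "msg"]],
           ["if", "urgent?",
            ["tool/slack_notify", {":channel": "#alerts",
                                   ":text": [":body", "msg"]}],
            ["tool/add_to_backlog", {":item": "msg", ":priority": "low"}]]]
```

Note the shape of the sandbox: there is no general function call, no filesystem, no network - a form is either `let`, `if`, a keyword lookup, or a whitelisted tool, and anything else is rejected.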
As Python:
```python
msg = get_message()
if classify_urgency(msg):
    slack_notify(channel="#alerts", text=msg["body"])
else:
    add_to_backlog(item=msg, priority="low")
```
As an n8n workflow:
Show all three to a director of operations. Ask which one they’d sign off on in a fifteen-minute call.
n8n workflows are JSON underneath - declarative, inspectable, deterministic once deployed. That JSON is a DSL. But you don’t read it. You see it as a visual flow: boxes, connections, logic you can follow with your eyes.
A workflow is a program. Not a prompt. Not a probability. A defined sequence with defined inputs, defined outputs, and defined failure modes. You set it up. It runs. It’s infrastructure.
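As a sketch of that idea - not n8n's actual JSON schema, which has more fields - here is the same triage flow as declarative workflow data plus a deterministic engine that walks it. The handlers are stubs:

```python
# Illustrative workflow-as-data in the spirit of n8n's JSON: nodes plus
# connections, executed by a small deterministic engine. The real n8n
# schema is richer; this keeps only the shape of the idea.

workflow = {
    "nodes": [
        {"name": "Get Message",      "type": "get_message"},
        {"name": "Classify Urgency", "type": "classify_urgency"},
        {"name": "Slack Notify",     "type": "slack_notify",
         "params": {"channel": "#alerts"}},
        {"name": "Add to Backlog",   "type": "add_to_backlog",
         "params": {"priority": "low"}},
    ],
    "connections": {
        "Get Message": ["Classify Urgency"],
        # Branch node: first target on True, second on False.
        "Classify Urgency": ["Slack Notify", "Add to Backlog"],
    },
}

HANDLERS = {  # stubs standing in for real integrations
    "get_message": lambda data, p: {"body": "Checkout is down"},
    "classify_urgency": lambda data, p: "down" in data["body"].lower(),
    "slack_notify": lambda data, p: ("slack", p["channel"], data["body"]),
    "add_to_backlog": lambda data, p: ("backlog", p["priority"], data["body"]),
}

def run(workflow):
    nodes = {n["name"]: n for n in workflow["nodes"]}
    name, data = workflow["nodes"][0]["name"], None
    while name:
        node = nodes[name]
        result = HANDLERS[node["type"]](data, node.get("params", {}))
        targets = workflow["connections"].get(name, [])
        if isinstance(result, bool) and len(targets) == 2:  # branch
            name = targets[0] if result else targets[1]     # data unchanged
        else:
            name, data = (targets[0] if targets else None), result
    return result
```

The point is that `workflow` is pure data: the owner of the process can read it (or see it rendered as boxes and arrows), and the engine that runs it has no improvisation in it.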
This scales cleanly to a point. A five-node triage flow is obvious. An eighty-node workflow with conditional branching and error handling is not - but it’s still more legible than the equivalent prompt chain, and n8n supports sub-workflows for exactly this reason: decompose until each piece is readable again.
And because it’s visual, it’s auditable by the person who owns the process - not just the person who built it. A team lead can look at an n8n workflow and understand what it does. Try that with a prompt chain. Try that with a Python agent. Try that with an s-expression.
n8n has an API. It has MCP support. An LLM can draft workflows - propose them, structure them, connect the nodes. Then a human looks at it. Visually. Understands it. Approves it. Deploys it.
The LLM never touches production. It’s the drafter, not the executor. The review step isn’t overhead. It’s the architecture.
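The draft-review-deploy gate can be sketched in a few lines. The function and node names here are hypothetical - this is not n8n's API, just the shape of the governance model:

```python
# Sketch of the review gate: an LLM-drafted workflow is just data until
# it passes validation and a named human approves it. All names are
# illustrative, not part of any real API.

ALLOWED_NODE_TYPES = {"get_message", "classify_urgency",
                      "slack_notify", "add_to_backlog"}

def validate_draft(draft):
    """Reject drafts that reference node types outside the whitelist."""
    bad = [n["type"] for n in draft["nodes"]
           if n["type"] not in ALLOWED_NODE_TYPES]
    if bad:
        raise ValueError(f"draft uses unapproved node types: {bad}")
    return draft

def deploy(draft, approved_by=None):
    """No recorded approval, no deployment - the gate is the architecture."""
    validate_draft(draft)
    if not approved_by:
        raise PermissionError("a named human must approve before deploy")
    return {"status": "deployed", "approved_by": approved_by}
```

The useful property is that both failure modes are loud: a draft with an unapproved node type never reaches review, and a valid draft never ships without a human's name attached to it.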
That’s not a convenience feature. That’s a governance model.
Full disclosure: I’m an n8n ambassador. I chose n8n before the title, for the reasons above.
TL;DR
- Prayer-Driven Development: your system prompt reads like a legal contract because you can’t trust the model to follow it. That’s not engineering - that’s negotiation.
- DSLs work: teams using s-expression DSLs for agent execution measured about a third of the tokens, half the latency, and one model call instead of three. Sandboxed interpreters on the BEAM take it further - no filesystem, no network, by design.
- The ownership gap: every DSL project solves the engineering problem. None solve the fact that the person who owns the process can’t read the output.
- Visual workflows are DSLs: n8n workflows are declarative JSON - inspectable, deterministic, auditable by non-engineers. The LLM drafts. The human reviews. Visually.
- Governance, not convenience: if the person responsible can’t verify what the AI proposed, you don’t have a deployment strategy. You have a prayer.
The hardest problem in AI deployment was never technical. It was: can the person responsible actually verify what’s happening? If they can’t see it, they can’t own it. If they can’t own it, you’re back to prayer.