
The boring layer does the real work

Businesses keep asking for an AI that does the whole job. The job usually has a part that has to come out the same every time, and that part is what AI is worst at. Build the plain, rule-based layer first. Then let the AI read it.

By James Dodd

There's a kind of work that earns its keep by being boring. Same input, same output, every time. A defined check with a defined answer. Run it today, run it next quarter, the result has to match for the same data, or the check isn't worth running. It's also the kind of work people most want to hand to an AI. Which is work AI does badly, expensively, and inconsistently. A script solved this thirty years ago, runs on nothing, and gives the same answer every time. So why does the request keep coming? Because AI is visible. It's in every pitch deck, every vendor demo, every board briefing. A rule-based script isn't. It's been sitting quietly in the accounts team's spreadsheet for a decade, and nobody wrote a think-piece about it.

A woman in a grey knit jumper and black scarf tilts her head back with a hardcover book resting face-down over her eyes.
The useful work often looks like nothing much is happening. (Valeria Lendel / Unsplash)

Two different kinds of tool

In ordinary conversation, we call both of these things "software", and the distinction gets lost.

The first kind is a rule-based script. It does exactly what it was told to do, in exactly the same order, every time. You write down the rules once (open this file, check this list, add up these columns, write the result to that spreadsheet), and from then on, the same input produces the same output. A payroll calculation is a script. A VAT return worksheet is a script. Your bank's interest calculation is a script. They're boring on purpose. Boring means you can check them, audit them, and trust them.

The second kind is a generative AI model, the thing behind ChatGPT, Claude, Copilot. Instead of following fixed rules, it's been trained on a vast amount of writing and learned to produce plausible, fluent answers to almost anything you ask. That's why it feels like magic. It's also why, if you ask it the same question twice, you can get two different answers. Both good. Both reasonable. Neither guaranteed to match. That variation isn't a fault. It's how the thing works.

The trouble starts when we use the second kind to do the job of the first.
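For the sceptical reader, the first kind really is this small. A minimal sketch, with made-up field names, of the sort of check a VAT worksheet might run: fixed rules, and the same rows always produce the same totals.

```python
# A sketch of the first kind of tool: a rule-based check.
# The rules are written down once; the same input always gives the same output.
# Field names ("net_pence", "vat_pence") are illustrative, not a real system.

def vat_check(rows):
    """Sum the net and VAT columns and verify VAT is 20% of net, to the penny."""
    net = sum(r["net_pence"] for r in rows)
    vat = sum(r["vat_pence"] for r in rows)
    expected_vat = round(net * 0.20)
    return {"net_pence": net, "vat_pence": vat, "matches": vat == expected_vat}

rows = [
    {"net_pence": 10000, "vat_pence": 2000},
    {"net_pence": 2550, "vat_pence": 510},
]
print(vat_check(rows))  # identical output on every run, this quarter and next
```

Run it today or next quarter against the same data and the answer cannot drift, which is the whole point.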

Ask an AI to summarise a security report twice, and you get two good summaries. Each one reasonable. Each one subtly different. One emphasises an access-control issue, the other leads with an outdated library. For a one-off read, that's fine. For a check you rerun every quarter and hold yesterday's numbers up against today's, it's a mess. You can't tell whether something has changed in the business, or whether the AI just felt like describing it differently this time.

There's a subtler version of the same problem sitting underneath. If you let the AI do the gathering as well as the writing, the gathering is variable too. The AI decides which list of your software to look at. The AI decides how to interpret a fiddly edge case. The AI writes the little bit of code that pulls the data, on the fly, a fresh copy every run. Next quarter, asked the same question, it might make slightly different choices. The answer looks consistent because the AI writes well. The work underneath isn't.

This is where the expensive mistakes happen. A team builds the whole thing with a single AI agent. It reports smoothly for a month. Six months in, someone spots a number that can't be right, and there's no paper trail to walk back through. The AI has been quietly filling in gaps the whole time. Nobody can prove what it did, because it didn't do the same thing twice.

Variation is a feature of AI and a bug in a check.

The fix is boring

Write the checking bit as an old-fashioned, rule-based script. Same input, same output, every run. Keep a record of every version, so you can see exactly what it did on any given date. Log what it looked at and what it returned. If it breaks, it breaks loudly, which is the only kind of breaking you want.

Then, and only then, hand the output to the AI. Ask it to read the results. Summarise the patterns. Flag what looks concerning. Draft the covering email to the board. That's what AI is genuinely good at: turning a table of numbers into plain language a human can read on a Monday morning. It isn't guessing at what the numbers are. The numbers are in front of it.
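In code, that shape might look like this. A hypothetical software-patching check (the version string and field names are invented for illustration): versioned, logged, and loud when something is missing. Only the finished table would be handed to an AI for the write-up.

```python
# Sketch of the boring layer: a versioned, logged, loudly-failing check.
# Everything here is illustrative; the shape, not the details, is the point.
import json
import logging

SCRIPT_VERSION = "patch-check-v1.3"  # recorded with every run, so any date's output is traceable

logging.basicConfig(level=logging.INFO)
log = logging.getLogger(SCRIPT_VERSION)

def run_check(inventory):
    """Same input, same output, every run. Gaps raise errors instead of being guessed at."""
    results = []
    for item in inventory:
        if "last_patched" not in item:
            # Break loudly rather than let a gap get quietly filled in.
            raise ValueError(f"missing last_patched for {item['name']}")
        results.append({"name": item["name"], "patched": item["last_patched"]})
    log.info("checked %d items with %s", len(results), SCRIPT_VERSION)
    return results

inventory = [{"name": "webapp", "last_patched": "2025-01-10"}]
table = run_check(inventory)
print(json.dumps(table))  # this table, not the raw systems, is what the AI gets to read
```

The AI's job starts after the `print`: it reads a finished table and drafts the prose, so any variation lives in the wording, never in the numbers.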

A worked example

We're doing a version of this, at a larger scale, with a UK-based educational charity and awarding organisation. They have a lot of processes that they want AI to help run. Marking, moderation, quality checks, supplier reviews, internal reporting. The temptation, quite reasonable from the outside, is to hand each process to an agent and ask it to get on with it.

The work we're doing with them sits a step earlier. Before any agent runs anything, we sit with the people who actually do the job and map the process out. A service design exercise crossed with a process audit. Where does this start, what comes in, what gets decided, by whom, using what, and where does the output go next. Step by step. Nothing skipped.

Once the process is on paper, each step gets a label. This one is a rule: given these inputs, always produce that output. This one needs an AI, because it's reading free text and summarising it. This one needs a human, because the judgement matters and someone has to sign it. The boring, repeatable steps become scripts. The fluent, interpretive steps become AI calls with carefully chosen models. The judgement steps become review gates.

The agents don't get to invent the process. They follow the map.

Then, and only then, we hand the map to agents. Same steps, same rules, same models, same storage locations, every time. The variation lives inside the AI steps, where it belongs, and the rest of the work is a script you can audit. This is what the marriage of rules and AI actually looks like in practice. It isn't a clever model doing everything. It's a carefully mapped process, with each step using the tool that fits it, and the whole thing visible enough that a human can walk in six months from now and see what happened on any given day.
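One way to sketch that in code. The step names and labels below are invented, but the shape is what matters: the process is data the agent walks through in order, not something it improvises on the day.

```python
# Sketch of a mapped process: each step is labelled with the kind of tool
# that fits it. Step names, labels, and handlers are illustrative only.

PROCESS_MAP = [
    {"step": "collect_submissions", "kind": "rule"},   # same input, same output
    {"step": "summarise_feedback",  "kind": "ai"},     # free text in, prose out
    {"step": "sign_off_results",    "kind": "human"},  # judgement; someone signs
]

def run_process(process_map, handlers):
    """Walk the map in order. The agent follows the steps; it never invents them."""
    outputs = []
    for step in process_map:
        handler = handlers[step["kind"]]
        outputs.append((step["step"], handler(step["step"])))
    return outputs

handlers = {
    "rule":  lambda s: f"{s}: scripted, auditable",
    "ai":    lambda s: f"{s}: model call, variation allowed here",
    "human": lambda s: f"{s}: review gate, held for sign-off",
}

for name, result in run_process(PROCESS_MAP, handlers):
    print(result)
```

Because the map is plain data, it can be stored, versioned, and read back later, which is exactly what lets someone walk in six months from now and see what ran on any given day.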

None of this is anti-AI. It's a question about where the variation belongs. At the top of the work, where the output is a sentence and a human reader gets to push back on it, variation is fine. At the bottom, where the output is a number that feeds into next quarter's decision, variation is a liability. The shape, in practice, is often the opposite of what a client pictures at the start. They picture one clever agent doing the whole thing. What we build is a small boring layer doing most of the work, with clever things sitting on top at the points where cleverness helps. Less to demonstrate. More to trust. That's usually the right shape.

Written by

James Dodd

Founder of moralai. A design-led problem solver, with a photojournalism background, who has spent the last decade building software, brands and products for small businesses and the third sector.

Have a question this raised?

Talk to us, not a sales deck.

A short call, no prep needed. We'll level with you on whether there's anything worth doing here.