
The boring layer does the real work

Businesses keep asking for an AI that does the whole job. The job usually has a part that has to come out the same every time, and that part is what AI is worst at. Build the plain, rule-based layer first. Then let the AI read it.

By James Dodd

There's a kind of work that earns its keep by being boring. Same input, same output, every time. A defined check with a defined answer. Run it today, run it next quarter, the result has to match for the same data, or the check isn't worth running. It's also the kind of work people most want to hand to an AI. Which is work AI does badly, expensively, and inconsistently. A script solved this thirty years ago, runs on nothing, and gives the same answer every time. So why does the request keep coming? Because AI is visible. It's in every pitch deck, every vendor demo, every board briefing. A rule-based script isn't. It's been sitting quietly in the accounts team's spreadsheet for a decade, and nobody wrote a think-piece about it.

A woman in a grey knit jumper and black scarf tilts her head back with a hardcover book resting face-down over her eyes.
The useful work often looks like nothing much is happening. (Valeria Lendel / Unsplash)

Two different kinds of tool

In ordinary conversation, we call both of these things "software", and the distinction gets lost.

The first kind is a rule-based script. It does exactly what it was told to do, in exactly the same order, every time. You write down the rules once (open this file, check this list, add up these columns, write the result to that spreadsheet), and from then on, the same input produces the same output. A payroll calculation is a script. A VAT return worksheet is a script. Your bank's interest calculation is a script. They're boring on purpose. Boring means you can check them, audit them, and trust them.

The second kind is a generative AI model, the thing behind ChatGPT, Claude, Copilot. Instead of following fixed rules, it's been trained on a vast amount of writing and learned to produce plausible, fluent answers to almost anything you ask. That's why it feels like magic. It's also why, if you ask it the same question twice, you can get two different answers. Both good. Both reasonable. Neither guaranteed to match. That variation isn't a fault. It's how the thing works.

The trouble starts when we use the second kind to do the job of the first.
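For the sceptical reader, the first kind really is this small. A minimal sketch, with made-up field names, of the sort of check a VAT worksheet might run: fixed rules, and the same rows always produce the same totals.

```python
# A sketch of the first kind of tool: a rule-based check.
# The rules are written down once; the same input always gives the same output.
# Field names ("net_pence", "vat_pence") are illustrative, not a real system.

def vat_check(rows):
    """Sum the net and VAT columns and verify VAT is 20% of net, to the penny."""
    net = sum(r["net_pence"] for r in rows)
    vat = sum(r["vat_pence"] for r in rows)
    expected_vat = round(net * 0.20)
    return {"net_pence": net, "vat_pence": vat, "matches": vat == expected_vat}

rows = [
    {"net_pence": 10000, "vat_pence": 2000},
    {"net_pence": 2550, "vat_pence": 510},
]
print(vat_check(rows))  # identical output on every run, this quarter and next
```

Run it today or next quarter against the same data and the answer cannot drift, which is the whole point.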

Ask an AI to summarise a security report twice, and you get two good summaries. Each one reasonable. Each one subtly different. One emphasises an access-control issue, the other leads with an outdated library. For a one-off read, that's fine. For a check you rerun every quarter and hold yesterday's numbers up against today's, it's a mess. You can't tell whether something has changed in the business, or whether the AI just felt like describing it differently this time.

There's a subtler version of the same problem sitting underneath. If you let the AI do the gathering as well as the writing, the gathering is variable too. The AI decides which list of your software to look at. The AI decides how to interpret a fiddly edge case. The AI writes the little bit of code that pulls the data, on the fly, a fresh copy every run. Next quarter, asked the same question, it might make slightly different choices. The answer looks consistent because the AI writes well. The work underneath isn't.

This is where the expensive mistakes happen. A team builds the whole thing with a single AI agent. It reports smoothly for a month. Six months in, someone spots a number that can't be right, and there's no paper trail to walk back through. The AI has been quietly filling in gaps the whole time. Nobody can prove what it did, because it didn't do the same thing twice.

Variation is a feature of AI and a bug in a check.

The fix is boring

Write the checking bit as an old-fashioned, rule-based script. Same input, same output, every run. Keep a record of every version, so you can see exactly what it did on any given date. Log what it looked at and what it returned. If it breaks, it breaks loudly, which is the only kind of breaking you want.

Then, and only then, hand the output to the AI. Ask it to read the results. Summarise the patterns. Flag what looks concerning. Draft the covering email to the board. That's what AI is genuinely good at: turning a table of numbers into plain language a human can read on a Monday morning. It isn't guessing at what the numbers are. The numbers are in front of it.
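In code, that shape might look like this. A hypothetical software-patching check (the version string and field names are invented for illustration): versioned, logged, and loud when something is missing. Only the finished table would be handed to an AI for the write-up.

```python
# Sketch of the boring layer: a versioned, logged, loudly-failing check.
# Everything here is illustrative; the shape, not the details, is the point.
import json
import logging

SCRIPT_VERSION = "patch-check-v1.3"  # recorded with every run, so any date's output is traceable

logging.basicConfig(level=logging.INFO)
log = logging.getLogger(SCRIPT_VERSION)

def run_check(inventory):
    """Same input, same output, every run. Gaps raise errors instead of being guessed at."""
    results = []
    for item in inventory:
        if "last_patched" not in item:
            # Break loudly rather than let a gap get quietly filled in.
            raise ValueError(f"missing last_patched for {item['name']}")
        results.append({"name": item["name"], "patched": item["last_patched"]})
    log.info("checked %d items with %s", len(results), SCRIPT_VERSION)
    return results

inventory = [{"name": "webapp", "last_patched": "2025-01-10"}]
table = run_check(inventory)
print(json.dumps(table))  # this table, not the raw systems, is what the AI gets to read
```

The AI's job starts after the `print`: it reads a finished table and drafts the prose, so any variation lives in the wording, never in the numbers.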

A worked example

We're doing a version of this, at a larger scale, with a UK-based educational charity and awarding organisation. They have a lot of processes that they want AI to help run. Marking, moderation, quality checks, supplier reviews, internal reporting. The temptation, quite reasonable from the outside, is to hand each process to an agent and ask it to get on with it.

The work we're doing with them sits a step earlier. Before any agent runs anything, we sit with the people who actually do the job and map the process out. A service design exercise crossed with a process audit. Where does this start, what comes in, what gets decided, by whom, using what, and where does the output go next. Step by step. Nothing skipped.

Once the process is on paper, each step gets a label. This one is a rule: given these inputs, always produce that output. This one needs an AI, because it's reading free text and summarising it. This one needs a human, because the judgement matters and someone has to sign it. The boring, repeatable steps become scripts. The fluent, interpretive steps become AI calls with carefully chosen models. The judgement steps become review gates.

The agents don't get to invent the process. They follow the map.

Then, and only then, we hand the map to agents. Same steps, same rules, same models, same storage locations, every time. The variation lives inside the AI steps, where it belongs, and the rest of the work is a script you can audit. This is what the marriage of rules and AI actually looks like in practice. It isn't a clever model doing everything. It's a carefully mapped process, with each step using the tool that fits it, and the whole thing visible enough that a human can walk in six months from now and see what happened on any given day.
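One way to sketch that in code. The step names and labels below are invented, but the shape is what matters: the process is data the agent walks through in order, not something it improvises on the day.

```python
# Sketch of a mapped process: each step is labelled with the kind of tool
# that fits it. Step names, labels, and handlers are illustrative only.

PROCESS_MAP = [
    {"step": "collect_submissions", "kind": "rule"},   # same input, same output
    {"step": "summarise_feedback",  "kind": "ai"},     # free text in, prose out
    {"step": "sign_off_results",    "kind": "human"},  # judgement; someone signs
]

def run_process(process_map, handlers):
    """Walk the map in order. The agent follows the steps; it never invents them."""
    outputs = []
    for step in process_map:
        handler = handlers[step["kind"]]
        outputs.append((step["step"], handler(step["step"])))
    return outputs

handlers = {
    "rule":  lambda s: f"{s}: scripted, auditable",
    "ai":    lambda s: f"{s}: model call, variation allowed here",
    "human": lambda s: f"{s}: review gate, held for sign-off",
}

for name, result in run_process(PROCESS_MAP, handlers):
    print(result)
```

Because the map is plain data, it can be stored, versioned, and read back later, which is exactly what lets someone walk in six months from now and see what ran on any given day.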

None of this is anti-AI. It's a question about where the variation belongs. At the top of the work, where the output is a sentence and a human reader gets to push back on it, variation is fine. At the bottom, where the output is a number that feeds into next quarter's decision, variation is a liability. The shape, in practice, is often the opposite of what a client pictures at the start. They picture one clever agent doing the whole thing. What we build is a small boring layer doing most of the work, with clever things sitting on top at the points where cleverness helps. Less to demonstrate. More to trust. That's usually the right shape.

Written by

James Dodd

Founder of moralai. A design-led problem solver, with a photojournalism background, who has spent the last decade building software, brands and products for small businesses and the third sector.

Have a question this raised?

Talk to us, not a sales deck.

A short call, no prep needed. We'll level with you on whether there's anything worth doing here.