Your AI bill isn't the real bill
When an AI silently starts missing things, the invoice is the last place you'll notice. The cost lands somewhere else.
By James Dodd
A client testing an AI proof-of-concept tool messaged us about an issue in their trial. A 100-page education document had gone through the converter and come back as 48 pages. Whole sections had simply gone. Not garbled, not flagged, just absent. The earlier tests had all been on smaller documents and had looked fine at first glance. This one was the first long enough to fail visibly.
The job had sounded straightforward when it came in. Convert a large archive of documents, hundreds of pages each, from one format to another. The obvious approach was to hand the whole lot to a model and ask it for the converted version. We tried it. It looked like it was working.
It wasn't, at least not on the longer documents or the awkward ones (odd layouts, tricky tables, long appendices). On those, the output would quietly drop chunks of the source, or compress several pages into a line or two. No error. No warning. You only caught it by opening a file and reading what came back.
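One cheap guard against this kind of silent loss is to compare the size of what went in with the size of what came out, and flag anything suspiciously short for a human to read. The sketch below is an illustration only, not the tooling used on this project: the function names and the 0.8 threshold are assumptions, and a word-count ratio will miss subtler failures such as a section being summarised at roughly the same length.

```python
def coverage_ratio(source_text: str, converted_text: str) -> float:
    """Word count of the converted text divided by the word count of the source."""
    source_words = len(source_text.split())
    if source_words == 0:
        # An empty source cannot lose content, so treat it as fully covered.
        return 1.0
    return len(converted_text.split()) / source_words


def looks_truncated(source_text: str, converted_text: str, min_ratio: float = 0.8) -> bool:
    """Flag a conversion whose output is much shorter than its input.

    min_ratio is a hypothetical threshold; tune it for the formats involved.
    """
    return coverage_ratio(source_text, converted_text) < min_ratio
```

A check like this would not have fixed the 100-page document, but it would have turned a silent failure into a flagged one.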
What happens here has a name in the literature, and a useful analogy outside it. When you feed a model more than it can comfortably hold in mind, the quality of its answers starts to slip. It misses bits. It contradicts itself. It hedges where it shouldn't. The analogy is rust. It builds up wherever metal sits exposed to moisture and air, hidden under paint, eating into the surface long before anything looks wrong. You can prevent it with care. Once it has set in, you don't get the metal back: you cut out what it has eaten and replace it.
The same goes here. The decay accumulates with exposure (long sessions, heavy inputs, prompts that have grown over time). You can prevent it with the same kind of care. But by the time it shows up in the output, the damage has already been done.
For a business, the interesting question is what that damage costs. Not on the bill from the AI provider, which barely registers it, but in the places that don't show up on an invoice at all.
The intuitive response is to cut and optimise. Smaller prompt, cheaper model, shorter conversation, lower bill. We tried that first. It made things worse.
The same documents had to be re-run two or three times before the output was usable. Each retry cost money. It also cost time nobody had budgeted for. Someone on the client's side was reading every converted file against the original and patching the holes by hand. By the time you added that up, the AI route was taking longer than doing the job manually would have. And because the failures were silent, the work ran on hit-and-hope: you'd ship a batch, then find out later whether the documents in it had actually come through whole.
That was the real bill. The charge from the AI provider was the smallest line on it.
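The arithmetic behind that is simple enough to write down. What follows is a rough sketch, assuming the cost of a document is the provider charge for every attempt plus the time someone spends checking and patching the result. The class name, field names and every figure in the example are invented placeholders, not numbers from this project.

```python
from dataclasses import dataclass


@dataclass
class RunCosts:
    """Hypothetical cost model for converting one document."""

    provider_cost_per_attempt: float  # what the AI provider charges per run
    attempts: int                     # first run plus any retries
    review_hours: float               # human time spent checking and patching
    hourly_rate: float                # cost of that human time

    @property
    def provider_total(self) -> float:
        return self.provider_cost_per_attempt * self.attempts

    @property
    def people_total(self) -> float:
        return self.review_hours * self.hourly_rate

    @property
    def total(self) -> float:
        return self.provider_total + self.people_total


# Placeholder figures, chosen only to show the shape of the calculation.
example = RunCosts(
    provider_cost_per_attempt=2.0,
    attempts=3,
    review_hours=1.5,
    hourly_rate=40.0,
)
print(example.provider_total, example.people_total, example.total)
```

With numbers of that shape, the provider line is a small fraction of the total, which is why trimming it does so little.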
So we did the opposite. We broke each document into smaller pieces, ran them one at a time, and stitched the results back together. The per-run cost went up, because we were doing more processing per file. But there were no retries. The output was clean enough that nobody had to fix it afterwards. The client could forecast the cost of processing a new document within a narrow range, because the variance had collapsed.
Predictable beat cheap, by a wide margin.
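For readers who want to see the shape of the split-and-stitch approach, here is a minimal sketch. It assumes the document is already available as a list of page texts and that `convert_chunk` is whatever function sends one piece to the model and returns the converted text. Both names, and the default of five pages per chunk, are hypothetical. A real pipeline would also need to handle tables or sections that straddle a chunk boundary.

```python
from typing import Callable, List


def split_into_chunks(pages: List[str], pages_per_chunk: int) -> List[List[str]]:
    """Group consecutive pages into chunks of at most pages_per_chunk pages."""
    if pages_per_chunk < 1:
        raise ValueError("pages_per_chunk must be at least 1")
    return [
        pages[start:start + pages_per_chunk]
        for start in range(0, len(pages), pages_per_chunk)
    ]


def convert_document(
    pages: List[str],
    convert_chunk: Callable[[str], str],
    pages_per_chunk: int = 5,
) -> str:
    """Convert a document one chunk at a time and stitch the results in order."""
    converted_parts = []
    for chunk in split_into_chunks(pages, pages_per_chunk):
        chunk_text = "\n".join(chunk)
        # convert_chunk is a stand-in for the call to the model.
        converted_parts.append(convert_chunk(chunk_text))
    return "\n".join(converted_parts)
```

Passing the converter in as a function keeps the splitting and stitching logic testable without calling a model at all: swap in `str.upper` or any other stand-in and check that every page comes out the other side.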
The bill is the last place this kind of decay shows up. You see it first in the retries, and in the hours your people spend cleaning output that should have been right the first time. If your AI spend is moving and the invoice doesn't explain it, look there first. By the time it shows on the bill, the rust has been at work for weeks.
Written by
James Dodd
Founder of moralai.co. A design-led problem solver with a photojournalism background who has spent the last decade building software, brands and products for small businesses and the third sector.