95% of AI pilots fail. Not because the technology doesn’t work — the reasons are many. One of them is that nobody measured whether it was working. 2026 won’t bring a crash. It’ll bring a simple question from finance: “Show me the number, or lose the budget.”
In Issue #28, I described Shadow Engineering — code written outside IT’s view, ungoverned and unmeasured. This issue addresses its twin problem: AI deployed without any way to prove it works. Both stem from the same root cause: organisations building faster than they can govern.
Three Illusions That Disappeared in 2025#
1. Usage ≠ Value#
Some organisations measure indicators such as “agent count” — success means each employee built at least one agent. MAU, prompts, number of PoCs. None of these appear on a P&L. Usage tells you people clicked buttons, not that anything changed on the bottom line.
2. “Productivity Gains” That Vanished#
“Employees save 5 hours per week.”
But where do those hours go? Usually nowhere. No quota increase, no headcount change, no reallocation to higher-value work. Such productivity gains are virtual — lost in the gap between what the vendor promised and what shows up in finance. Unless someone harvests those hours — through capacity increase, new capabilities, or explicit reallocation — they never existed.
3. “Strategic Value”#
The last refuge of projects that can’t prove ROI.
When a CFO asks “What’s the return?” and the answer is “strategic value,” what they hear is: “We don’t know.”
What CFOs Actually Want to Know#
Only 23% of organisations can accurately measure AI ROI. The rest are guessing.
Traditional AI KPIs — accuracy, latency, NPS — are dangerous without P&L context. A chatbot can have excellent technical metrics while increasing support costs because it deflects issues rather than resolving them.
What does a unit of work cost — model vs. human?
Contact centre example:
- Human agent: $3–6 per issue resolution
- AI agent: $0.25–0.50 per resolution

That survives board scrutiny. “Strategic alignment” doesn’t.
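The arithmetic is deliberately simple. A minimal sketch, using the midpoints of the contact-centre figures above — the monthly volume and total cost figures are illustrative assumptions, not benchmarks:

```python
# Illustrative sketch of cost-per-unit-of-work, model vs. human.
# The 10,000-issue month and the fully loaded cost totals are
# hypothetical; only the per-issue midpoints echo the text above.

def cost_per_resolution(total_cost: float, resolved_issues: int) -> float:
    """Fully loaded cost divided by issues actually resolved
    (not deflected) -- the unit that survives board scrutiny."""
    return total_cost / resolved_issues

ISSUES_PER_MONTH = 10_000

human_cost = cost_per_resolution(45_000, ISSUES_PER_MONTH)  # $4.50/issue
ai_cost = cost_per_resolution(3_750, ISSUES_PER_MONTH)      # $0.375/issue

savings = (human_cost - ai_cost) * ISSUES_PER_MONTH
print(f"Human: ${human_cost:.2f}/issue, AI: ${ai_cost:.3f}/issue")
print(f"Monthly saving at full displacement: ${savings:,.0f}")
```

Note the denominator: issues *resolved*, not issues *touched*. A deflection-heavy chatbot inflates the latter while the former stays flat — which is exactly the trap described above.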
A Scorecard Worth Building#
To measure the actual business value of AI automation, five metrics are enough:
1. % of process volume handled by AI (vs. human) — not adoption rate, actual work displacement.
2. Net cost per transaction — full TCO including integration, licensing, and the humans still needed around the model. Not just token costs.
3. Capacity change — can the organisation handle more work without adding headcount? This is where productivity claims become real or get exposed.
4. Risk exposure change — incidents, complaints, compliance breaches. AI can reduce risk or amplify it. You need to know which.
5. Time-to-value — pilot to first measurable P&L impact. If this number is “unknown” or “18+ months,” you don’t have a business initiative.
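The five metrics above can be sketched as a single data structure with one crude pass/fail check. The field names, the thresholds inside `survives_cfo`, and the sample numbers are my illustrative assumptions, not a prescription:

```python
# A minimal sketch of the five-metric scorecard. Thresholds are
# illustrative assumptions for a first CFO conversation.
from dataclasses import dataclass

@dataclass
class AIScorecard:
    ai_volume_share: float       # 1. share of process volume handled by AI (0-1)
    net_cost_per_txn: float      # 2. full TCO per transaction, not just tokens
    capacity_change: float       # 3. extra volume at flat headcount (0-1)
    risk_delta: int              # 4. change in incidents/breaches (negative = fewer)
    time_to_value_months: float  # 5. pilot to first measurable P&L impact

    def survives_cfo(self, human_cost_per_txn: float) -> bool:
        """Crude gate: cheaper per unit than the human baseline, real
        capacity gain, no added risk, P&L impact inside a year."""
        return (self.net_cost_per_txn < human_cost_per_txn
                and self.capacity_change > 0
                and self.risk_delta <= 0
                and self.time_to_value_months <= 12)

card = AIScorecard(0.35, 0.40, 0.20, -3, 6.0)
print(card.survives_cfo(human_cost_per_txn=4.50))  # True under these numbers
```

The point is not the specific thresholds but that every field maps to something finance already tracks — volume, unit cost, capacity, risk, time.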
The wins that matter aren’t sophisticated chatbots. In finance, reconciliation is a better example: 80% manual effort reduction, $10M saved. Boring. Measurable. Defensible.
What’s Coming in 2026#
Governance gets real#
The deadline for the next wave of AI Act obligations: August 2026. High-risk systems (recruitment, credit scoring, insurance) must comply. Penalty: up to 7% of global turnover or €35M. Some EU members are pushing to delay this deadline (see The Briefing), so don’t treat it as set in stone.
Auditors will change their questions. Not “did you document your AI policy?” but “can you explain why this model made this decision?”
Without governance-as-code, compliance will kill projects before technology or a weak business case does.
Pilot Purgatory ends#
Experimentation budgets are shrinking. CFOs want binary: kill or scale.
Projects without clear production paths face the kill switch:
- No baseline for measuring productivity gains? Kill.
- Running >6 months without production plan? Kill.
- AI cost >50% of human cost it replaces? Kill.
- Can’t pass governance? Kill.

Only 39% of organisations report any EBIT impact from AI at enterprise level. Of those, most say it’s less than 5% of EBIT. Just 6% — McKinsey’s “high performers” — attribute more than 5% of EBIT to AI.
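The kill criteria above are simple enough to write down as an explicit rule check — which is the point: if you can’t encode them, you don’t really have them. A sketch, with hypothetical field names and a hypothetical pilot:

```python
# A sketch of the four kill criteria as an explicit rule check.
# Field names and the example pilot record are hypothetical.

def kill_or_scale(pilot: dict) -> tuple[str, list[str]]:
    """Return ('kill', reasons) if any criterion fires, else ('scale', [])."""
    reasons = []
    if not pilot.get("has_baseline"):
        reasons.append("no baseline for productivity gains")
    if pilot.get("months_running", 0) > 6 and not pilot.get("production_plan"):
        reasons.append(">6 months with no production plan")
    if pilot.get("ai_cost", 0) > 0.5 * pilot.get("human_cost", 0):
        reasons.append("AI cost >50% of the human cost it replaces")
    if not pilot.get("passes_governance"):
        reasons.append("cannot pass governance")
    return ("kill", reasons) if reasons else ("scale", [])

verdict, why = kill_or_scale({
    "has_baseline": True, "months_running": 9, "production_plan": False,
    "ai_cost": 0.40, "human_cost": 4.50, "passes_governance": True,
})
print(verdict, why)  # kill ['>6 months with no production plan']
```

One criterion firing is enough; collecting all the reasons just makes the kill memo easier to write.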
Some of that “AI-driven EBIT impact” is plain cost-cutting — layoffs, vendor consolidation, budget freezes — retroactively tagged as “AI transformation” because it sounds better in investor calls. The real AI contribution may be smaller still.
Shadow AI demands an answer#
68% of employees (or 75%, depending on the source) are using AI tools IT hasn’t sanctioned — and a significant share (some estimates say 15% of all prompts) involve sensitive corporate data. Two choices:
- Cut everything unsanctioned. Lose whatever innovation came with it. And be sure people will keep circumventing your rules, because the tools genuinely help them.
- Or: amnesty. Surface what exists, inventory the use cases, build solutions that provide control without killing what’s working.

Shadow AI is either your biggest risk or a free R&D lab nobody budgeted for. The difference is whether it gets governed.
What To Do In Q1#
Three decisions you can’t postpone:
- Pick 3 AI ROI metrics that reach the board.
- Write down kill criteria for existing PoCs. What dies, what scales, what freezes.
- Name someone responsible for AI production and governance. Not innovation — production and risk.

Before signing the next AI contract, ask:
- Where exactly does P&L impact show up, and when?
- How will success be measured without referencing “usage” or “strategic value”?
- Who owns this after the project ends — technically and operationally?
The Briefing#
Hinton: “We’re Not Going to Stop It Just for a Few Lives”#
Geoffrey Hinton, Nobel laureate and widely called the “godfather of AI,” appeared on CNN’s State of the Union last week. His assessment: he’s more worried now than when he quit Google two years ago.
The technology has advanced faster than expected, particularly in reasoning and what Hinton calls “deception” — AI systems making plans to avoid being shut down. He puts the probability of AI “taking over” at 10–20%.
On regulation: Hinton called the Trump administration’s push against AI oversight “crazy.” On Big Tech motivation: “I suspect they think, well, there’s a lot of money to be made here. We’re not going to stop it just for a few lives.”
BCG: “Targets Over Tools”#
BCG’s latest on AI transformation governance lands squarely on this issue’s thesis: boards must demand “outcome flight paths” — transparent dashboards that make AI progress as visible as cost or risk. Their mantra: “impact before technology, targets before tools, discipline before hype.”
The consultants identify a common failure mode: management teams treating AI as a technical experiment rather than a results-delivery tool tied to P&L. Their fix: start with a zero-based question (“If we rebuilt this process from scratch today, what would perfect look like?”) and design backward to quarterly milestones.
The governance advice is specific: every AI initiative should deliver lead indicators of enterprise value (productivity gains, cycle-time reductions, cost takeout), with intervention when metrics drift. Boards should ask: “Which core processes are being redesigned end-to-end, and how will that translate into measurable business outcomes?”
The consultants are admitting the vision-only phase is over.
France Joins Germany: Pause the AI Act?#
The August 2026 deadline for high-risk AI systems (recruitment, credit scoring, insurance) remains technically in place. But with France, Germany, Sweden, and Czechia pushing back, and the Commission preparing its “Digital Omnibus” simplification package, the political ground is shifting.
France has publicly aligned with Germany’s call to delay enforcement of high-risk provisions under the EU AI Act. French digital minister Anne Le Hénanff argued at a Berlin summit that companies need more time to adapt.
Spain and the Netherlands oppose a delay, arguing it would weaken safeguards before they even take effect.
For enterprise leaders: plan for the current deadline, but watch Brussels closely. Either way, firms that prepared early are better positioned.
This Week’s Question#
“What percentage of our AI projects can show P&L impact today — not projected, not ‘strategic,’ but actual numbers that would survive a CFO challenge?”
If the answer is “almost none” or, even worse, “we don’t know,” you’ve found your measurement gap.
You don’t need another report about trillions unlocked by AI. You need three numbers you can defend to finance.
The firms that win 2026 won’t have the best models. They’ll be the ones who measured clearly, governed early, and killed what didn’t work.
Until next time,
Krzysztof
