Most leaders view AI governance as a defensive necessity: a compliance hurdle and a shield against regulatory fines. My argument is that in an environment of AI hype and anxiety, governance is not only a shield; it can be turned into a commercial weapon. You can promise your customers ambiguous “smart” features, which increasingly breed distrust, or you can use a demonstrable governance framework to sell something far more valuable: predictability. Customers, particularly in regulated markets, are not buying AI. They are buying reliable outcomes. They will pay a premium for a bank that can prove its mortgage algorithm is fair, an insurer whose automated claims process is transparent, and a telco whose personalisation engine is not “creepy.” This is not a theoretical advantage: the trust deficit is a commercial opportunity, and demonstrable governance is the tool to capture it.
The Briefing#
Decade of agents#
This week we’re coming back to Andrej Karpathy, this time with an interview he gave recently. One of the topics discussed is the “ghosts vs. animals” distinction we covered recently, based on his blog post, but Karpathy raised several other points, one of which I find especially relevant for enterprise applications.
The “Decade of Agents”: A Marathon, Not a Sprint#
The notion of autonomous AI agents acting as digital employees is often overstated. Karpathy contends that this will be a “decade of agents,” not a “year of agents.” He points to significant “cognitive deficits” in current models: a lack of continual learning (they reset their “working memory” with each interaction), insufficient multimodality (difficulty integrating diverse data types such as vision and sound), and a limited ability to use external tools effectively. Building agents capable of reliably performing complex, real-world tasks is a substantial engineering challenge. Hype around “AI agents” can mislead investment decisions; enterprise strategies for agentic AI should be measured and incremental. We are still very early in the technology development cycle, and many structural challenges remain.
The “March of Nines”: The Cost of Reliability#
Achieving high reliability in AI systems, especially those operating in critical or regulated contexts, is a painstaking process. Karpathy calls this the “march of nines”: each additional “nine” of reliability (e.g., from 99% to 99.9%) demands a disproportionate amount of engineering effort. A compelling demo operating at 90% accuracy is vastly different from a production system requiring 99.999% reliability. This is particularly true in domains like self-driving cars or mission-critical software, where the cost of failure is catastrophic. We need to choose carefully which processes to automate, redesign them properly, and ensure robust MLOps practices, comprehensive testing frameworks, and human-in-the-loop monitoring.
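To make the cost of each extra nine concrete, here is a back-of-the-envelope sketch; the monthly interaction volume is an assumption chosen purely for illustration:

```python
# Illustrative only: expected failed interactions at different
# reliability levels, over an assumed monthly volume.
INTERACTIONS_PER_MONTH = 1_000_000  # hypothetical volume

for nines, reliability in [(1, 0.9), (2, 0.99), (3, 0.999), (4, 0.9999), (5, 0.99999)]:
    expected_failures = INTERACTIONS_PER_MONTH * (1 - reliability)
    print(f"{nines} nine(s) ({reliability:.5f}): ~{expected_failures:,.0f} failures/month")
```

At 90%, a million interactions produce roughly 100,000 failures a month; at five nines, about ten. Closing that gap is where the engineering effort goes.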
Data Readiness as the Main Barrier to Scaling Enterprise AI#
A 2025 study from Qlik underscores that while enterprise AI budgets are surging, the primary obstacle to scaling AI initiatives is a lack of data readiness. This is corroborated by Anthropic’s research, which notes that costly data modernisation is a significant bottleneck to high-impact AI adoption. The race for AI advantage is not won by having the most advanced model, but by mastering the unglamorous work of data plumbing: competitive advantage lies in the cleanest, most accessible, and best-governed data, not in the fanciest algorithm.
From Compliance Cost to Commercial Asset#
The Market for Trust: Why Your Customers Will Pay for Predictability#
The default consumer position is distrust. Recent data shows 53% of consumers are wary of AI-powered results. Trust in businesses to use AI ethically has fallen to 42%. Furthermore, most customers insist it is important to know when they are communicating with an AI (a disclosure that EU law also mandates). This scepticism is not a problem; it is a new market. It creates a clear demand for products and services from enterprises that can prove their AI is under control. Communicating this control should not be “ethics-washing.” It is a factual articulation of risk management. An enterprise does not market “We are ethical.” It markets:
Transparency: “We use AI for two purposes: fraud detection and client portfolio alerts. Here is a public report on how these systems are monitored and a ‘human-in-the-loop’ is engaged for all critical decisions.”
Fairness: “Our credit-scoring models are tested quarterly by an independent auditor to ensure they produce no demographically biased outcomes. Here is the summary.” (A minimal sketch of this kind of check follows this list.)
Reliability: “Our wealth management assistant is trained on a closed data set of our internal market analysis, not the open internet. Its answers are verifiably accurate.”
This language, grounded in auditable facts, moves the conversation from abstract ethics to concrete reliability.
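To make a claim like “no demographically biased outcomes” auditable, here is a minimal sketch of one common fairness check, the disparate impact ratio (the “four-fifths rule”). The column names, the reference group, and the 0.8 threshold are illustrative assumptions, not a prescription for how your auditor should work:

```python
import pandas as pd

# Illustrative disparate impact check on credit-scoring decisions.
# Assumes a table with an `approved` flag (0/1) and a `group` column
# for the protected attribute; both names are hypothetical.
def disparate_impact_ratio(decisions: pd.DataFrame, reference_group: str) -> pd.Series:
    approval_rates = decisions.groupby("group")["approved"].mean()
    return approval_rates / approval_rates[reference_group]

decisions = pd.DataFrame({
    "group":    ["A", "A", "A", "B", "B", "B", "B"],
    "approved": [1,   1,   0,   1,   0,   0,   1],
})

ratios = disparate_impact_ratio(decisions, reference_group="A")
# A common rule of thumb flags ratios below 0.8 for human review.
print(ratios)
print("Flagged groups:", list(ratios[ratios < 0.8].index))
```

A real audit goes far beyond one ratio, but even this level of evidence changes the tone of a customer conversation.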
The Decisive Factor: Can Demonstrable Governance Win Customers?#
Enterprises that embed governance into their AI can build a superior product. This superiority is measured in customer retention and revenue. AI-powered “next best experience” engines, when governed, can increase revenue by 5-8% (McKinsey). The same study noted a 210% improvement in targeting at-risk customers, leading to a 59% reduction in churn for that high-value group. This performance is impossible without governance. An ungoverned personalisation engine may generate erratic, biased, or irrelevant offers that increase churn. Governance is what makes the tool reliable, and reliability is what retains the customer. The cost of failure is the materialisation of reputational risk, the most cited AI concern among S&P 500 companies (Harvard Law). A single AI-driven failure (a biased lending decision, a privacy breach, a catastrophic “hallucination” given to a client) cascades into immediate customer attrition. The winner, therefore, is not the company with the “smartest” AI. It is the company whose AI is so reliably governed that the customer never has to question its output.
Process Over Promises: Engineering Predictability#
The excitement surrounding Large Language Models (LLMs) often obscures a fundamental truth: they are inherently unreliable. Their tendency to produce confidently incorrect output is not a bug that can be patched; it is a core characteristic of the technology. For an enterprise, a hallucination is not a technical quirk; it is an unguided missile of reputational risk. The only effective countermeasure is a return to first principles: rigorous, ‘old-school’ process design. The value is not in the model, but in the quality of the engineering architecture that contains it. This means treating the LLM as an untrusted, probabilistic component that must be managed. Effective management is an engineering task:
Grounding: Forbid the model from accessing the open internet. Force it to generate answers based on a curated, verifiable internal knowledge base (a technique known as Retrieval-Augmented Generation). This drastically limits its ability to invent facts.
Guardrails: Implement strict input validation and output filtering. If a customer asks a question outside a predefined scope, the system should escalate to a human, not attempt to generate a novel answer.
Human-in-the-Loop: For any high-stakes process (financial advice, medical information, contract analysis) the LLM’s role is to assist a human, not replace them. The final decision must be made by an accountable person.
The ‘magic’ of the AI is a distraction. The competitive advantage lies in the discipline of the process that governs it. A minimal sketch of how these three controls fit together follows below.
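To show how grounding, guardrails, and human oversight combine, here is a minimal sketch of a governed answer pipeline, assuming a curated knowledge base, a predefined scope, and a call_llm placeholder for your model provider; all of these names are hypothetical, not an implementation of any particular product:

```python
from dataclasses import dataclass, field

# Hypothetical curated knowledge base: topic -> vetted internal passages.
INTERNAL_KB = {
    "mortgage rates": ["Internal note 2024-Q4: the standard variable rate is reviewed monthly."],
}
ALLOWED_TOPICS = set(INTERNAL_KB)        # guardrail: predefined scope
HIGH_STAKES_TOPICS = {"mortgage rates"}  # human-in-the-loop always required

@dataclass
class Answer:
    text: str
    needs_human_review: bool
    sources: list = field(default_factory=list)

def call_llm(prompt: str) -> str:
    # Placeholder for your model provider; deliberately not a real API call.
    return "[draft answer based only on the provided context]"

def governed_answer(question: str, topic: str) -> Answer:
    # Guardrail: anything outside the predefined scope is escalated, not answered.
    if topic not in ALLOWED_TOPICS:
        return Answer("Out of scope - routed to a human agent.", needs_human_review=True)
    # Grounding: the model only sees curated internal passages (RAG-style).
    context = INTERNAL_KB[topic]
    prompt = "Answer using ONLY this context:\n" + "\n".join(context) + f"\nQuestion: {question}"
    draft = call_llm(prompt)
    # Human-in-the-loop: high-stakes topics always go to an accountable reviewer.
    return Answer(draft, needs_human_review=topic in HIGH_STAKES_TOPICS, sources=context)

print(governed_answer("What is the current rate?", "mortgage rates"))
print(governed_answer("Can you give me tax advice?", "tax planning"))
```

The point of the sketch is the shape of the control flow, not the implementation: the model never answers outside its scope, never answers without curated context, and never has the last word on a high-stakes decision.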
The Sales Team’s New Weapon: A Playbook for Selling Trust#
Your governance framework can be a sales asset. Your risk and compliance teams have already built the product; your sales and marketing teams must learn how to sell it. This requires a playbook based on artifacts, not adjectives.
The Transparency Report: This should be your primary marketing document. A simple, public-facing summary of what AI systems you use, why you use them, and how you govern them. It is the first link you send a risk-averse prospect.
The New Metric: NPS is the right tool for measuring customer advocacy, but it does not capture “algorithmic trust” (a metric suggested by NTT DATA). If AI is an important part of your customer touchpoints, start measuring customer perception of your AI’s fairness, explainability, and reliability. When your wealth management team can tell a high-net-worth prospect, “Our client algorithmic trust score for our advisory tool is 9.1/10,” they present an auditable fact that a competitor cannot invent. (A sketch of how such a score could be aggregated follows this list.)
The Audit-as-Proof: For major corporate clients, a sales team’s ability to produce a summary of a recent bias audit or an example of an AI decision-log is the ultimate differentiator. It ends the “trust me” conversation and replaces it with “verify me.” Automated governance platforms are already reducing audit costs by 57% (Speednet), making this proof commercially efficient to supply.
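As an illustration of the metric, here is a minimal sketch of how an “algorithmic trust score” could be aggregated from post-interaction survey responses; the three dimensions, the weights, and the 0-10 scale are assumptions for the example, not a standard:

```python
# Illustrative aggregation of an "algorithmic trust score" from survey data.
# Each response rates three dimensions on a 0-10 scale; the weights are arbitrary.
WEIGHTS = {"fairness": 0.4, "explainability": 0.3, "reliability": 0.3}

responses = [
    {"fairness": 9, "explainability": 8, "reliability": 10},
    {"fairness": 10, "explainability": 9, "reliability": 9},
    {"fairness": 8, "explainability": 9, "reliability": 9},
]

def trust_score(survey: list[dict]) -> float:
    per_response = [
        sum(WEIGHTS[dim] * answer[dim] for dim in WEIGHTS)
        for answer in survey
    ]
    return round(sum(per_response) / len(per_response), 1)

print(f"Algorithmic trust score: {trust_score(responses)}/10")
```

Whatever the exact formula, the value is in publishing the methodology and tracking the number over time, so the claim is auditable rather than anecdotal.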
Questions for Your Leadership Team#
What is our “verification tax”? How many employee-hours are spent weekly checking, correcting, or apologising for the outputs of our AI systems?
How is our marketing team articulating our control over AI, rather than just its features? Where is our public-facing Transparency Report?
When we deploy a new AI tool, particularly an LLM, is our primary investment in the model’s features or in the engineering of its operational guardrails and human oversight processes?
How is our sales team using our governance posture as a competitive advantage to win sceptical, high-value customers? Are we measuring algorithmic trust?
Conclusion#
Customers are not afraid of artificial intelligence. They are afraid of uncontrolled artificial intelligence. In a market where the default is distrust, proof of control is the definitive commercial advantage. Trust is not a soft feeling. It is an engineered product, built from governance-as-code, rigorous audits, and transparent reporting. It is now the most valuable product you sell.
Until next time, build with foresight.
Krzysztof
