The choice between using a self-hosted, open-source AI model and paying for a proprietary API appears, on the surface, to be a simple technical and financial decision. One is “free”; the other is not. This is, of course, a simplistic view. For a leader in a regulated industry, this is not a choice between software packages. It is a choice between two different models of operational risk, liability, and governance. Choosing an open-source model is a decision to insource operational and reputational liability. Choosing a proprietary model is a decision to convert that liability into a contractual risk that can be (partially) outsourced, at the price of giving up control of your data. Understanding the trade-offs between these two positions is the first step toward a defensible AI strategy. This issue provides a framework for making that decision.
The Briefing#
As more businesses integrate artificial intelligence into their daily operations, a clearer picture is emerging of the practical challenges and strategic decisions involved. Recent developments highlight several key trends: a move toward building dedicated, secure infrastructure for AI; the growing importance of data location due to new regulations; and a focus on retraining the current workforce rather than replacing it.
One significant trend is the move to build in-house “AI Factories.” Tech giants like Cisco, NVIDIA, and VAST Data are now offering pre-packaged, on-premise systems designed to run powerful AI securely (link). This development is important because it allows businesses to use advanced AI with their own sensitive data inside their own data centres. This approach helps address major security and latency problems, making it easier for companies to move beyond limited cloud-based trials and deploy AI in their core operations.
At the same time, the physical location of a company’s data is becoming a critical strategic consideration. In a direct response to new regulations like the EU AI Act, software company SAP is investing over €20 billion in a “Sovereign Cloud” for Europe. This initiative is designed to ensure all customer data and AI operations remain within the European Union, subject only to EU law. This move shows how the choice of a technology partner is evolving beyond just technical features. For any company operating in Europe, legal and regulatory risk is now central to the calculation.
While infrastructure is changing, so is the understanding of AI’s impact on jobs. A recent report from the New York Fed provides insights that challenge the narrative of mass layoffs. The study found that job cuts directly caused by AI are “almost nonexistent,” with only 1% of service firms and zero manufacturing firms reporting such layoffs. Instead of replacing people, companies are focusing heavily on retraining them. The report shows that retraining the existing workforce is the most common strategy, with nearly half of all firms surveyed planning to implement AI training programs in the next six months.
This preference for retraining over large-scale replacement is supported by economic logic. Research from the National Bureau of Economic Research (NBER) found that AI assistants can boost the productivity of new and lower-skilled workers by as much as 34%, quickly bringing their performance to the level of seasoned experts. For many businesses, it appears more rational to invest in augmenting their current workforce than to replace it. However, the job market is still being reshaped. The New York Fed’s data also shows that while layoffs are rare, hiring is slowing for some generalist roles while increasing for new positions that require specialised AI skills.
Finally, these technological and workforce shifts are happening in a new geopolitical context. Governments are beginning to view AI capabilities as critical national assets, similar to the power grid or transportation networks. SAP’s sovereign cloud initiative in Europe is mirrored by a new White House action plan in the United States, which aims to accelerate AI innovation and build out a robust national AI infrastructure. This indicates that business decisions about how and where to build AI systems are becoming increasingly intertwined with national policies and global strategic competition.
Main Analysis: Deconstructing the Decision#
The debate over model licensing is often framed around performance benchmarks and features. That picture is incomplete. A full analysis includes five less visible, but equally important, factors: cost, liability, governance, control, and stability.
1. The Illusion of “Free”
The primary appeal of open-source models is the absence of a licence fee. This “free” is an accounting illusion. The cost is not eliminated; it is merely shifted from a vendor invoice to internal budgets. Running an enterprise-grade open-source model requires significant, specialised investment. This includes:
- Specialised Headcount: You need a dedicated team for MLOps (Machine Learning Operations), model security, and continuous performance monitoring. These are scarce, expensive engineers, and their cost often exceeds the licence fees of a proprietary equivalent.
- Infrastructure: Production-grade models require substantial, persistent computing resources. Managing these GPU clusters, whether on-premise or in the cloud, carries a heavy operational and financial weight.
- Incident Recovery: When a self-hosted model fails or produces a serious error, the cost of diagnosis and recovery falls entirely on the internal team. This is an unquantifiable, but potentially very large, financial risk.
A proprietary model, by contrast, presents its costs on a single invoice. The price per token is predictable, and the total cost of ownership is clearer, making it easier to budget and manage. For any larger installation, however, that total will be significantly higher than the open-source equivalent, and you should not assume you can run a commercial model without a team of your own managing it. A rough three-year comparison is sketched below.
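To make the “free is an illusion” point concrete, here is a minimal back-of-the-envelope sketch in Python. Every figure in it is an assumption for illustration, not a market price or benchmark; substitute your own numbers.

```python
# Illustrative three-year total-cost-of-ownership comparison.
# All figures below are assumptions, not quotes or benchmarks.

YEARS = 3

# Self-hosted open-source model: costs shift to internal budgets.
mlops_engineers = 3                   # assumed dedicated headcount
cost_per_engineer = 180_000           # assumed fully-loaded annual cost (EUR)
gpu_cluster_per_year = 250_000        # assumed on-prem/cloud GPU spend
incident_reserve_per_year = 100_000   # assumed budget for failures and recovery

open_source_tco = YEARS * (
    mlops_engineers * cost_per_engineer
    + gpu_cluster_per_year
    + incident_reserve_per_year
)

# Proprietary API: costs arrive as a predictable invoice.
tokens_per_month = 10_000_000_000     # assumed heavy enterprise workload
price_per_million_tokens = 10.0       # assumed blended rate (EUR)
oversight_team_per_year = 200_000     # you still need people managing the vendor

proprietary_tco = YEARS * (
    12 * (tokens_per_month / 1_000_000) * price_per_million_tokens
    + oversight_team_per_year
)

print(f"Open-source, 3-year fully loaded:   EUR {open_source_tco:,.0f}")
print(f"Proprietary API, 3-year fully loaded: EUR {proprietary_tco:,.0f}")
```

Run it with your own numbers; the interesting output is the crossover volume at which the vendor invoice overtakes the internal budget.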
2. The Liability Equation: Insource or Outsource?
Beyond direct costs lies the critical issue of liability. If a model produces output that is defamatory, breaches copyright, or leads to a discriminatory outcome, who is legally and financially responsible?
With open-source, you are. The moment you download and deploy the model, you inherit 100% of the liability for its output. There is no vendor to call, and no contractual clause to invoke. The risk sits entirely within your organisation.
With proprietary models, you can now transfer a portion of this risk. The market has shifted on this point. Initially, vendors offered models on an “as is” basis. Now, in response to pressure from enterprise customers, legal indemnification for copyright claims has become a competitive battleground.
- The Market Standard: Major providers like Microsoft (for Azure OpenAI Service), Google (for Vertex AI), and OpenAI (for its Enterprise tiers) now offer “copyright shields.” They contractually agree to defend customers and pay the costs of adverse judgments from copyright infringement claims based on the model’s output, provided the service is used as intended.
- The Differentiated Approach: IBM has built its strategy for regulated industries around this concept. For its Granite models on the watsonx platform, indemnification is not a reactive feature but a core design principle. IBM’s argument is that its control over the training data supply chain allows it to stand behind the output with a higher degree of confidence. This positions the product less as a raw tool and more as a defensible, risk-managed service from the ground up.
With commercial model access, you are buying a specific, contractually defined risk posture. The premium you pay in fees is for a liability shield, and vendors now differentiate on the strength and clarity of that protection. Verify the coverage scope carefully: copyright infringement is only one of many ways that using a model can expose you to legal risk.
3. The Governance Dilemma: Verifiable Control vs. Purchased Assurance
How do you prove to a regulator that your AI system is fair, transparent, and robust? The two approaches offer very different answers. Proprietary models operate as “black boxes,” and this creates two distinct forms of risk:
- The Technical Risk: You cannot inspect the model’s internal architecture, training data, or weighting. You must trust the vendor’s legal attestations and third-party audits. You purchase assurance, but you cannot independently verify claims about bias, fairness, or even predictable behaviour under specific conditions.
- The Jurisdictional Risk: Using a proprietary model often means sending your data to a cloud provider. Even if the servers are physically located in the EU, a provider headquartered in the US operates under American law. Legislation like the CLOUD Act can potentially give foreign governments access to your data, regardless of its location. This creates a complex and potentially unacceptable risk for any organisation subject to GDPR or other strict data residency rules.
This combination of technical opacity and jurisdictional exposure can make “purchased assurance” a weak position during a stringent regulatory audit. Open-source models offer the opposite: full transparency. Your technical teams can inspect every layer of the model. This enables true “Governance-as-Code,” where you build direct, auditable, technical checks into the system; a sketch of such a check follows below. You have verifiable control. The trade-off, however, remains stark: this control is meaningless without the internal capability to implement and maintain it. Verifiable control requires a significant investment in governance expertise.
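As an illustration of “Governance-as-Code,” here is a minimal sketch of an automated fairness gate that could run in a CI pipeline before a self-hosted model is promoted to production. The demographic-parity metric and the 0.05 threshold are assumptions for illustration; real audits would use metrics and limits agreed with your compliance team.

```python
# A minimal "Governance-as-Code" sketch: an automated fairness gate.
# The metric (demographic parity) and the 0.05 limit are illustrative
# assumptions; choose both with your compliance team.

def demographic_parity_gap(outcomes: list[int], groups: list[str]) -> float:
    """Largest difference in positive-outcome rate between any two groups."""
    rates = {}
    for g in set(groups):
        members = [o for o, grp in zip(outcomes, groups) if grp == g]
        rates[g] = sum(members) / len(members)
    return max(rates.values()) - min(rates.values())

def promotion_gate(outcomes: list[int], groups: list[str],
                   max_gap: float = 0.05) -> None:
    """Fail the deployment pipeline if the model's decisions on a held-out
    audit set drift apart across protected groups."""
    gap = demographic_parity_gap(outcomes, groups)
    assert gap <= max_gap, (
        f"Fairness gate failed: parity gap {gap:.3f} exceeds {max_gap:.3f}"
    )

# Example: model decisions (1 = approved) on a small audit sample.
promotion_gate(
    outcomes=[1, 0, 1, 0, 1, 0, 1, 0],
    groups=["a", "a", "a", "a", "b", "b", "b", "b"],
)
print("Fairness gate passed: model may be promoted.")
```

The point is not this particular metric but the mechanism: the check is code, it runs on every release, and its pass/fail history is itself an audit trail.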
4. The Control Dilemma: Customisation and Management
A model’s value is not static; it is realised through its adaptation to specific business contexts and its management over time. Here, the approaches diverge significantly.
- Tuning with Proprietary Data: The most powerful customisation is fine-tuning a model on your own proprietary data: customer information, trade secrets, or internal process knowledge.
  - With open-source, this is straightforward. You can perform this tuning on-premise or in a private cloud environment, ensuring your most sensitive data never leaves your direct control. The data’s chain of custody is clear and auditable (a minimal tuning sketch follows after this list).
  - With proprietary models, this is more complex. While major cloud providers now offer “private” or “sandboxed” fine-tuning, you are still sending your data to a third-party environment. The contractual assurances are strong, but the physical control is lost. For data of the highest sensitivity, this may be an unacceptable compromise.
- Managing Operational Risks (Bias & Drift): Models degrade. Their performance can “drift” as the real world changes, and inherent biases can become more apparent over time.
  - Open-source gives you the direct tools to manage this. Your team can implement bespoke monitoring, precisely measure bias and drift against your specific business metrics, and intervene directly through retraining or further fine-tuning. This offers the highest degree of control, but it demands a high level of continuous effort and expertise (a simple drift check is sketched after this list).
  - Proprietary models may require you to rely on the vendor’s built-in tools. While increasingly sophisticated, these tools are generic. You are trusting the vendor to detect and flag issues, and your ability to mitigate them is limited to the options the vendor provides. It is a reactive, less granular form of management.
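As a concrete illustration of keeping the chain of custody in-house, here is a minimal sketch of parameter-efficient (LoRA) fine-tuning using the open-source Hugging Face transformers and peft libraries. The checkpoint path and hyperparameters are placeholders, not recommendations.

```python
# A minimal on-premise LoRA fine-tuning sketch with Hugging Face
# transformers + peft. The checkpoint path and hyperparameters are
# illustrative placeholders; the point is that sensitive data never
# leaves your own hardware.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

checkpoint = "/models/open-weights-7b"  # assumed: locally stored weights

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint)

# LoRA freezes the base model and trains small adapter matrices,
# so tuning is cheap and the proprietary data stays on-premise.
lora = LoraConfig(
    r=8,                                  # adapter rank: size/quality trade-off
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],  # attention projections, model-dependent
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora)
model.print_trainable_parameters()  # typically well under 1% of all weights

# From here, train with your usual loop or transformers.Trainer on an
# internal dataset; the resulting adapter is a small, auditable artifact.
```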
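And to make the drift discussion concrete, here is a minimal sketch of one common monitoring approach: the Population Stability Index (PSI) computed over a model’s output score distribution. The bucket count and the 0.2 alert threshold are conventional rules of thumb, not regulatory standards, and the data here is synthetic.

```python
# A minimal drift-monitoring sketch using the Population Stability Index
# (PSI) over a model's output scores. The bucket count (10) and the 0.2
# alert threshold are common rules of thumb, not standards.
import numpy as np

def psi(reference: np.ndarray, current: np.ndarray, buckets: int = 10) -> float:
    """Compare two score distributions; larger values mean more drift."""
    # Bucket edges come from the reference period so both samples
    # are measured on the same scale.
    edges = np.quantile(reference, np.linspace(0, 1, buckets + 1))
    edges[0], edges[-1] = -np.inf, np.inf
    ref_pct = np.histogram(reference, edges)[0] / len(reference)
    cur_pct = np.histogram(current, edges)[0] / len(current)
    # Avoid log(0) / division by zero on empty buckets.
    ref_pct = np.clip(ref_pct, 1e-6, None)
    cur_pct = np.clip(cur_pct, 1e-6, None)
    return float(np.sum((cur_pct - ref_pct) * np.log(cur_pct / ref_pct)))

rng = np.random.default_rng(0)
baseline = rng.normal(0.0, 1.0, 10_000)  # synthetic scores at deployment time
today = rng.normal(0.5, 1.2, 10_000)     # synthetic scores now: the world moved

score = psi(baseline, today)
status = "significant drift: investigate" if score > 0.2 else "stable"
print(f"PSI = {score:.2f} ({status})")
```

With open-source, a check like this can run against your own business metrics on your own schedule; with a proprietary API, you are limited to whatever monitoring surface the vendor exposes.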
5. Performance vs. Predictability
The technology world is fixated on public leaderboards and performance benchmarks. For a regulated firm, chasing the state-of-the-art model is a costly and strategically flawed distraction. The rapid, often chaotic, pace of innovation in open-source is a risk, not a benefit. A model that changes weekly is an unstable foundation for a critical business process. The strategic goal is not the “best” model, but the most stable, predictable, and legally defensible one. Proprietary models, with their slower release cycles and focus on enterprise stability, are often better suited for this purpose. Their perceived weakness—a lack of cutting-edge performance—can be their primary strength in a risk-averse environment.
The Decision Framework: Four Questions to Guide Your Choice#
To decide, analyse your specific use case against these four dimensions; a toy scoring sketch follows the table.
| Dimension | Favouring Open-Source | Favouring Proprietary Models |
|---|---|---|
| 1. Use Case & Data Sensitivity | Low-risk internal tasks (e.g., summarising public documents). Situations where full control over the data path is non-negotiable. | High-risk, customer-facing applications (e.g., financial advice). Use cases where speed-to-market is critical. |
| 2. Internal Capability | You have an existing, expert MLOps and AI security team with a dedicated budget. | You lack specialised AI operational talent or prefer to focus your engineering team on core product development. |
| 3. Risk & Liability Posture | Your organisation has a high-risk tolerance and is prepared to insource all legal and reputational liability. | You operate in a highly litigious area and require contractual risk transfer and vendor indemnification. |
| 4. Governance & Audit Needs | Your regulators demand deep, technical proof of model workings and you have the ability to provide it. | Your compliance requirements can be satisfied by vendor attestations, third-party audits, and contractual assurances. |
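For teams that like to make such checklists executable, here is a toy sketch that encodes the four dimensions as inputs to a rough scoring function. The equal weighting is an assumption; in practice, liability and governance considerations usually dominate.

```python
# A toy encoding of the four-question framework. Equal weighting is an
# assumption for illustration; real decisions weight liability and
# governance more heavily and involve legal review.
from dataclasses import dataclass

@dataclass
class UseCaseProfile:
    data_must_stay_in_house: bool           # dimension 1: data sensitivity
    has_mlops_team: bool                    # dimension 2: internal capability
    can_insource_liability: bool            # dimension 3: risk posture
    regulator_wants_model_internals: bool   # dimension 4: audit needs

def lean(profile: UseCaseProfile) -> str:
    open_source_votes = sum([
        profile.data_must_stay_in_house,
        profile.has_mlops_team,
        profile.can_insource_liability,
        profile.regulator_wants_model_internals,
    ])
    if open_source_votes >= 3:
        return "leans open-source"
    if open_source_votes <= 1:
        return "leans proprietary"
    return "split decision: escalate to risk and legal"

print(lean(UseCaseProfile(True, False, False, True)))
# -> "split decision: escalate to risk and legal"
```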
Concluding Questions#
An effective AI strategy begins with asking the right questions. Before committing to a path, ensure your team can answer the following:
1. What is the three-year, fully-loaded cost of our chosen model? This must include specialised staff, infrastructure, security, and a budget for incident response, not just licence fees.
2. Who, precisely, is liable if this model produces a harmful output? Have we quantified that risk, and can we demonstrate how we are mitigating it, contractually or operationally?
3. How will we demonstrate control to a regulator? Will we present our own auditable code and logs, or will we present a vendor’s compliance certificate? Is that sufficient?
4. Are we optimising for the right metric? Is our goal to top a public performance benchmark or to achieve a predictable, defensible, and stable business outcome?
Conclusion#
The choice between open-source and proprietary AI is not a technical detail to be delegated to the IT department. It is a strategic business decision with material consequences for your budget, your risk profile, and your governance posture. By viewing the decision through the lens of cost, liability, governance, control, and stability, you can move beyond the hype and build a strategy that is both effective and defensible.
Until next time, build with foresight.
Krzysztof
