Your Marketing Manager just asked ChatGPT to write a Python script. It queries your customer database. It works. It’s in production.
And your IT department has no idea it exists.
In Issue #24, I introduced Shadow AI — unsanctioned AI tool use across your organisation. The thesis: your employees are running an unpaid R&D lab. Instead of banning their tools, inventory them, learn from the use cases, and pave a sanctioned path forward.
That framework still holds. But today we move one step further.
OpenAI’s December 2025 Enterprise Report revealed a striking trend: 36% growth in coding messages from non-technical roles over six months. Marketing. HR. Operations. Finance. All writing Python. All writing SQL. All bypassing the engineering function entirely.
Shadow AI is evolving. It’s no longer just about using tools — it’s about creating with them.
Welcome to Shadow Engineering.
The Critical Distinction#
Shadow AI is consumption. Your sales team drafting emails with ChatGPT. Your legal team summarising contracts. The tool does the work; the human consumes the output. The risks — data leakage, regulatory exposure, lack of audit trails — are containable. An AI Gateway with PII redaction addresses most of them. We covered this in Issue #24.
Shadow Engineering is production. Non-technical staff using AI to write code, build automations, and deploy logic that processes real data. The human doesn’t just consume the output — they deploy it.
This isn’t entirely new. “Under-desk” IT systems existed before GenAI. The change is that now they’re fully democratised.
When you deploy code, you inherit all governance responsibilities of software engineering:
Version control. Where’s the repository? There isn’t one. The script lives in a personal folder or script_final_v2_REAL.py on someone’s desktop.
Security review. Has anyone checked for SQL injection? Hardcoded credentials? No. The creator doesn’t know what those are.
Ownership. When the creator leaves, the knowledge leaves with them.
Maintenance. When a dependency breaks, who fixes it? No one knows it exists.
Shadow AI is a data governance problem. Shadow Engineering is a software engineering problem. The distinction determines which controls actually work.
The AI-Specific Risk Taxonomy#
Shadow Engineering introduces risks traditional security frameworks weren’t designed to catch.
1. Security: The “Happy Path” Trap
LLMs are probabilistic engines. They predict what comes next based on patterns learned from their training data — which includes mountains of insecure code from public repositories. The model optimises for “works,” not “safe.”
Research by Veracode found that 45% of AI-generated code contains security flaws. SQL injection. Hardcoded secrets. Missing input validation. These aren’t edge cases. They’re the baseline.
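To make that concrete, here is a minimal sketch of the shape these flaws usually take. The table and column names are hypothetical, and the safer variant is a starting point for review, not a substitute for one:

```python
import sqlite3

# The "happy path" version: it runs fine on clean input, which is all the
# creator ever tests. Table and column names here are hypothetical.
API_TOKEN = "sk-live-1234567890"  # hardcoded secret; belongs in a vault or env var

def get_customer_unsafe(conn: sqlite3.Connection, name: str):
    # f-string interpolation makes this query injectable:
    # name = "x' OR '1'='1" returns every row in the table
    query = f"SELECT * FROM customers WHERE name = '{name}'"
    return conn.execute(query).fetchall()

# The safer shape: parameterised query, and only the columns the task needs.
def get_customer_safe(conn: sqlite3.Connection, name: str):
    query = "SELECT id, name, email FROM customers WHERE name = ?"
    return conn.execute(query, (name,)).fetchall()
```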
2. Hallucination: The Slopsquatting Problem
Here’s a risk unique to AI-generated code: the model invents dependencies that don’t exist.
Researchers found that approximately 20% of code samples from popular LLMs included hallucinated package names — libraries that sound plausible but aren’t real. Attackers have weaponised this. They register these phantom packages on PyPI and npm, inject them with malware, and wait.
When your Finance Analyst runs pip install on the package ChatGPT suggested, they’re downloading malicious code directly into your environment.
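One cheap control is to check whether a suggested package is even registered before anyone installs it. A minimal sketch using PyPI’s public JSON endpoint; existence alone proves nothing, since slopsquatters register the phantom names too, so an internal allow-list remains the stronger control:

```python
import json
import urllib.error
import urllib.request

def pypi_package_exists(name: str) -> bool:
    """Return True if a package name is registered on PyPI."""
    url = f"https://pypi.org/pypi/{name}/json"
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            json.load(resp)  # metadata came back; the name is registered
        return True
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return False  # nothing registered under this name
        raise

print(pypi_package_exists("requests"))      # True: long-established library
print(pypi_package_exists("flask-sql-ai"))  # hypothetical name a model might suggest
```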
3. Data Leakage: The Ingress Problem
To generate useful code, users must provide context. And context means data.
The Samsung incident is the canonical example. Engineers pasted proprietary source code and meeting notes into ChatGPT to check for errors. That data left Samsung’s control and became eligible for model training. The intellectual property left the building — not through a file transfer your DLP would catch, but through text pasted into a browser window.
Your firewall is watching the wrong door.
4. Orphan Code: The Succession Crisis
A Finance Manager writes a Python script automating weekly reconciliation. Six months later, they leave. The script keeps running — until a source system changes its API. IT is called in. But IT has never seen this script. They cannot fix what they cannot find.
5. Regulatory Exposure: The Compliance Gap
For regulated organisations, Shadow Engineering creates existential compliance risk.
EU AI Act, Article 13 requires AI systems to be designed for transparency — deployers must be able to interpret outputs and understand how the system functions. A ChatGPT-generated script that its creator doesn’t understand fails this test by definition.
GDPR, Article 25 demands data protection by design. Scripts that ingest entire datasets because the creator lacks SQL skills to filter at source violate data minimisation principles.
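The data-minimisation failure usually looks like this in practice. A hedged sketch with hypothetical table and column names; the fix is one line of SQL, not a new platform:

```python
import sqlite3

# Hypothetical local extract; in the wild this is a copy of a production table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE invoices (invoice_id, customer, email, amount, due_date, status)")
conn.execute("INSERT INTO invoices VALUES (1, 'Acme', 'ap@acme.example', 1200, '2025-11-30', 'overdue')")

# What shadow scripts tend to do: pull the whole table and filter in Python.
# Every customer's name, email, and balance is now in the script's memory and,
# often, in a CSV "cache" sitting next to it.
rows = conn.execute("SELECT * FROM invoices").fetchall()
overdue = [row for row in rows if row[5] == "overdue"]  # magic column index

# Data minimisation: filter at source and select only what the task needs.
overdue = conn.execute(
    "SELECT invoice_id, amount, due_date FROM invoices WHERE status = ?",
    ("overdue",),
).fetchall()
print(overdue)  # [(1, 1200, '2025-11-30')]
```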
A breach originating from Shadow Engineering code will attract maximum scrutiny — and maximum penalties.
At this point, you might expect me to argue for banning ChatGPT.
I won’t.
AI-powered democratisation delivers genuine value. Speed. Empowerment. Innovation at the edge. The employees closest to business problems can now solve them directly. That’s powerful.
The answer isn’t prohibition. It’s engineering the path of least resistance.
Here’s the uncomfortable truth: your official AI policy is competing with the ease of pasting into ChatGPT. If your sanctioned tools are slower, harder to access, or less capable than the shadow alternatives, you will lose. Every time.
And remember the core principle: “If your AI governance policy can’t automatically fail a build, it’s a suggestion, not a control.”
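What “fail a build” can look like in practice: a gate in the pipeline that runs a static scanner and a dependency audit, and returns a non-zero exit code on findings. A minimal sketch, assuming bandit and pip-audit are available in the CI image and that shadow scripts have been routed into a repository in the first place:

```python
"""Minimal CI gate: exit non-zero if static analysis or the dependency
audit finds problems. Paths and thresholds are placeholders."""
import subprocess
import sys

CHECKS = [
    ["bandit", "-r", "src", "-ll"],           # static scan, medium severity and up
    ["pip-audit", "-r", "requirements.txt"],  # known-vulnerable dependencies
]

def main() -> int:
    failed = False
    for cmd in CHECKS:
        print("running:", " ".join(cmd))
        if subprocess.run(cmd).returncode != 0:
            failed = True
    return 1 if failed else 0  # a non-zero exit is what actually fails the build

if __name__ == "__main__":
    sys.exit(main())
```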
The Solution Framework#
How to bring Shadow Engineering into the light — and keep it there:
1. Amnesty — Declare a limited-time amnesty. Invite employees to declare their shadow scripts and automations without penalty. You cannot govern what you cannot see.
2. Pave — Analyse declared tools. Identify common use cases. Then provide sanctioned enterprise tools that fulfil these needs better than the shadow alternatives. If the official path is easier, users will take it voluntarily.
3. Gateway — Deploy an AI Gateway as middleware between users and external models. Centralised control. Observability. PII redaction before prompts reach external models. The Samsung scenario becomes impossible (a minimal sketch of the redaction layer follows this list).
4. Zone — Classify by risk. Green zone: personal productivity scripts, loose governance. Yellow zone: department tools, review required. Red zone: enterprise applications, full SDLC, IT oversight mandatory.
5. Intake — After Amnesty closes, new needs will emerge. Create a permanent “front door”: simple form, 48-hour response, route into zones. Register everything — owner, data touched, succession plan. Review quarterly. When the owner leaves, ownership transfers or the tool retires.
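The redaction sketch promised in step 3. Real gateways layer proper PII detection, logging, authentication, and policy routing on top; the patterns below are deliberately crude, and the control point is what matters: scrub the prompt before it leaves your perimeter.

```python
import re

# Crude patterns for illustration; a production gateway uses proper PII
# detection, but the control point is identical: redact before forwarding.
PATTERNS = {
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "PHONE": re.compile(r"\+?\b\d(?:[\s-]?\d){8,13}\b"),
}

def redact(prompt: str) -> str:
    for label, pattern in PATTERNS.items():
        prompt = pattern.sub(f"[{label} REDACTED]", prompt)
    return prompt

prompt = "Summarise this contract. Contact jane.doe@example.com or +44 7700 900123."
print(redact(prompt))
# Summarise this contract. Contact [EMAIL REDACTED] or [PHONE REDACTED].
```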
The Briefing#
Karpathy: “I’ve Never Felt This Much Behind”#
Andrej Karpathy — who coined “vibe coding” — posted this week: “I’ve never felt this much behind as a programmer. The profession is being dramatically refactored.” He describes AI coding tools as “alien tools without a manual” — stochastic, error-prone, yet transformative.
If Karpathy feels behind, consider your Marketing Manager writing Python via ChatGPT. The capability is democratised. The judgement to govern it safely is not. That’s exactly where Shadow Engineering becomes dangerous.
Salesforce’s LLM Reality Check#
Salesforce is pulling back from heavy reliance on large language models after reliability issues shook executive confidence. “All of us were more confident about large language models a year ago,” admitted SVP Sanjna Parulekar.
The company is pivoting Agentforce toward “deterministic” automation — predictable rules instead of probabilistic outputs. Why? When given more than eight instructions, LLMs start omitting directives. Home security company Vivint found Agentforce sometimes failed to send customer surveys for unexplained reasons.
The irony: Salesforce reportedly reduced support staff from 9,000 to 5,000 through AI agent deployment — then discovered the agents can’t be trusted with complex workflows. The “AI replaces workers” narrative collides with “AI can’t reliably follow instructions.”
Accenture’s “Agentic Strategy” — Blueprint for Bloat?#
Accenture’s latest report claims companies aligning AI, platform, and business strategies see 2.2x revenue growth and 37% EBITDA lift. They propose a “Platform Agent Hierarchy” — Utility, Super, and Orchestrator agents — to move from “systems of record” to “systems of action.”
What I don’t agree with:
The “Orchestration” Mirage. A three-layer agent hierarchy creates un-auditable black boxes. For a CIO in banking, this isn’t agility — it’s a compliance nightmare.
The “Modernisation First” Trap. The report implies 94% of executives must overhaul their digital core. Classic Big Consulting: “Spend £50M modernising before AI can work.” The fix: Build thin API abstraction layers that treat legacy systems as data sources. Don’t rebuild the core.
The Culture Distraction. Accenture cites “employee resistance” as the top barrier (64%). Employees don’t resist tools that work — they resist AI Sprawl that adds 20 minutes to their workflow. That’s an engineering failure, not a cultural one.
The signal: Platform vendors are embedding agents natively. Your job isn’t to “architect the future” all at once — it’s to ensure your CI/CD pipelines can handle model versioning and drift before agents touch production data.
This Week’s Question#
Before your next AI governance review, ask your team:
“How many scripts or automations are running in this organisation that weren’t built by IT — and what data do they touch?”
If no one can answer, you’ve found your risk surface.
GenAI has made every employee a potential developer. Shadow Engineering is no longer an edge case — it’s the default. The question is whether you’ll discover it through an audit, or through a breach.
Stay balanced,
Krzysztof
