From Shadow IT to GhostOps: The Rise of Unauthorized AI Agents in the Enterprise
If you have worked in enterprise IT for long enough, you have lived through the same movie more than once. A new capability arrives, it spreads faster than policy, and the first formal governance conversation happens only after someone asks, “Why is this in our environment?”
In the 2000s, it was consumer cloud storage. Then came unsanctioned collaboration apps, personal file sharing, and “just sign in with your work email.” Shadow IT became the label we used when teams adopted technology outside approved channels, usually with good intent and bad consequences.
Today, we are watching the same pattern repeat, but with far higher stakes, because the technology is not just storing data or sending messages. It is taking action.
Across a number of customers, we are seeing tens, and in some cases hundreds, of AI agents deployed without approval, architecture review, or a clear understanding of where data is stored, how credentials are handled, and what those agents can do when integrated with tools. Microsoft’s Cyber Pulse reporting notes that 29 percent of employees have used unsanctioned AI agents for work tasks. The same Microsoft content also points to rapid adoption at the top end of the market, citing that 80 percent of Fortune 500 organizations use active AI agents.
So, we need a new term, because “Shadow AI” does not capture the operational reality of autonomous or semi-autonomous agents running tasks, pulling data, and calling tools.
I call it GhostOps, unauthorised operational AI agents that materialise inside the enterprise, execute work, and disappear from visibility, leaving only outcomes, and sometimes, incident response teams, to reconstruct what happened.
How GhostOps Differs From Shadow IT
Shadow IT was usually about productivity tooling, storage, and workflow acceleration. GhostOps is about delegated decision-making and delegated execution.
An AI agent is not merely a chat interface. An agent has three properties that change the risk equation:
- Memory: an agent may retain context, files, secrets, prompts, and identifiers over time.
- Tool use: an agent can call APIs, run scripts, interact with SaaS apps, and automate actions.
- Autonomy: an agent can chain steps, retry, escalate, and operate in parallel, often faster than human oversight.
When those three properties are introduced without governance, you do not just lose visibility, you lose control of action and accountability.
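To make the three properties concrete, here is a minimal sketch of an agent loop in Python. Everything in it is illustrative, not any real framework: the tool names, the plan format, and `run_agent` itself are hypothetical stand-ins for the memory, tool-use, and autonomy behaviours described above.

```python
# Minimal sketch of the three agent properties: memory, tool use, autonomy.
# All names (run_agent, TOOLS, the plan format) are hypothetical, not a real framework.

TOOLS = {
    "lookup": lambda arg: f"result-for-{arg}",  # stand-in for a real API call
}

def run_agent(goal, plan, max_steps=5):
    memory = []                        # (1) memory: context retained across steps
    for step in plan[:max_steps]:      # (3) autonomy: chains steps with no human in the loop
        tool, arg = step
        output = TOOLS[tool](arg)      # (2) tool use: calls out to external systems
        memory.append((tool, arg, output))
    return memory

history = run_agent("demo", [("lookup", "invoice-42"), ("lookup", "user-7")])
```

Even this toy version shows the governance gap: every iteration of the loop can touch a real system, and nothing in the loop itself asks for permission.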
Clawd as a Living Example of How GhostOps Spreads
A practical example is Clawd, known in parts of the community through projects like OpenClaw, and the surrounding ecosystem of “skills” and agent tooling that make it simple to connect an agent to real systems. The broader narrative matters less than the pattern. It is open source, it is clever, it is easy to deploy, and it often starts with a developer or power user who is trying to remove friction from daily work.
Open source does not mean unsafe, but it does mean you must treat the supply chain as a first-class risk, especially when the tool is designed to integrate with your messaging platforms, developer workflows, and administrative interfaces. A single “helpful” automation that can create pull requests, access repositories, or call command line tools can become a privileged pathway if not tightly controlled and audited.
And this is the uncomfortable question that governance teams must ask, especially when something is free and open source: Is it really free?
If you are not paying with money, you may be paying with one or more of the following: your data, your metadata, your behavioural signals, your future dependency, or your attention.
The Risk Catalogue, What Actually Goes Wrong
Below is the risk landscape we are repeatedly mapping in customer environments experiencing GhostOps sprawl. The key point is not that every deployment triggers every risk; the key point is that unsanctioned deployments remove the organization’s ability to know which risks exist at all.
- Data residency and data sovereignty
- Agents often route prompts, files, transcripts, and embeddings to locations that violate internal policy or regulatory expectations.
- Teams frequently cannot answer where the data is stored, for how long, or under which jurisdiction.
- Credential handling and secret sprawl
- Agents need access to tools, SaaS, repositories, email, ticketing systems, and sometimes, production environments.
- We routinely find long-lived API keys, shared tokens, personal access tokens, and hard-coded secrets placed into config files, environment variables, and chat threads.
- The result is a new class of credential sprawl, agent-driven, difficult to inventory, and often invisible to standard IAM governance.
- Privilege escalation through automation
- An agent that can “open a ticket” can often also “approve a workflow” if the integration is misconfigured.
- When the agent can call admin APIs, it can become an unintentional operator with the blast radius of its delegated permissions.
- Prompt injection and tool abuse
- If an agent ingests untrusted content, emails, web pages, documents, or issue comments, it can be manipulated into calling tools in unsafe ways.
- The real risk is not the model being “tricked,” it is the agent executing actions in a trusted system based on untrusted inputs.
- Supply chain risk in open-source agent ecosystems
- Skill registries, plugin marketplaces, and GitHub dependencies expand the dependency graph dramatically.
- Even reputable projects can be compromised via typosquatting, maintainer account takeover, or malicious dependency updates.
- Loss of auditability and non-repudiation
- Many agent frameworks are not designed for enterprise-grade audit logs, role-based access control, or separation of duties.
- When an incident happens, the organization struggles to answer basic questions: who initiated this, what did it access, what did it change, and where is the evidence.
- Regulatory exposure, legal discoverability, and IP leakage
- Sensitive data may be introduced into external services without approved contractual terms.
- Outputs can embed proprietary context, customer data, or regulated information.
- The evidentiary trail can be fragmented across personal devices, personal accounts, and third-party platforms.
- Cost and consumption risk
- Agent loops, retries, and parallel execution can generate surprise consumption in API calls, compute, and logging.
- The “free” proof-of-concept becomes the most expensive uncontrolled workload in the environment.
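The credential-sprawl risk above is one of the few that can be partially measured today. As a hedged sketch, the snippet below scans text for strings that look like hard-coded credentials; the patterns are illustrative examples only, and production scanners such as gitleaks or trufflehog ship far broader and better-maintained rule sets.

```python
import re

# Illustrative credential patterns; real secret scanners use much larger rule sets.
SECRET_PATTERNS = [
    re.compile(r"ghp_[A-Za-z0-9]{36}"),   # GitHub personal access token shape
    re.compile(r"AKIA[0-9A-Z]{16}"),      # AWS access key ID shape
    re.compile(r"(?i)api[_-]?key\s*[:=]\s*['\"][^'\"]{16,}['\"]"),  # generic hard-coded key
]

def find_secrets(text):
    """Return substrings that look like hard-coded credentials."""
    hits = []
    for pattern in SECRET_PATTERNS:
        hits.extend(m.group(0) for m in pattern.finditer(text))
    return hits

# Hypothetical agent config file contents, for illustration only.
config = 'token = "AKIAABCDEFGHIJKLMNOP"\napi_key = "sk-live-0123456789abcdef"'
```

Running a scan like this across agent config files, environment dumps, and chat exports is often the fastest way to turn "we suspect secret sprawl" into an inventory.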
Real World Patterns We Are Seeing
To make this concrete, here are examples that recur across enterprise customers, with names and specifics generalised, but the mechanics preserved.
- The developer productivity agent
A team deploys an agent to manage pull request creation and code review summaries. It is connected to a source control platform, a chat tool, and a CI pipeline. The agent needs a token, the token ends up with broad repository permissions, the token is stored in a plaintext config on a build runner, and now you have a credential and access pathway that bypasses your standard controls.
- The finance reporting agent
An analyst connects an agent to spreadsheets, an ERP export, and a document repository to automate monthly reporting. The agent writes summaries that include sensitive commercial information, then the same agent is used for another task and inadvertently pastes confidential content into an external prompt window.
- The “helpdesk autopilot”
A business unit deploys an agent to draft responses to tickets. Then it gains tool access, it starts creating accounts, resetting passwords, and pushing configuration changes. Without robust approval workflows, you have created an unaudited, semi-autonomous administrator.
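The "helpdesk autopilot" pattern points to the single most useful control: a human-approval gate in front of privileged actions. The sketch below assumes a hypothetical action vocabulary and approver callback; it shows the shape of the control, not any particular product's implementation.

```python
# Sketch of a human-approval gate around privileged agent actions.
# The action names, approver callback, and audit log are all hypothetical.

PRIVILEGED = {"reset_password", "create_account", "change_config"}

def execute(action, params, approver, audit_log):
    """Run an agent-requested action, requiring sign-off for privileged ones."""
    if action in PRIVILEGED and not approver(action, params):
        audit_log.append(("denied", action, params))
        return None
    audit_log.append(("executed", action, params))
    return f"done:{action}"

log = []
deny_all = lambda action, params: False  # stand-in for a real approval workflow

result = execute("reset_password", {"user": "alice"}, deny_all, log)   # blocked
drafted = execute("draft_reply", {"ticket": 101}, deny_all, log)       # allowed
```

The design point is that the allowlist and the audit trail live outside the agent, so a manipulated or misbehaving agent cannot talk its way past them.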
Governance Is, Again, Trying to Catch Up with Emerging Technology
This is the repeating lesson of every major technology shift. Governance is often built for systems with known owners, known boundaries, and known change control.
Agents break those assumptions.
They are easy to stand up, can run locally or in the cloud, can be wrapped into chat tools, and can be shared virally across teams. The friction to deploy is low, the friction to govern is high.
The answer is not to ban everything. Bans fail because the productivity delta is too large, and the work will simply move to personal devices, personal accounts, and unmanaged networks.
The answer is to build guardrails that let the organization move at machine speed, safely.
Guardrails that Work: A Pragmatic Enterprise Approach
At LevelBlue, we typically structure this in two complementary tracks: governance by design and detection by evidence.
Track 1, Governance by Design, Delivered Through PSO
This is where Professional Service Offerings matter, because most organizations need a structured uplift, not a policy memo.
A practical PSO-based program usually includes:
- Agent inventory and discovery
- Reference architectures for agent deployment
- Identity, secrets, and access hardening
- Data protection and policy enforcement
- Model and agent risk management
The objective is simple: make the safe path the fast path.
Track 2, Detection by Evidence, Delivered Through SIEM and MXDR
Even with strong guardrails, you will still have GhostOps. That is why observability and detection must be part of the architecture, not an afterthought.
This is where Microsoft Sentinel and the broader Microsoft security suite can be used to build visibility, correlation, and response workflows across identities, endpoints, cloud resources, and SaaS.
A defensible approach typically includes identity telemetry, endpoint monitoring, cloud activity analytics, and data movement detection to surface unauthorised agent behaviour early.
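Sentinel detections are authored in KQL, but the underlying idea is simple enough to show language-neutrally. The sketch below, in Python, flags identities whose API-call volume sits far above the population baseline, the kind of signal an unsanctioned agent running in a loop tends to produce. The event shape and identity names are hypothetical.

```python
from collections import Counter
from statistics import mean, pstdev

def flag_anomalies(events, threshold=3.0):
    """Flag identities whose call volume is > `threshold` std devs above the mean.

    `events` is a list of (identity, api_call) tuples, e.g. drawn from audit logs.
    """
    counts = Counter(identity for identity, _ in events)
    volumes = list(counts.values())
    mu, sigma = mean(volumes), pstdev(volumes)
    if sigma == 0:
        return []  # no variance, nothing stands out
    return [i for i, n in counts.items() if (n - mu) / sigma > threshold]

# Hypothetical telemetry: one service identity hammering an API, ten normal users.
events = [("svc-agent-7", "Graph.Read")] * 50 + \
         [(f"user-{i}", "Mail.Send") for i in range(10)]
```

A real deployment would baseline per-identity over time rather than across the population, but the principle is the same: agent behaviour is statistically loud, and identity telemetry is where it shows up first.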
In other words, governance reduces the likelihood, SIEM and MXDR reduce the impact and time to detect.
The Monetization Pivot, from Copilots to Commerce
There is a broader strategic issue emerging in parallel.
As generative AI becomes the front door for search, learning, and decision support, monetization pressure rises. The industry is actively debating advertising in AI experiences, and what it does to trust.
If an AI experience is funded by advertising, then the system is incentivized to influence outcomes, not merely inform them. The most concerning future is advertising shaped by the behavioral signals, documents, conversations, and intent patterns users generate inside AI interfaces.
So, we have to ask the question plainly. Where does conscience end and commerce begin, and where do we start becoming the product?
What to do Next: A Simple Executive Agenda
- Assume GhostOps already exists in your environment, then measure it.
- Create a sanctioned agent pathway with reference architectures and fast approvals.
- Treat agent identities as privileged identities, and govern them accordingly.
- Instrument everything; if it cannot be logged, it cannot be trusted.
- Use Sentinel and MXDR to correlate identity, endpoint, and cloud signals.
- Run an executive-sponsored programme via PSO.
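The "instrument everything" item in the agenda above has a cheap, concrete starting point: wrap every tool an agent can call so the call is recorded before it executes. This is a hedged sketch under assumed names; a real implementation would ship entries to a SIEM rather than an in-memory list.

```python
import functools
import time

AUDIT_LOG = []  # stand-in for a real SIEM or log-forwarding sink

def audited(tool):
    """Wrap an agent tool so every invocation is recorded before it runs."""
    @functools.wraps(tool)
    def wrapper(*args, **kwargs):
        entry = {"tool": tool.__name__, "args": args,
                 "kwargs": kwargs, "ts": time.time()}
        AUDIT_LOG.append(entry)            # log first, so failures are still visible
        entry["result"] = tool(*args, **kwargs)
        return entry["result"]
    return wrapper

@audited
def open_ticket(summary):  # hypothetical agent tool
    return f"TICKET-{len(AUDIT_LOG)}"

ticket = open_ticket("printer on fire")
```

Logging before execution, not after, is the deliberate choice here: an action that crashes or is killed mid-flight still leaves evidence, which is exactly what incident responders lack in the GhostOps scenarios described earlier.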
Shadow IT was the warning sign of decentralised adoption. GhostOps is the operational reality of decentralised autonomy.
If we get the guardrails right, AI agents will be a compounding advantage. If we get them wrong, they will become the most scalable risk we have ever deployed, one “free” tool at a time.
Don't let GhostOps compromise your enterprise’s security posture. Whether you need to secure your existing AI deployments or build a robust governance framework from the ground up, LevelBlue is ready to help.
Leverage our comprehensive Managed Security Services (MSS) and expert Consulting and Professional Cybersecurity Services to gain visibility, mitigate risk, and ensure your organization moves at machine speed, safely.
About the Author
Grant Hutchons is APAC Director for Managed Security Services Engineering at Trustwave. He specializes in Managed Detection and Response and targeted Co-Managed SOC solutions, helping organizations in healthcare, education, and government sectors enhance their cybersecurity posture. Follow Grant on LinkedIn.
ABOUT LEVELBLUE
LevelBlue is a globally recognized cybersecurity leader that reduces cyber risk and fortifies organizations against disruptive and damaging cyber threats. Our comprehensive offensive and defensive cybersecurity portfolio detects what others cannot, responds with greater speed and effectiveness, optimizes client investment, and improves security resilience. Learn more about us.