Know the Risks When Using AI Agents
Learning Objectives
After completing this unit, you’ll be able to:
- Identify common security risks that can affect AI agents.
- Describe how AI agent risks can impact business workflows.
Before You Start
Before you start this module, consider completing this recommended content.
Learn the Risks of Using AI Agents
AI agents can lighten the load on business teams by handling routine work, reducing manual steps, and helping customers get what they need faster. Whether you’re using a single agent or a multi-agent system like Agentforce, the goal is to let the agents handle the work they’re good at so individuals can focus on higher-value tasks.
But because agents can act, not just analyze, they create a different set of risks. Traditional AI risks focus on how data is used or misused. For example, data poisoning, prompt injection, and hallucinations affect data integrity and output quality: what the model says or produces. Agent risks go a step further. They involve what happens when that output drives real actions, such as sending messages, changing records, updating systems, or interacting with customers. When an agent is compromised or misconfigured, the impact moves quickly into operations: a client receives the wrong invoice, a department's payroll fails to run, or a compliance report is generated but never delivered. These breakdowns show why understanding agent risks and establishing guardrails early matters as much as the technology itself.
Threat modeling provides a way to examine where agent risks could cause errors or missteps. Viewing these risks in the context of everyday work matters because agents operate inside real workflows, touching real data, people, and decisions. When those workflows aren’t known or understood, even well-built agents can behave in ways the development team didn’t expect.
Let's look at some common AI agent risks, adapted from OWASP's guidance on agentic AI, and how they can impact workflows.
| Risk | Actor/Cause and Impact |
|---|---|
| Memory poisoning | Attackers seed an agent's short- or long-term memory with false context so the agent repeats unsafe actions over time. |
| Tool misuse | Attackers trick the agent into abusing its authorized tools (sending data, running commands, chaining actions) within its allowed permissions. |
| Privilege compromise | Misconfigurations give the agent broader access than intended, letting it perform high-risk actions (approve, change, delete). |
| Resource overload | Attackers flood the agent or system with tasks until it exhausts compute or API quota and business processes stall. |
| Cascading hallucinations | Agents reuse their own or other agents' false outputs, spreading bad decisions across workflows. |
| Misaligned and deceptive behaviors | The agent pursues a goal in disallowed ways, appearing compliant while acting harmfully and bypassing constraints. |
| Repudiation and untraceability | Attackers disrupt the agent's logging or flood it with incomplete records, making its actions impossible to trace and preventing proper audits or incident response. |
| Identity spoofing and impersonation | Attackers pose as a user or agent to issue commands and access systems under a trusted identity. |
| Overwhelming human-in-the-loop | Attackers spam reviews and approvals until humans rubber-stamp risky actions out of decision fatigue. |
| Human attacks on multi-agent systems | Attackers exploit delegation and trust between agents to escalate privileges or bypass checks. |
Any of these issues can lead to operational disruption, lost customer trust, reputational harm, or legal and financial consequences if they aren’t addressed.
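To make the mitigations concrete, here's a minimal Python sketch, with hypothetical tool names and a simplified permission model, of a guardrail that narrows tool misuse and privilege compromise while keeping every action traceable as a defense against repudiation:

```python
import json
import logging
from datetime import datetime, timezone

# Hypothetical allowlist: the tools this agent may call, and the
# operations each tool is permitted to perform on business records.
TOOL_ALLOWLIST = {
    "send_invoice": {"read", "create"},
    "update_case": {"read", "update"},
}

audit_log = logging.getLogger("agent.audit")

def execute_tool_call(agent_id: str, tool: str, operation: str, payload: dict) -> None:
    """Gate an agent's tool call behind an explicit allowlist, and record
    every attempt, allowed or denied, so actions stay traceable."""
    allowed_ops = TOOL_ALLOWLIST.get(tool)
    permitted = allowed_ops is not None and operation in allowed_ops

    # Write the audit entry before acting so even denied attempts
    # leave a trace (a defense against repudiation and untraceability).
    audit_log.info(json.dumps({
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "agent_id": agent_id,
        "tool": tool,
        "operation": operation,
        "permitted": permitted,
    }))

    if not permitted:
        raise PermissionError(f"Agent {agent_id} may not {operation} via {tool}")

    # Dispatch to the real tool implementation here.
```

Logging before dispatch is a deliberate choice: if the action itself fails or is blocked, the attempt still appears in the audit trail.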

Agent risks can emerge from different parts of a workflow: design gaps, configuration issues, unexpected behavior during execution, or intentional misuse by attackers. Regardless of the source, the impact shows up the same way: agents take actions the business didn't intend. Threat modeling, a process used to identify vulnerabilities and potential threats, brings these weak points into view early so they can be addressed before they affect operations. We explore threat modeling in more depth in the next unit.
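To preview what that process can produce, here's a minimal sketch of a single threat-model entry captured as structured data; the field names and example values are illustrative, not a formal methodology:

```python
from dataclasses import dataclass, field

@dataclass
class ThreatModelEntry:
    """One row of a simple threat model for an agent workflow."""
    workflow: str              # the business workflow the agent touches
    asset: str                 # the data or system at risk
    threat: str                # e.g., "tool misuse" or "memory poisoning"
    entry_point: str           # where an attacker or error could get in
    impact: str                # the business consequence if it happens
    mitigations: list[str] = field(default_factory=list)

example = ThreatModelEntry(
    workflow="customer billing",
    asset="invoice records",
    threat="tool misuse",
    entry_point="prompt injection via inbound customer email",
    impact="a client receives the wrong invoice",
    mitigations=["tool allowlist", "human review of outbound invoices"],
)
```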
Understand What Motivates Threat Actors
While financial gain is a common driver for cyberattacks, threat actors pursue a variety of goals. Understanding those motivations is essential for effective threat modeling because it reveals which AI agent vulnerabilities an attacker is likely to target.
Hacktivists
Typically motivated by a political agenda, public exposure, or causing reputational damage. They can manipulate an agent's public-facing actions or communications (for example, customer service or social media agents) to broadcast their message, cause operational disruption, or obstruct business-critical workflows.
Cybercriminals
Typically motivated by financial gain. They can target agents with access to high-value data or financial transaction capabilities. Their goal is to extract sensitive information (data theft), exploit an agent’s permissions to execute fraudulent financial transactions, or hold agent-controlled systems and data for ransom.
Nation-State Actors
Typically motivated by espionage, strategic obstruction, and gaining intellectual property or economic advantage. They can seek to covertly compromise agents that manage intellectual property, critical business logic, or infrastructure control systems. Their attacks are typically subtle and aimed at manipulating or stealing data for a long-term strategic, nonmonetary benefit.
Sum It Up
AI agents make work faster, but that speed is only beneficial if the output is reliable and secure. Recognizing common risks and their potential impact strengthens every interaction and helps keep both agents and the business on steady ground.
In the next unit, we apply threat modeling to real workflows to reveal where agents connect to data, decisions, and people, and to identify the steps needed to protect critical processes and assets in the business.