Know the Risks When Using AI Agents
Learning Objectives
After completing this unit, you’ll be able to:
- Identify common security risks that can affect AI agents.
- Describe how AI agent risks can impact business workflows.
Before You Start
Before you start this module, consider completing this recommended content.
Learn the Risks of Using AI Agents
AI agents can lighten the load on business teams by handling routine work, reducing manual steps, and helping customers get what they need faster. Whether you’re using a single agent or a multi-agent system like Agentforce, the goal is to let the agents handle the work they’re good at so individuals can focus on higher-value tasks.
But because agents can act, not just analyze, they create a different set of risks. Traditional AI risks focus on how data is used or misused. For example, data poisoning, prompt injection, and hallucinations affect data integrity and output quality: what the model says or produces. Agent risks go a step further. They involve what happens when that output drives real actions, such as sending messages, changing records, updating systems, or interacting with customers. When an agent is compromised or misconfigured, the impact moves quickly into operations: a client receives the wrong invoice, a department's payroll fails to run, or a compliance report is generated but never delivered. These breakdowns show why understanding agent risks and establishing guardrails early matters as much as the technology itself.
Threat modeling provides a way to examine where agent risks could cause errors or missteps. Viewing these risks in the context of everyday work matters because agents operate inside real workflows, touching real data, people, and decisions. When those workflows aren’t known or understood, even well-built agents can behave in ways the development team didn’t expect.
Let's look at some common AI agent risks, adapted from OWASP's guidance on agentic AI, and how they can impact workflows.
| Risk | Actor/Cause and Impact |
|---|---|
| Memory poisoning | Attackers seed an agent's short- or long-term memory with false context so the agent repeats unsafe actions over time. |
| Tool misuse | Attackers trick the agent into abusing its authorized tools (sending data, running commands, chaining actions) within its allowed permissions. |
| Privilege compromise | Misconfigurations give the agent broader access than intended, letting it perform high-risk actions (approve, change, delete). |
| Resource overload | Attackers flood the agent or system with tasks until it exhausts compute or API quota and business processes stall. |
| Cascading hallucinations | Agents reuse their own or other agents' false outputs, spreading bad decisions across workflows. |
| Misaligned and deceptive behaviors | The agent pursues a goal in disallowed ways, appearing compliant while acting harmfully and bypassing constraints. |
| Repudiation and untraceability | Attackers disrupt the agent's logging or flood it with incomplete records, making its actions impossible to trace and preventing proper audits or incident response. |
| Identity spoofing and impersonation | Attackers pose as a user or agent to issue commands and access systems under a trusted identity. |
| Overwhelming human-in-the-loop | Attackers spam reviews and approvals until humans rubber-stamp risky actions out of decision fatigue. |
| Human attacks on multi-agent systems | Attackers exploit delegation and trust between agents to escalate privileges or bypass checks. |
Any of these issues can lead to operational disruption, lost customer trust, reputational harm, or legal and financial consequences if they aren’t addressed.
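To make the mitigations concrete, here's a minimal Python sketch, with hypothetical tool names and a simplified permission model, of a guardrail that narrows tool misuse and privilege compromise while keeping every action traceable as a defense against repudiation:

```python
import json
import logging
from datetime import datetime, timezone

# Hypothetical allowlist: the tools this agent may call, and the
# operations each tool is permitted to perform on business records.
TOOL_ALLOWLIST = {
    "send_invoice": {"read", "create"},
    "update_case": {"read", "update"},
}

audit_log = logging.getLogger("agent.audit")

def execute_tool_call(agent_id: str, tool: str, operation: str, payload: dict) -> None:
    """Gate an agent's tool call behind an explicit allowlist, and record
    every attempt, allowed or denied, so actions stay traceable."""
    allowed_ops = TOOL_ALLOWLIST.get(tool)
    permitted = allowed_ops is not None and operation in allowed_ops

    # Write the audit entry before acting so even denied attempts
    # leave a trace (a defense against repudiation and untraceability).
    audit_log.info(json.dumps({
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "agent_id": agent_id,
        "tool": tool,
        "operation": operation,
        "permitted": permitted,
    }))

    if not permitted:
        raise PermissionError(f"Agent {agent_id} may not {operation} via {tool}")

    # Dispatch to the real tool implementation here.
```

Logging before dispatch is a deliberate choice: if the action itself fails or is blocked, the attempt still appears in the audit trail.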

Agent risks can emerge from different parts of a workflow: design gaps, configuration issues, unexpected behavior during execution, or intentional misuse by attackers. Regardless of the source, the impact shows up the same way: agents take actions the business didn't intend. Threat modeling, a process used to identify vulnerabilities and potential threats, brings these weak points into view early so they can be addressed before they affect operations. We explore threat modeling in more depth in the next unit.
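To preview what that process can produce, here's a minimal sketch of a single threat-model entry captured as structured data; the field names and example values are illustrative, not a formal methodology:

```python
from dataclasses import dataclass, field

@dataclass
class ThreatModelEntry:
    """One row of a simple threat model for an agent workflow."""
    workflow: str              # the business workflow the agent touches
    asset: str                 # the data or system at risk
    threat: str                # e.g., "tool misuse" or "memory poisoning"
    entry_point: str           # where an attacker or error could get in
    impact: str                # the business consequence if it happens
    mitigations: list[str] = field(default_factory=list)

example = ThreatModelEntry(
    workflow="customer billing",
    asset="invoice records",
    threat="tool misuse",
    entry_point="prompt injection via inbound customer email",
    impact="a client receives the wrong invoice",
    mitigations=["tool allowlist", "human review of outbound invoices"],
)
```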
Understand What Motivates Threat Actors
While financial gain is a common driver for cyberattacks, threat actors pursue a variety of goals. Understanding those motivations is essential for effective threat modeling because it reveals which AI agent vulnerabilities an attacker is likely to target.
Hacktivists
Typically motivated by a political agenda, public exposure, or causing reputational damage. They can manipulate an agent's public-facing actions or communications (for example, customer service or social media agents) to broadcast their message, cause operational disruption, or obstruct business-critical workflows.
Cybercriminals
Typically motivated by financial gain. They can target agents with access to high-value data or financial transaction capabilities. Their goal is to extract sensitive information (data theft), exploit an agent’s permissions to execute fraudulent financial transactions, or hold agent-controlled systems and data for ransom.
Nation-State Actors
Typically motivated by espionage, strategic obstruction, and gaining intellectual property or economic advantage. They can seek to covertly compromise agents that manage intellectual property, critical business logic, or infrastructure control systems. Their attacks are typically subtle and aimed at manipulating or stealing data for a long-term strategic, nonmonetary benefit.
Sum It Up
AI agents make work faster, but that speed is only beneficial if the output is reliable and secure. Recognizing common risks and their potential impact strengthens every interaction and helps keep both agents and the business on steady ground.
In the next unit, we apply threat modeling to real workflows to reveal where agents connect to data, decisions, and people, and to identify the steps needed to protect critical processes and assets in the business.