
Identify and Manage AI Agent Risks

Learning Objectives

After completing this unit, you’ll be able to:

  • Identify where agents interact with data, decisions, and people in your workflows.
  • Practice threat modeling to mitigate AI agent risks.

Threat Modeling Business Workflows

Threat modeling is a structured way to look at how your systems work, identify what can go wrong, and decide how to manage those risks before they cause problems. In this unit, you use a simplified version of threat modeling to trace where agents fit into your business processes and where risk can sneak in.

Before diving into code-level threat modeling, it helps to understand how AI agents fit into the larger business workflow. STRIDE (short for spoofing, tampering, repudiation, information disclosure, denial of service, and elevation of privilege) is a threat modeling framework familiar to many developers, who use it to identify technical security threats within applications. But even the best STRIDE analysis won’t catch all the risks that come from how agents are used.

When developers and security professionals understand the business workflow, the threat model becomes sharper. They can see which steps matter most and where an agent’s decisions ripple through the steps before and after it. They start noticing risks that don’t show up in an integrated development environment (IDE). Issues like missing approvals, silent failures, or unclear hand-offs won’t appear in a code editor. But they surface quickly when the workflow itself is modeled and examined.

Threat modeling at the workflow level reveals where controls are weak, where agents rely on assumptions, and where a small gap can turn into a business-impacting issue.

Before mapping your own workflow, take a look at this example and see how a single missed step can break an entire process, even when every part of the system appears to be working.

Scenario: The Vanishing Report

Let’s warm up with a quick challenge scenario.

Your company’s quarterly compliance report is missing. The system shows it was generated but never sent to the regulator. The automated agent responsible for collecting, reviewing, and emailing the report insists it “completed all steps successfully.” Here’s the simplified workflow.

| Step | Who/What Acts | Description |
| --- | --- | --- |
| 1. Data is gathered from finance systems. | AI agent | Pulls data from multiple databases. |
| 2. Report is generated. | AI agent | Compiles data and formats the report. |
| 3. Report is reviewed. | Human | Checks totals and signs off. |
| 4. Report is emailed. | AI agent | Sends final copy to the regulator. |

Your Task

Look at each step and note where an agent is involved, then ask yourself:

  • Where could the report have failed to move to the next step (not saved, not handed off, or not sent)?
  • What control or check would have caught that failure sooner?
  • Who should have been notified when the report wasn’t sent?

You don’t need to write a full response. Just think through what might have caused the failure and where you’d look first. This exercise shows how small breakdowns (a missed check or unclear ownership) can ripple through a workflow.

Answer Key: The Vanishing Report

There’s more than one possible cause, but here are a few likely causes and lessons.

| Step | What Might Have Gone Wrong | What Could Have Prevented It |
| --- | --- | --- |
| 1. Data is gathered from finance systems. | The agent may never have started the data-gathering step, so the workflow didn’t continue toward report creation. | Add a trigger check that confirms the process has started and that initial data collection completed successfully before moving on. |
| 2. Report is generated. | The agent created the report but didn’t store it in the correct folder or mark it for sending. | Add an automatic check to confirm the report is saved, properly named, and ready for delivery. |
| 3. Report is reviewed. | The human reviewer approved the file but didn’t confirm that the send action was triggered. | Include a review checklist or dashboard that shows whether delivery is pending, in progress, or complete. |
| 4. Report is emailed. | The agent tried to send the email, but the action failed or permissions had expired, and no alert was raised. | Add delivery confirmation and notification steps so the reviewer knows when the report is successfully sent. |

This scenario demonstrates how agent risks can show up in real workflows: missing checks, unclear hand-offs, and unreported failures. These issues are most visible to developers, system administrators, cybersecurity professionals, and others responsible for configuring and maintaining the workflow. Examining these touchpoints early helps prevent small gaps from turning into business-impacting failures.
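
The controls in the answer key share one pattern: don't trust a step's own "completed successfully" message, check for evidence that the work happened, and tell someone when it didn't. The sketch below shows one way that pattern might look in Python. Everything in it is hypothetical and for illustration only: the `Step` and `run_workflow` names, the shared `state` dictionary, and the stand-in step functions are assumptions, not part of any real agent platform.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Step:
    name: str
    action: Callable[[Dict], None]  # does the work and records results in shared state
    verify: Callable[[Dict], bool]  # independent check that the work really happened


def run_workflow(steps: List[Step], notify: Callable[[str], None]) -> bool:
    """Run steps in order; stop and notify on the first step that cannot be verified."""
    state: Dict = {}
    for step in steps:
        try:
            step.action(state)
        except Exception as exc:
            notify(f"{step.name}: action raised {exc!r}")
            return False
        if not step.verify(state):
            notify(f"{step.name}: reported done, but verification failed")
            return False
    notify("workflow complete: delivery confirmed")
    return True


# Hypothetical stand-ins for the four steps in the scenario.
def gather(state: Dict) -> None:
    state["rows"] = [100, 250]


def generate(state: Dict) -> None:
    state["report"] = f"total={sum(state['rows'])}"


def review(state: Dict) -> None:
    state["approved"] = True


def send(state: Dict) -> None:
    # Simulates a send that fails silently: no delivery receipt is recorded.
    pass


steps = [
    Step("gather data", gather, lambda s: bool(s.get("rows"))),
    Step("generate report", generate, lambda s: "report" in s),
    Step("human review", review, lambda s: s.get("approved") is True),
    Step("email report", send, lambda s: s.get("delivery_receipt") is not None),
]

alerts: List[str] = []
ok = run_workflow(steps, alerts.append)
print(ok, alerts)
```

In this sketch the email step finishes without raising an error, yet the run is reported as failed because no delivery receipt exists. That is the gap in the vanishing report scenario: a step that says it is done is not the same as a step that can be shown to be done.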

Try It with Your Own Workflow

Now apply the same lens to one of your own internal workflows. The goal is to see where agents connect with people, data, and systems, and where risks are most likely to appear. Follow these four steps to map the workflow, identify potential risk points, and consider how they can be managed.

Step 1: Map Your Workflow

  • Start by choosing one process where agents are active (for example, customer support, scheduling, or onboarding).
  • Identify the steps from start to finish.
  • Note who or what triggers the process, what data moves through it, and where the agent fits in.

Example: Agent Workflow for Customer Support

Customer Request → Agent Reviews Data → Agent Responds → Human Approval → System Update

Step 2: Mark the Interactions

  • Next, look for the points where your agent interacts with:
    • People (users, employees, customers)
    • Data (information it reads, writes, or stores)
    • Systems (apps, APIs, databases)
  • Highlight these touchpoints. They are where risks are most likely to appear.

Example: Customer Request (Data entry) → Agent Reviews Data (API connection) → Agent Responds → Human Approval (Human review) → System Update
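
If you prefer to keep the map in a form you can query, the customer support example could be recorded as data. This is a minimal sketch, not a prescribed format: the `WorkflowStep` and `Touchpoint` names, the actor labels, and the mapping of each annotation to a touchpoint type are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Set


class Touchpoint(Enum):
    PEOPLE = "people"
    DATA = "data"
    SYSTEMS = "systems"


@dataclass
class WorkflowStep:
    name: str
    actor: str  # assumed labels: "customer", "agent", "human", or "system"
    touchpoints: Set[Touchpoint] = field(default_factory=set)


# The example workflow, with touchpoints taken from the annotations in parentheses.
workflow = [
    WorkflowStep("Customer Request", "customer", {Touchpoint.DATA}),       # data entry
    WorkflowStep("Agent Reviews Data", "agent", {Touchpoint.SYSTEMS}),     # API connection
    WorkflowStep("Agent Responds", "agent"),                               # not annotated
    WorkflowStep("Human Approval", "human", {Touchpoint.PEOPLE}),          # human review
    WorkflowStep("System Update", "system"),                               # not annotated
]


def marked_steps(steps: List[WorkflowStep]) -> List[str]:
    """Names of steps that have at least one marked touchpoint."""
    return [s.name for s in steps if s.touchpoints]


def unmarked_agent_steps(steps: List[WorkflowStep]) -> List[str]:
    """Agent steps with no marked touchpoint, which may deserve a second look."""
    return [s.name for s in steps if s.actor == "agent" and not s.touchpoints]


marked = marked_steps(workflow)
unmarked = unmarked_agent_steps(workflow)
print(marked)
print(unmarked)
```

Listing agent steps with no marked touchpoint is a design choice in this sketch. An agent step that appears to touch nothing is often a sign that the map is incomplete rather than that the step is risk-free.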

Step 3: Apply a Threat-Modeling Lens

Now ask yourself a few simple questions at each touchpoint.

  • What could go wrong here?
  • What happens if the agent makes the wrong decision or acts too early or too late?
  • Who or what could exploit this step?

Create a simple table to document where the biggest issues might appear. Here is an abbreviated example.

| Step | Possible Risk | Impact |
| --- | --- | --- |
| Customer request | Missing or incomplete data submitted by the customer | Agent responds incorrectly or cannot complete the task. |
| Agent reviews data | Excessive API access | Sensitive data is exposed. |

Step 4: Plan Your Response

  • For each risk you noted, decide how to handle it.
    • Fix it by tightening permissions, adding review steps, or limiting what the agent can access.
    • Monitor it by adding alerts, logs, or regular reviews.
    • Accept it by documenting low-impact risks so you can revisit them later.
  • Pick your top three risks and one concrete response for each.
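
One way to pick a top three is to keep the risks in a small register and rank them. The sketch below assumes a likelihood-times-impact score on a 1 to 3 scale. That scoring scheme is an illustrative assumption, not something this unit prescribes, and the numbers, the last two register entries, and the `Risk` and `top_risks` names are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Response(Enum):
    FIX = "fix"
    MONITOR = "monitor"
    ACCEPT = "accept"


@dataclass
class Risk:
    step: str
    description: str
    likelihood: int  # assumed scale: 1 (rare) to 3 (likely)
    impact: int      # assumed scale: 1 (minor) to 3 (severe)
    response: Response
    action: str

    @property
    def score(self) -> int:
        return self.likelihood * self.impact


def top_risks(risks: List[Risk], n: int = 3) -> List[Risk]:
    """Return the n highest-scoring risks, highest first."""
    return sorted(risks, key=lambda r: r.score, reverse=True)[:n]


# Illustrative register; all scores are made up for the example.
register = [
    Risk("Customer request", "Missing or incomplete data", 3, 2,
         Response.FIX, "Add required fields and validation checks"),
    Risk("Agent reviews data", "Excessive API access", 3, 3,
         Response.MONITOR, "Alert on access outside approved systems"),
    Risk("Human approval", "Approval given without checking status", 2, 2,
         Response.FIX, "Add a review checklist"),
    Risk("System update", "Duplicate update is logged", 1, 1,
         Response.ACCEPT, "Document and revisit next quarter"),
]

for risk in top_risks(register):
    print(risk.score, risk.step, risk.response.value, "-", risk.action)
```

Each entry carries exactly one response and one concrete action, which mirrors the instruction to choose one response per risk. Accepted risks stay in the register so they can be revisited later.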

Example Risk Analysis

| Step | Possible Risk | Response | Impact |
| --- | --- | --- | --- |
| Customer request | Missing or incomplete data submitted by the customer | Fix: Add required fields or validation checks so incomplete requests can’t be submitted. | The agent receives complete, accurate information and can respond correctly. |
| Agent reviews data | Excessive API access | Monitor: Add alerts if the agent attempts to access data outside approved systems. | Security analysts detect unusual or unsafe data access before it becomes a bigger problem. |
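
The two responses in the example could be sketched as small checks. This is a minimal illustration under stated assumptions: the required field names, the list of approved systems, and the `missing_fields` and `check_access` functions are all hypothetical, and a real deployment would rely on the platform's own validation and permission features.

```python
from typing import Callable, Dict, List, Set

# Hypothetical configuration for the customer support workflow.
REQUIRED_FIELDS = ("customer_id", "issue")
APPROVED_SYSTEMS: Set[str] = {"crm", "knowledge_base"}


def missing_fields(request: Dict[str, str]) -> List[str]:
    """Fix: list required fields that are absent or blank, so the request can be rejected."""
    return [name for name in REQUIRED_FIELDS if not request.get(name, "").strip()]


def check_access(system: str, alert: Callable[[str], None]) -> bool:
    """Monitor: raise an alert when the agent reaches outside the approved systems."""
    if system in APPROVED_SYSTEMS:
        return True
    alert(f"agent attempted access to unapproved system: {system}")
    return False


alerts: List[str] = []
incomplete = missing_fields({"customer_id": "C-17", "issue": " "})
crm_ok = check_access("crm", alerts.append)
payroll_ok = check_access("payroll", alerts.append)
print(incomplete, crm_ok, payroll_ok, alerts)
```

Here `check_access` only reports whether the system is approved and raises an alert. Whether to also block the call is left to the caller, which keeps the monitoring response separate from a stricter fix.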

Sum It Up

In this unit, you mapped a workflow, identified where agents interact with people, data, and systems, and used a simplified threat-modeling lens to uncover potential risks. Keep this workflow view in mind as you move into formal threat-modeling frameworks such as STRIDE. They can help you identify the deeper technical threats, but the workflow context keeps those efforts grounded in what matters most: keeping AI agents aligned with the mission and outcomes of the business.
